bh20-seq-resource - Tool to upload SARS-CoV-2 sequences to BH20 Arvados instance and orchestrate analysis

Age	Commit message	Author
2020-12-31	genbank: moving script into workflow space	Pjotr Prins

2020-12-30	Ignores	Pjotr Prins

2020-12-30	Genbank: comments	Pjotr Prins

2020-11-15	fix sequencing_coverage	AndreaGuarracino

2020-11-15	added a check on host_age	AndreaGuarracino

2020-11-14	added a check on host_age	AndreaGuarracino

2020-11-13	fix check date sra script	AndreaGuarracino

2020-11-13	generation of dates a little more robust	AndreaGuarracino

2020-11-13	to not create YAML files with date before 2019 December	AndreaGuarracino

2020-11-13	fix in the ids to consider	AndreaGuarracino

2020-11-13	added ids-to-consider option to the NCBI script	AndreaGuarracino

2020-11-12	managed the assembly_method in the scripts, doc, and the example templates	AndreaGuarracino

2020-11-12	updated alignment_protocol field in the script, doc, and the example templates	AndreaGuarracino

2020-09-28	new countries; updated genbank/sra scripts to manage more specimen sources	AndreaGuarracino

2020-09-28	genbank and sra scripts more picky on the ontologies; added utils.py for shared functions	AndreaGuarracino

2020-09-04	added in the sra script an option to include only a subset of ids	AndreaGuarracino

2020-09-04	synchronized the create_sra_metadata.py script with the latest updates	AndreaGuarracino

2020-08-28	added control (locally and in the validation) that sample_id has to be the same in the metadata and in the FASTA header #103	AndreaGuarracino

2020-08-27	updated dependency from clustalw to minimap2; the genbank script no longer creates YAML/FASTA pairs for too short sequences	AndreaGuarracino

2020-08-26	added option in the genbank script to ignore (already validated) IDs; code cleaning; typos	AndreaGuarracino

2020-08-25	the YAML/FASTA pair is not created for samples where at least one mandatory field is missing	AndreaGuarracino

2020-08-23	genbank/sra scripts update to be more generic with the specimen sources	AndreaGuarracino

2020-08-22	genbank/sra scripts updated to read the dictionaries in a more general way	AndreaGuarracino

2020-07-12	added a suffix to distinguish which script created the error/warning files	AndreaGuarracino

2020-07-10	an output file is created with the accessions for which no YAML file is created	AndreaGuarracino

2020-07-09	fixed bug that led to invalid sample_sequencing_technology values	Andrea Guarracino

2020-07-07	fix missing authors #91	AndreaGuarracino

2020-07-07	if the technology is not found, the YAML file is not created; managed longer species strings	AndreaGuarracino

2020-07-06	added seq technology in its additional information field if the term is missing in the dicts	AndreaGuarracino

2020-07-03	Improving genbank import workflow	Peter Amstutz

2020-06-22	moved the genbank script into its own directory	AndreaGuarracino