index : bh20-seq-resource
Tool to upload SARS-CoV-2 sequences to BH20 Arvados instance and orchestrate analysis
path: root/scripts/download_genbank_data
Age
Commit message
Author
2020-11-13
generation of dates a little more robust
AndreaGuarracino
2020-11-13
to not create YAML files with date before 2019 December
AndreaGuarracino
2020-11-13
fix in the ids to consider
AndreaGuarracino
2020-11-13
added ids-to-consider option to the NCBI script
AndreaGuarracino
2020-11-12
managed the assembly_method in the scripts, doc, and the example templates
AndreaGuarracino
2020-11-12
updated alignment_protocol field in the script, doc, and the example templates
AndreaGuarracino
2020-09-28
new countries; updated genbank/sra scripts to manage more specimen sources
AndreaGuarracino
2020-09-28
genbank and sra scripts more picky on the ontologies; added utils.py for shared functions
AndreaGuarracino
2020-09-04
added in the sra script an option to include only a subset of ids
AndreaGuarracino
2020-09-04
synchronized the create_sra_metadata.py script with the latest updates
AndreaGuarracino
2020-08-28
added control (locally and in the validation) that sample_id has to be the same in the metadata and in the FASTA header #103
AndreaGuarracino
2020-08-27
updated dependency from clustalw to minimap2; the genbank script no longer creates YAML/FASTA pairs for too short sequences
AndreaGuarracino
2020-08-26
added option in the genbank script to ignore (already validated) IDs; code cleaning; typos
AndreaGuarracino
2020-08-25
the YAML/FASTA pair is not created for samples where at least one mandatory field is missing
AndreaGuarracino
2020-08-23
genbank/sra scripts update to be more generic with the specimen sources
AndreaGuarracino
2020-08-22
genbank/sra scripts updated to read the dictionaries in a more general way
AndreaGuarracino
2020-07-12
added a suffix to distinguish which script created the error/warning files
AndreaGuarracino
2020-07-10
an output file is created with the accessions for which no YAML file is created
AndreaGuarracino
2020-07-09
fixed bug that lead to invalid sample_sequencing_technology values
Andrea Guarracino
2020-07-07
fix missing authors #91
AndreaGuarracino
2020-07-07
if the technology is not found, the YAML file is not created; managed longer species strings
AndreaGuarracino
2020-07-06
added seq technology in its additional information field if the term is missing in the dicts
AndreaGuarracino
2020-07-03
Improving genbank import workflow
Peter Amstutz
2020-06-22
moved the genbank script in his specific directory
AndreaGuarracino