path: root/scripts
Age | Commit message | Author
2020-09-29 | esr_samples script refactoring; added a reference of the esr_samples script in the blog as an example of how to parse metadata | AndreaGuarracino
2020-09-28 | new countries; updated genbank/sra scripts to manage more specimen sources | AndreaGuarracino
2020-09-28 | genbank and sra scripts more picky on the ontologies; added utils.py for shared functions | AndreaGuarracino
2020-09-27 | Virtuoso uploader: instructions | Pjotr Prins
2020-09-27 | Virtuoso uploader: explicit output | Pjotr Prins
2020-09-27 | Fixing missing dot in .ttl file | lltommy
2020-09-27 | Adding script supporting semantic enrichment | lltommy
2020-09-26 | script for processing the metadata of the ESR samples; moved delete_entries_on_arvados script in scripts directory | AndreaGuarracino
2020-09-25 | added new New Zealand entries | AndreaGuarracino
2020-09-04 | fixed bugs in index management and type conversion | AndreaGuarracino
2020-09-04 | sra script re-enabled, ready for tests | AndreaGuarracino
2020-09-04 | added in the sra script an option to include only a subset of ids | AndreaGuarracino
2020-09-04 | sra script updated for managing more locations | AndreaGuarracino
2020-09-04 | synchronized the create_sra_metadata.py script with the latest updates | AndreaGuarracino
2020-08-29 | fixed few countries ontology terms; added a new species | AndreaGuarracino
2020-08-28 | added control (locally and in the validation) that sample_id has to be the same in the metadata and in the FASTA header #103 | AndreaGuarracino
2020-08-27 | updated dependency from clustalw to minimap2; the genbank script no longer creates YAML/FASTA pairs for too short sequences | AndreaGuarracino
2020-08-26 | added option in the genbank script to ignore (already validated) IDs; code cleaning; typos | AndreaGuarracino
2020-08-25 | the YAML/FASTA pair is not created for samples where at least one mandatory field is missing | AndreaGuarracino
2020-08-24 | fixed protocol for the dictionary entries that caused validation problems | AndreaGuarracino
2020-08-23 | genbank/sra scripts update to be more generic with the specimen sources | AndreaGuarracino
2020-08-23 | added new countries and specimen sources; fixed few country entries | AndreaGuarracino
2020-08-22 | genbank/sra scripts updated to read the dictionaries in a more general way | AndreaGuarracino
2020-08-22 | lots of new dictionary terms | AndreaGuarracino
2020-07-17 | Comment out some broken links for now | Peter Amstutz
    Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-07-17 | Preparing for EBI submission | Pjotr Prins
2020-07-17 | Started EBI submission | Pjotr Prins
2020-07-16 | Report similarity == 0 | Peter Amstutz
    Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-07-16 | Cleanup script also clears errors for revalidate | Peter Amstutz
    Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-07-16 | Catch exceptions | Peter Amstutz
    Add script to cleanup bad uploads. Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-07-12 | added a suffix to distinguish which script created the error/warning files | AndreaGuarracino
2020-07-10 | metadata with missing host_species are not created | AndreaGuarracino
2020-07-10 | an output file is created with the accessions for which no YAML file is created | AndreaGuarracino
2020-07-10 | updated metadata source | AndreaGuarracino
2020-07-10 | other term for Homo sapiens (for SRA samples) | Andrea Guarracino
2020-07-09 | fixed bug that led to invalid sample_sequencing_technology values | Andrea Guarracino
2020-07-07 | Merge pull request #90 from AndreaGuarracino/patch-21 | LLTommy
    genbank and sra scripts update, new terms in the ontology dictionaries
2020-07-07 | fix missing authors #91 | AndreaGuarracino
2020-07-07 | minimap2 returns nothing when there is no alignment. | Peter Amstutz
2020-07-07 | if the technology is not found, the YAML file is not created; managed longer species strings | AndreaGuarracino
2020-07-06 | renamed sra script; added seq technology in its additional information field if the term … | AndreaGuarracino
2020-07-06 | fix ncbi_countries dictionary | AndreaGuarracino
2020-07-06 | new terms in the ncbi_countries dictionary | Andrea Guarracino
2020-07-06 | added seq technology in its additional information field if the term is missing in the dicts | AndreaGuarracino
2020-07-06 | updated SraExperimentPackage info | AndreaGuarracino
2020-07-06 | two more terms in the ncbi_sequencing_technology dictionary | Andrea Guarracino
2020-07-06 | fixed bugs in the download_sra_data | Andrea Guarracino
2020-07-06 | new terms in the sequencing_technology dictionary | Andrea Guarracino
2020-07-03 | Add upload.cwl | Peter Amstutz
2020-07-03 | Improving genbank import workflow | Peter Amstutz