Age | Commit message | Author | |
---|---|---|---|
2020-04-30 | Import script fixes | Peter Amstutz | |
Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com> | |||
2020-04-30 | Wrap import script to run as a workflow | Peter Amstutz | |
Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com> | |||
2020-04-29 | Reverting country list back, something went wrong there | lltommy | |
2020-04-28 | updated to manage list fields and added new control on nasopharyngeal/throat swab | Andrea Guarracino | |
2020-04-28 | Updated with new terms | Andrea Guarracino | |
2020-04-28 | Updated with new terms | Andrea Guarracino | |
2020-04-28 | Updated - 1731 IDs - 2020/04/28 | Andrea Guarracino | |
2020-04-28 | Changes to the structure - we use lists now instead of strings where it makes sense. This allows us to have multiple values where it makes sense | lltommy | |
2020-04-26 | Updating dicts | lltommy | |
2020-04-24 | Add script that updates Virtuoso - run as a CRON job | Pjotr Prins | |
2020-04-23 | Merge pull request #1 from AndreaGuarracino/patch-11 | Andrea Guarracino | |
Patch 11 | |||
2020-04-23 | code cleaning, refactoring, submitter name and address | Andrea Guarracino | |
- additional_submitter_information for information not equal to name or address - added another check for coverage | |||
2020-04-23 | added nasal swab for several new IDs | Andrea Guarracino | |
2020-04-23 | updated IDs list - 2020/04/23 - 1436 IDs | Andrea Guarracino | |
2020-04-23 | Adding a third sequence technology option | lltommy | |
2020-04-23 | Merge pull request #33 from AndreaGuarracino/patch-9 | LLTommy | |
script updating | |||
2020-04-22 | code cleaning, checking and writing missing term on file | Andrea Guarracino | |
- the script checks for country and specimen_source; - the missing terms are now written to a TSV file | |||
2020-04-22 | created dict for host health status | Andrea Guarracino | |
2020-04-22 | added some rows in the speciesman dict | Andrea Guarracino | |
2020-04-22 | added some rows in the ncbi_countries dict | Andrea Guarracino | |
2020-04-22 | updated IDs list - 2020/04/22 | Andrea Guarracino | |
2020-04-22 | Small changes all around, trying to make the importer/metadata better | lltommy | |
2020-04-21 | Tweak handling of "coverage" also fix typo | Peter Amstutz | |
Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com> | |||
2020-04-21 | Working on NCBI import | Peter Amstutz | |
Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com> | |||
2020-04-21 | Updated shex and mandatory fields and stuff | lltommy | |
2020-04-20 | Merge pull request #29 from inutano/fix-mappings | LLTommy | |
fix MinION to ONT | |||
2020-04-20 | Merge pull request #28 from AndreaGuarracino/patch-8 | LLTommy | |
fixed missing variable and managed comma in dicts | |||
2020-04-20 | fix MinION to ONT | Tazro Inutano Ohta | |
2020-04-20 | Fixing string -> URI in specimen, plus other things | lltommy | |
2020-04-19 | added 'np/op' control for specimen_source | Andrea Guarracino | |
2020-04-19 | Further updates to our NCBI dictionaries to translate this s*** to our model | lltommy | |
2020-04-19 | fixed missing variable and managed comma in dicts | Andrea Guarracino | |
2020-04-19 | Updating NCBI dictionaries, adding UI options, small yaml schema changes | lltommy | |
2020-04-19 | Merge 984f74f7d7219c83d280b6eee46cba4aed4298bb | Pjotr Prins | |
2020-04-19 | Merge branch 'master' into patch-6 | Pjotr Prins | |
2020-04-19 | Merge pull request #27 from AndreaGuarracino/patch-7 | Pjotr Prins | |
updated term-mapping dictionaries | |||
2020-04-19 | Merge pull request #23 from AndreaGuarracino/patch-3 | Pjotr Prins | |
accessions list CoV-2 from NCBI Virus 2020/04/15 | |||
2020-04-18 | dictionaries for mapping | Andrea Guarracino | |
2020-04-18 | ncbi_speciesman_source mapping | Andrea Guarracino | |
2020-04-18 | Delete dict_ontology_standardization | Andrea Guarracino | |
2020-04-18 | ncbi_speciesman_source mapping | Andrea Guarracino | |
2020-04-18 | new script release | Andrea Guarracino | |
- the script is now gentler on the server: it requests metadata in batches, which also reduces the overall execution time; - the YAML files now include fields for sample_sequencing_technology, sample_sequencing_technology2, sample_sequencing_technology3, specimen_source, and specimen_source2; - in sequencing_coverage, characters such as 'x' and 'X' are stripped and ',' is replaced by '.'; - the script uses the dictionaries in /scripts/dict_ontology_standardization, currently ncbi_specesman_source.csv, ncbi_sequencing_technology.csv, and ncbi_countries.csv; - 'Oxford Nanopore' and 'MinION Oxford Nanopore' were added to ncbi_sequencing_technology.csv; - for specimen_source, when the value is one of 'NP/OP swab', 'nasopharyngeal and oropharyngeal swab', 'nasopharyngeal/oropharyngeal swab', or 'np/np swab', both sources are recorded. | |||
2020-04-15 | added type id check | Andrea Guarracino | |
what is not genomic DNA is removed | |||
2020-04-15 | accessions list CoV-2 from NCBI Virus 2020/04/15 | Andrea Guarracino | |
2020-04-14 | accessions list CoV-2 from NCBI Virus 2020/04/14 | Andrea Guarracino | |
2020-04-14 | Rename script/from_genbank_to_fasta_and_yaml.py to scripts/from_genbank_to_fasta_and_yaml.py | Andrea Guarracino | |
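
The 2020-04-18 "new script release" entry describes three behaviours: batched metadata requests, coverage clean-up, and splitting combined swab labels into two specimen sources. The sketch below illustrates them under stated assumptions. It is not the code from `from_genbank_to_fasta_and_yaml.py`: the function names, the batch helper, and the two output terms for the swab split are hypothetical, and only the input spellings come from the commit message.

```python
import re

# Input spellings as listed in the 2020-04-18 commit message (lower-cased).
COMBINED_SWABS = {
    "np/op swab",
    "nasopharyngeal and oropharyngeal swab",
    "nasopharyngeal/oropharyngeal swab",
    "np/np swab",
}


def batched(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def normalize_coverage(raw):
    """Strip 'x'/'X' markers and use '.' as the decimal separator."""
    return re.sub(r"[xX]", "", raw).replace(",", ".").strip()


def split_specimen_source(raw):
    """Return one or two specimen sources for a raw label.

    The two output terms are placeholders (an assumption); the real script
    maps labels through its ontology dictionaries.
    """
    value = raw.strip().lower()
    if value in COMBINED_SWABS:
        return ["nasopharyngeal swab", "oropharyngeal swab"]
    return [value]
```

Calling `batched(accession_ids, 100)` would, for example, let a caller request metadata for 100 accessions per request instead of one at a time; the batch size here is illustrative only.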