path: root/bh20sequploader/qc_metadata.py
author Andrea Guarracino 2020-04-18 22:15:01 +0200
committer GitHub 2020-04-18 22:15:01 +0200
commit 3bee6777fb4a61febbf1c22e62d71d933cfba4b0
tree edd1ae1b9c1080c2e9e28ec49df5fb8d38fe2144 /bh20sequploader/qc_metadata.py
parent bbca5ac9b2538e410efe3e09651f87e5573145de
new script release
- The script is now gentler with the server: it requests metadata in batches, which also reduces the overall execution time.
- The YAML files now include fields for sample_sequencing_technology, sample_sequencing_technology2, sample_sequencing_technology3, specimen_source, and specimen_source2.
- In sequencing_coverage, markers such as 'x' and 'X' are stripped, and ',' is replaced by '.'.
- The script uses the dictionaries in /scripts/dict_ontology_standardization. So far I have used ncbi_specesman_source.csv, ncbi_sequencing_technology.csv, and ncbi_countries.csv.
- In ncbi_sequencing_technology.csv I have added 'Oxford Nanopore' and 'MinION Oxford Nanopore'.
- For specimen_source, when the value is one of 'NP/OP swab', 'nasopharyngeal and oropharyngeal swab', 'nasopharyngeal/oropharyngeal swab', or 'np/np swab', I record both sources.
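
The cleanup rules described in the message above can be sketched roughly as follows. This is only an illustration, not the code from this commit: the helper names (batches, normalize_coverage, split_specimen_source) and the two output labels for the combined swab case are assumptions made for the example.

```python
# Hypothetical helpers illustrating the commit message; not the actual script.

# Combined-swab spellings listed in the commit message, lowercased for matching.
BOTH_SWABS = {
    "np/op swab",
    "nasopharyngeal and oropharyngeal swab",
    "nasopharyngeal/oropharyngeal swab",
    "np/np swab",
}


def batches(items, size):
    """Yield consecutive slices of at most `size` items, so that metadata
    can be requested in groups instead of one record at a time."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def normalize_coverage(raw):
    """Strip surrounding 'x'/'X' markers and use '.' as the decimal separator."""
    cleaned = raw.strip().strip("xX").strip()
    return cleaned.replace(",", ".")


def split_specimen_source(raw):
    """Return one or two specimen sources for a raw value.

    The two labels returned for the combined case are assumed here; the real
    script takes its terms from the ontology dictionaries.
    """
    value = raw.strip()
    if value.lower() in BOTH_SWABS:
        return ["nasopharyngeal swab", "oropharyngeal swab"]
    return [value]
```

With these assumptions, normalize_coverage("1,234x") gives "1.234", and a two-element result from split_specimen_source would fill both specimen_source and specimen_source2.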
Diffstat (limited to 'bh20sequploader/qc_metadata.py')
0 files changed, 0 insertions, 0 deletions