author | AndreaGuarracino | 2021-01-07 23:50:01 +0100 |
---|---|---|
committer | AndreaGuarracino | 2021-01-07 23:50:01 +0100 |
commit | 4d841d279b2bf73da2ba815d53863c7f2861c956 (patch) | |
tree | 83b9ad136dabacbf7ed54e19b2db6df348bef904 /workflows/pull-data/genbank/README.md | |
parent | 141e619929cee17018417d71111063015e73c366 (diff) | |
parent | c080c3cffedcc0cc99496b5e70fcfdf998978f16 (diff) | |
Merge branch 'master' into yamlfa2ttl
Diffstat (limited to 'workflows/pull-data/genbank/README.md')
-rw-r--r-- | workflows/pull-data/genbank/README.md | 12 |
1 files changed, 10 insertions, 2 deletions
diff --git a/workflows/pull-data/genbank/README.md b/workflows/pull-data/genbank/README.md
index 5464d1d..188ff6f 100644
--- a/workflows/pull-data/genbank/README.md
+++ b/workflows/pull-data/genbank/README.md
@@ -11,7 +11,8 @@ The following workflow sends GenBank data into PubSeq
 
 ```sh
 # --- get list of IDs already in PubSeq
-../../tools/sparql-fetch-ids > pubseq_ids.txt
+../../tools/pubseq-fetch-ids > pubseq_ids.txt
+
 # --- get list of missing genbank IDs
 python3 genbank-fetch-ids.py --skip pubseq_ids.txt > genbank_ids.txt
 
@@ -26,6 +27,13 @@ python3 ../../workflows/tools/normalize-yamlfa.py -s ~/tmp/yamlfa/state.json --s
 
 ```
 
+## Validate GenBank data
+
+To pull the data from PubSeq use the list of pubseq ids generated
+above.
+
+
+
 # TODO
 
-- [ ] Add id for GenBank accession - i.e. how can we tell a record is from GenBank
+- [X] Add id for GenBank accession - i.e. how can we tell a record is from GenBank
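
The first two steps of the workflow touched by this diff amount to a set difference: IDs already in PubSeq are skipped when the GenBank list is built. The sketch below illustrates that idea only. It assumes plain-text input with one ID per line, and `missing_ids` is a hypothetical helper name, not the actual logic of `genbank-fetch-ids.py`.

```python
from typing import Iterable, List


def missing_ids(candidate_ids: Iterable[str], known_ids: Iterable[str]) -> List[str]:
    """Return candidate IDs absent from known_ids, in input order, without duplicates.

    Hypothetical helper: assumes one ID per line, with optional surrounding whitespace.
    """
    # Normalise the IDs already present (e.g. the contents of pubseq_ids.txt).
    known = {line.strip() for line in known_ids if line.strip()}
    seen = set()
    result = []
    for raw in candidate_ids:
        acc = raw.strip()
        # Skip blank lines, IDs already known, and repeats within the candidates.
        if acc and acc not in known and acc not in seen:
            seen.add(acc)
            result.append(acc)
    return result
```

Because any iterable of lines works, open file handles for `pubseq_ids.txt` and a candidate list can be passed directly, and the result written out one ID per line.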