author Pjotr Prins 2020-07-19 09:11:41 +0100
committer Pjotr Prins 2020-07-19 09:11:41 +0100
commit 7b2d388dbed11384c6a388a5437cca0b8f2914fd
tree f2707c6811948b9c6adc63534ff456266508c109 /doc
parent 0e4cb2c14b62ed4f39271c6006a99cea954fc688
Wiring up export function
Diffstat (limited to 'doc')
-rw-r--r-- doc/blog/using-covid-19-pubseq-part1.html 82
-rw-r--r-- doc/blog/using-covid-19-pubseq-part1.org 22
-rw-r--r-- doc/blog/using-covid-19-pubseq-part6.org 19
3 files changed, 83 insertions, 40 deletions
diff --git a/doc/blog/using-covid-19-pubseq-part1.html b/doc/blog/using-covid-19-pubseq-part1.html
index 0e6136c..5fd86d1 100644
--- a/doc/blog/using-covid-19-pubseq-part1.html
+++ b/doc/blog/using-covid-19-pubseq-part1.html
@@ -3,7 +3,7 @@
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
-<!-- 2020-07-17 Fri 05:05 -->
+<!-- 2020-07-19 Sun 02:32 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>COVID-19 PubSeq (part 1)</title>
@@ -248,20 +248,20 @@ for the JavaScript code in this tag.
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
-<li><a href="#org0db5db0">1. What does this mean?</a></li>
-<li><a href="#orge5267fd">2. Fetch sequence data</a></li>
-<li><a href="#orgfbd3adc">3. Predicates</a></li>
-<li><a href="#org08e70e1">4. Fetch submitter info and other metadata</a></li>
-<li><a href="#org9194557">5. Fetch all sequences from Washington state</a></li>
-<li><a href="#org76317ad">6. Discussion</a></li>
-<li><a href="#orgeb871a1">7. Acknowledgements</a></li>
+<li><a href="#orgb852bf7">1. What does this mean?</a></li>
+<li><a href="#orge6db105">2. Fetch sequence data</a></li>
+<li><a href="#orgf3b8001">3. Predicates</a></li>
+<li><a href="#org11097b0">4. Fetch submitter info and other metadata</a></li>
+<li><a href="#org4f8467e">5. Fetch all sequences from Washington state</a></li>
+<li><a href="#orge9b18e2">6. Discussion</a></li>
+<li><a href="#orga0badf8">7. Acknowledgements</a></li>
</ul>
</div>
</div>
-<div id="outline-container-org0db5db0" class="outline-2">
-<h2 id="org0db5db0"><span class="section-number-2">1</span> What does this mean?</h2>
+<div id="outline-container-orgb852bf7" class="outline-2">
+<h2 id="orgb852bf7"><span class="section-number-2">1</span> What does this mean?</h2>
<div class="outline-text-2" id="text-1">
<p>
This means that when someone uploads a SARS-CoV-2 sequence using one
@@ -313,9 +313,8 @@ initiative!
</div>
</div>
-
-<div id="outline-container-orge5267fd" class="outline-2">
-<h2 id="orge5267fd"><span class="section-number-2">2</span> Fetch sequence data</h2>
+<div id="outline-container-orge6db105" class="outline-2">
+<h2 id="orge6db105"><span class="section-number-2">2</span> Fetch sequence data</h2>
<div class="outline-text-2" id="text-2">
<p>
The latest run of the pipeline can be viewed <a href="https://workbench.lugli.arvadosapi.com/collections/lugli-4zz18-z513nlpqm03hpca">here</a>. Each of these
@@ -339,8 +338,8 @@ these identifiers throughout.
</div>
</div>
-<div id="outline-container-orgfbd3adc" class="outline-2">
-<h2 id="orgfbd3adc"><span class="section-number-2">3</span> Predicates</h2>
+<div id="outline-container-orgf3b8001" class="outline-2">
+<h2 id="orgf3b8001"><span class="section-number-2">3</span> Predicates</h2>
<div class="outline-text-2" id="text-3">
<p>
To explore an RDF dataset, the first query we can do is open and gets
@@ -446,15 +445,18 @@ select (COUNT(distinct ?dataset) as ?num)
}
</pre>
</div>
+
+<p>
+Run this <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+%28COUNT%28distinct+%3Fdataset%29+as+%3Fnum%29%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter%0D%0A%7D&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">query</a>.
+</p>
</div>
</div>
-
-<div id="outline-container-org08e70e1" class="outline-2">
-<h2 id="org08e70e1"><span class="section-number-2">4</span> Fetch submitter info and other metadata</h2>
+<div id="outline-container-org11097b0" class="outline-2">
+<h2 id="org11097b0"><span class="section-number-2">4</span> Fetch submitter info and other metadata</h2>
<div class="outline-text-2" id="text-4">
<p>
-To get dataests with submitters we can do the above
+To get datasets with submitters we can do the above
</p>
<div class="org-src-container">
@@ -468,6 +470,10 @@ select distinct ?dataset ?p ?submitter
</div>
<p>
+Run this <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+distinct+%3Fdataset+%3Fp+%3Fsubmitter%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter%0D%0A%7D&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">query</a>.
+</p>
+
+<p>
Tells you one submitter is "Roychoudhury,P.;Greninger,A.;Jerome,K."
with a URL <a href="http://purl.obolibrary.org/obo/NCIT_C42781">predicate</a> (<a href="http://purl.obolibrary.org/obo/NCIT_C42781">http://purl.obolibrary.org/obo/NCIT_C42781</a>)
explaining "The individual who is responsible for the content of a
@@ -526,6 +532,10 @@ select distinct ?sid ?sample ?p1 ?dataset ?submitter
</div>
<p>
+Run <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=%0D%0APREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+distinct+%3Fsid+%3Fsample+%3Fp1+%3Fdataset+%3Fsubmitter%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter+.%0D%0A+++FILTER%28CONTAINS%28%3Fsubmitter%2C%22Roychoudhury%22%29%29+.%0D%0A+++%3Fdataset+pubseq%3Asample+%3Fsid+.%0D%0A+++%3Fsid+%3Fp1+%3Fsample%0D%0A%7D%0D%0A&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">query</a>.
+</p>
+
+<p>
which shows pretty much <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+distinct+%3Fsid+%3Fsample+%3Fp1+%3Fdataset+%3Fsubmitter%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter+.%0D%0A+++FILTER%28CONTAINS%28%3Fsubmitter%2C%22Roychoudhury%22%29%29+.%0D%0A+++%3Fdataset+pubseq%3Asample+%3Fsid+.%0D%0A+++%3Fsid+%3Fp1+%3Fsample%0D%0A%7D&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">everything known</a> about their submissions in
this database. Let's focus on one sample "MT326090.1" with predicate
<a href="http://semanticscience.org/resource/SIO_000115">http://semanticscience.org/resource/SIO_000115</a>.
@@ -543,21 +553,26 @@ select distinct ?sample ?p ?o
</div>
<p>
-This <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0APREFIX+sio%3A+%3Chttp%3A%2F%2Fsemanticscience.org%2Fresource%2F%3E%0D%0Aselect+distinct+%3Fsample+%3Fp+%3Fo%0D%0A%7B%0D%0A+++%3Fsample+sio%3ASIO_000115+%22MT326090.1%22+.%0D%0A+++%3Fsample+%3Fp+%3Fo+.%0D%0A%7D&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">query</a> tells us the sample was submitted "2020-03-21" and
+Run <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=%0D%0APREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0APREFIX+sio%3A+%3Chttp%3A%2F%2Fsemanticscience.org%2Fresource%2F%3E%0D%0Aselect+distinct+%3Fsample+%3Fp+%3Fo%0D%0A%7B%0D%0A+++%3Fsample+sio%3ASIO_000115+%22MT326090.1%22+.%0D%0A+++%3Fsample+%3Fp+%3Fo+.%0D%0A%7D&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">query</a>.
+</p>
+
+<p>
+This query tells us the sample was submitted "2020-03-21" and
originates from <a href="http://www.wikidata.org/entity/Q30">http://www.wikidata.org/entity/Q30</a>, i.e., the USA and
is a biospecimen collected from the back of the throat by swabbing.
-We can track it back to the original GenBank <a href="http://identifiers.org/insdc/MT326090.1#sequence">submission</a>.
+We can track it back to the original GenBank <a href="http://identifiers.org/insdc/MT326090.1#sequence">submission</a> using the
+<a href="http://identifiers.org/insdc/MT326090.1">http://identifiers.org/insdc/MT326090.1</a> link.
</p>
<p>
We have also added country and label data to make it a bit easier
-to view/query the database.
+to view/query the database and place the sequence on the <a href="http://covid19.genenetwork.org/">map</a>.
</p>
</div>
</div>
-<div id="outline-container-org9194557" class="outline-2">
-<h2 id="org9194557"><span class="section-number-2">5</span> Fetch all sequences from Washington state</h2>
+<div id="outline-container-org4f8467e" class="outline-2">
+<h2 id="org4f8467e"><span class="section-number-2">5</span> Fetch all sequences from Washington state</h2>
<div class="outline-text-2" id="text-5">
<p>
Now we know how to get at the origin we can do it the other way round
@@ -574,8 +589,8 @@ and fetch all sequences referring to Washington state
</div>
<p>
-which lists 300 sequences originating from Washington state! Which is almost
-half of the set coming out of GenBank.
+which lists 300 sequences originating from Washington state! Which in
+April was almost half of the set coming out of GenBank.
</p>
<p>
@@ -591,12 +606,15 @@ entity is <a href="https://www.wikidata.org/wiki/Q43">Q43</a>:
}
</pre>
</div>
+
+<p>
+Run <a href="http://sparql.genenetwork.org/sparql/?default-graph-uri=&amp;query=%0D%0Aselect+%3Fseq+%3Fsample%0D%0A%7B%0D%0A++++%3Fseq+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2Fsample%3E+%3Fsample+.%0D%0A++++%3Fsample+%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FGAZ_00000448%3E+%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ43%3E%0D%0A%7D&amp;format=text%2Fhtml&amp;timeout=0&amp;debug=on&amp;run=+Run+Query+">query</a>.
+</p>
</div>
</div>
-
-<div id="outline-container-org76317ad" class="outline-2">
-<h2 id="org76317ad"><span class="section-number-2">6</span> Discussion</h2>
+<div id="outline-container-orge9b18e2" class="outline-2">
+<h2 id="orge9b18e2"><span class="section-number-2">6</span> Discussion</h2>
<div class="outline-text-2" id="text-6">
<p>
The public sequence uploader collects sequences, raw data and
@@ -607,8 +625,8 @@ referenced in publications and origins are citeable.
</div>
</div>
-<div id="outline-container-orgeb871a1" class="outline-2">
-<h2 id="orgeb871a1"><span class="section-number-2">7</span> Acknowledgements</h2>
+<div id="outline-container-orga0badf8" class="outline-2">
+<h2 id="orga0badf8"><span class="section-number-2">7</span> Acknowledgements</h2>
<div class="outline-text-2" id="text-7">
<p>
The overall effort was due to magnificent freely donated input by a
@@ -623,7 +641,7 @@ Garrison this initiative would not have existed!
</div>
</div>
<div id="postamble" class="status">
-<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-07-17 Fri 05:02</small>.
+<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-07-19 Sun 02:32</small>.
</div>
</body>
</html>
diff --git a/doc/blog/using-covid-19-pubseq-part1.org b/doc/blog/using-covid-19-pubseq-part1.org
index 0fd5589..9c8a1c0 100644
--- a/doc/blog/using-covid-19-pubseq-part1.org
+++ b/doc/blog/using-covid-19-pubseq-part1.org
@@ -60,7 +60,6 @@ graph in triples. Soon we will at multi sequence alignments (MSA) and
more. Anyone can contribute data, tools and workflows to this
initiative!
-
* Fetch sequence data
The latest run of the pipeline can be viewed [[https://workbench.lugli.arvadosapi.com/collections/lugli-4zz18-z513nlpqm03hpca][here]]. Each of these
@@ -162,10 +161,11 @@ select (COUNT(distinct ?dataset) as ?num)
}
#+end_src
+Run this [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+%28COUNT%28distinct+%3Fdataset%29+as+%3Fnum%29%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter%0D%0A%7D&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][query]].
* Fetch submitter info and other metadata
-To get dataests with submitters we can do the above
+To get datasets with submitters we can do the above
#+begin_src sql
PREFIX pubseq: <http://biohackathon.org/bh20-seq-schema#MainSchema/>
@@ -176,6 +176,8 @@ select distinct ?dataset ?p ?submitter
}
#+end_src
+Run this [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+distinct+%3Fdataset+%3Fp+%3Fsubmitter%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter%0D%0A%7D&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][query]].
+
Tells you one submitter is "Roychoudhury,P.;Greninger,A.;Jerome,K."
with a URL [[http://purl.obolibrary.org/obo/NCIT_C42781][predicate]] (http://purl.obolibrary.org/obo/NCIT_C42781)
explaining "The individual who is responsible for the content of a
@@ -223,6 +225,8 @@ select distinct ?sid ?sample ?p1 ?dataset ?submitter
}
#+end_src
+Run [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=%0D%0APREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+distinct+%3Fsid+%3Fsample+%3Fp1+%3Fdataset+%3Fsubmitter%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter+.%0D%0A+++FILTER%28CONTAINS%28%3Fsubmitter%2C%22Roychoudhury%22%29%29+.%0D%0A+++%3Fdataset+pubseq%3Asample+%3Fsid+.%0D%0A+++%3Fsid+%3Fp1+%3Fsample%0D%0A%7D%0D%0A&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][query]].
+
which shows pretty much [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0Aselect+distinct+%3Fsid+%3Fsample+%3Fp1+%3Fdataset+%3Fsubmitter%0D%0A%7B%0D%0A+++%3Fdataset+pubseq%3Asubmitter+%3Fid+.%0D%0A+++%3Fid+%3Fp+%3Fsubmitter+.%0D%0A+++FILTER%28CONTAINS%28%3Fsubmitter%2C%22Roychoudhury%22%29%29+.%0D%0A+++%3Fdataset+pubseq%3Asample+%3Fsid+.%0D%0A+++%3Fsid+%3Fp1+%3Fsample%0D%0A%7D&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][everything known]] about their submissions in
this database. Let's focus on one sample "MT326090.1" with predicate
http://semanticscience.org/resource/SIO_000115.
@@ -237,13 +241,16 @@ select distinct ?sample ?p ?o
}
#+end_src
-This [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=PREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0APREFIX+sio%3A+%3Chttp%3A%2F%2Fsemanticscience.org%2Fresource%2F%3E%0D%0Aselect+distinct+%3Fsample+%3Fp+%3Fo%0D%0A%7B%0D%0A+++%3Fsample+sio%3ASIO_000115+%22MT326090.1%22+.%0D%0A+++%3Fsample+%3Fp+%3Fo+.%0D%0A%7D&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][query]] tells us the sample was submitted "2020-03-21" and
+Run [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=%0D%0APREFIX+pubseq%3A+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2F%3E%0D%0APREFIX+sio%3A+%3Chttp%3A%2F%2Fsemanticscience.org%2Fresource%2F%3E%0D%0Aselect+distinct+%3Fsample+%3Fp+%3Fo%0D%0A%7B%0D%0A+++%3Fsample+sio%3ASIO_000115+%22MT326090.1%22+.%0D%0A+++%3Fsample+%3Fp+%3Fo+.%0D%0A%7D&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][query]].
+
+This query tells us the sample was submitted "2020-03-21" and
originates from http://www.wikidata.org/entity/Q30, i.e., the USA and
is a biospecimen collected from the back of the throat by swabbing.
-We can track it back to the original GenBank [[http://identifiers.org/insdc/MT326090.1#sequence][submission]].
+We can track it back to the original GenBank [[http://identifiers.org/insdc/MT326090.1#sequence][submission]] using the
+http://identifiers.org/insdc/MT326090.1 link.
We have also added country and label data to make it a bit easier
-to view/query the database.
+to view/query the database and place the sequence on the [[http://covid19.genenetwork.org/][map]].
* Fetch all sequences from Washington state
@@ -258,8 +265,8 @@ select ?seq ?sample
}
#+end_src
-which lists 300 sequences originating from Washington state! Which is almost
-half of the set coming out of GenBank.
+which lists 300 sequences originating from Washington state! Which in
+April was almost half of the set coming out of GenBank.
Likewise to list all sequences from Turkey we can find the wikidata
entity is [[https://www.wikidata.org/wiki/Q43][Q43]]:
@@ -272,6 +279,7 @@ select ?seq ?sample
}
#+end_src
+Run [[http://sparql.genenetwork.org/sparql/?default-graph-uri=&query=%0D%0Aselect+%3Fseq+%3Fsample%0D%0A%7B%0D%0A++++%3Fseq+%3Chttp%3A%2F%2Fbiohackathon.org%2Fbh20-seq-schema%23MainSchema%2Fsample%3E+%3Fsample+.%0D%0A++++%3Fsample+%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FGAZ_00000448%3E+%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ43%3E%0D%0A%7D&format=text%2Fhtml&timeout=0&debug=on&run=+Run+Query+][query]].
* Discussion
diff --git a/doc/blog/using-covid-19-pubseq-part6.org b/doc/blog/using-covid-19-pubseq-part6.org
index 8964700..6ee68bb 100644
--- a/doc/blog/using-covid-19-pubseq-part6.org
+++ b/doc/blog/using-covid-19-pubseq-part6.org
@@ -9,11 +9,26 @@
* Table of Contents :TOC:noexport:
+ - [[#short-version][Short version]]
- [[#generating-output-for-ebi][Generating output for EBI]]
- [[#defining-the-ebi-study][Defining the EBI study]]
- [[#define-the-ebi-sample][Define the EBI sample]]
- [[#define-the-ebi-sequence][Define the EBI sequence]]
+* Short version
+
+PubSeq can export files that can be uploaded to EBI/ENA. This saves
+you work. Steps are:
+
+1. Register an account for EBI/ENA as explained [[https://ena-docs.readthedocs.io/en/latest/submit/general-guide.html][here]].
+2. Register a study online or use XML files discussed below
+3. Export a sample XML and push to EBI/ENA
+4. Zip sequence data and push to EBI/ENA
+
+Because PubSeq's metadata is richer than the metadata EBI/ENA asks
+for, it is easy to generate and export the forms using the [[http://covid19.genenetwork.org/export][EXPORT]]
+page.
+
* Generating output for EBI
Would it not be great if an uploader to PubSeq could also export samples
@@ -81,6 +96,8 @@ also a submission 'command' is required looking like
#+END_SRC
+Working XML examples we tested can be found [[https://github.com/arvados/bh20-seq-resource/tree/master/scripts/submit_ebi/example][here]].
+
The webin system accepts such sources using a command like
: curl -u username:password -F "SUBMISSION=@submission.xml" \
@@ -88,7 +105,7 @@ The webin system accepts such sources using a command like
as described [[https://ena-docs.readthedocs.io/en/latest/submit/study/programmatic.html#submit-the-xmls-using-curl][here]]. Note that this is the test server. For the final
version use www.ebi.ac.uk instead of wwwdev.ebi.ac.uk. You may also
-need the --insecure switch to circumvent certificate checking.
+need the =--insecure= switch to circumvent certificate checking.
/work in progress (WIP)/
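
The "Run query" links this commit adds to part 1 are plain GET requests to the SPARQL endpoint, with the query text URL-encoded into the =query= parameter. The sketch below shows one way such a link could be generated. It assumes the parameter names and order seen in the links above (=default-graph-uri=, =query=, =format=, =timeout=, =debug=, =run=) and CRLF line endings in the query; =build_query_url= is a hypothetical helper name, not something from the repository.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the links in the diff above.
SPARQL_ENDPOINT = "http://sparql.genenetwork.org/sparql/"

# The dataset-count query from part 1, joined with CRLF as in the links.
COUNT_QUERY = "\r\n".join([
    "PREFIX pubseq: <http://biohackathon.org/bh20-seq-schema#MainSchema/>",
    "select (COUNT(distinct ?dataset) as ?num)",
    "{",
    "   ?dataset pubseq:submitter ?id .",
    "   ?id ?p ?submitter",
    "}",
])


def build_query_url(query: str, result_format: str = "text/html") -> str:
    """Hypothetical helper: build a 'Run query' style URL for a SPARQL query."""
    # A list of pairs keeps the parameters in the same order as the links.
    params = [
        ("default-graph-uri", ""),
        ("query", query),
        ("format", result_format),
        ("timeout", "0"),
        ("debug", "on"),
        ("run", " Run Query "),
    ]
    # urlencode percent-encodes each value and turns spaces into '+'.
    return SPARQL_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query_url(COUNT_QUERY))
```

Under these assumptions the output should match the first "Run this query" link added to the org file. In the exported HTML the same URL appears with =&amp;= in place of =&=, which is ordinary attribute escaping and not a different URL.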