Diffstat (limited to 'doc/blog/using-covid-19-pubseq-part3.org')
-rw-r--r-- doc/blog/using-covid-19-pubseq-part3.org | 24
1 files changed, 19 insertions, 5 deletions
diff --git a/doc/blog/using-covid-19-pubseq-part3.org b/doc/blog/using-covid-19-pubseq-part3.org
index 1cd2db1..296bef6 100644
--- a/doc/blog/using-covid-19-pubseq-part3.org
+++ b/doc/blog/using-covid-19-pubseq-part3.org
@@ -13,10 +13,24 @@
* Table of Contents :TOC:noexport:
- [[#uploading-data][Uploading Data]]
- - [[#table-of-contents][Table of Contents]]
- - [[#what-does-this-mean][What does this mean?]]
+ - [[#introduction][Introduction]]
+ - [[#step-1-sequence][Step 1: Sequence]]
+ - [[#step-2-metadata][Step 2: Metadata]]
+
+* Introduction
+
+The COVID-19 PubSeq allows you to upload your SARS-CoV-2 strains to a
+public resource for global comparisons. Compute is triggered on
+upload. Read the [[./about][ABOUT]] page for more information.
+
+* Step 1: Sequence
+
+We start with an assembled or mapped sequence in FASTA format. The
+PubSeq uploader contains a [[https://github.com/arvados/bh20-seq-resource/blob/master/bh20sequploader/qc_fasta.py][QC step]] which checks whether it is a likely
+SARS-CoV-2 sequence. While PubSeq deduplicates sequences and never
+overwrites metadata, it probably pays to check whether your data
+is already in the system by querying some metadata as described in
+[[./blog?id=using-covid-19-pubseq-part1][Query metadata with SPARQL]].
-* Table of Contents :TOC:noexport:
- - [[#what-does-this-mean][What does this mean?]]
-* What does this mean?
+* Step 2: Metadata