From 7b2d388dbed11384c6a388a5437cca0b8f2914fd Mon Sep 17 00:00:00 2001
From: Pjotr Prins
Date: Sun, 19 Jul 2020 09:11:41 +0100
Subject: Wiring up export function

---
 doc/blog/using-covid-19-pubseq-part1.html | 82 +++++++++++++++++++------------
 doc/blog/using-covid-19-pubseq-part1.org  | 22 ++++++---
 doc/blog/using-covid-19-pubseq-part6.org  | 19 ++++++-
 3 files changed, 83 insertions(+), 40 deletions(-)
(limited to 'doc/blog')

diff --git a/doc/blog/using-covid-19-pubseq-part1.html b/doc/blog/using-covid-19-pubseq-part1.html
index 0e6136c..5fd86d1 100644
--- a/doc/blog/using-covid-19-pubseq-part1.html
+++ b/doc/blog/using-covid-19-pubseq-part1.html
@@ -3,7 +3,7 @@ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- +This means that when someone uploads a SARS-CoV-2 sequence using one @@ -313,9 +313,8 @@ initiative!
The latest run of the pipeline can be viewed here. Each of these @@ -339,8 +338,8 @@ these identifiers throughout.
To explore an RDF dataset, the first query we can do is open and gets @@ -446,15 +445,18 @@ select (COUNT(distinct ?dataset) as ?num) }
+Run this query. +
-To get dataests with submitters we can do the above
+To get datasets with submitters we can do the above
+Run this query. +
+Tells you one submitter is "Roychoudhury,P.;Greninger,A.;Jerome,K." with a URL predicate (http://purl.obolibrary.org/obo/NCIT_C42781) @@ -525,6 +531,10 @@ select distinct ?sid ?sample ?p1 ?dataset ?submitter
+Run query. +
+which shows pretty much everything known about their submissions in this database. Let's focus on one sample "MT326090.1" with predicate @@ -543,21 +553,26 @@ select distinct ?sample ?p ?o
-This query tells us the sample was submitted "2020-03-21" and
+Run query.
+
+
+This query tells us the sample was submitted "2020-03-21" and
 originates from http://www.wikidata.org/entity/Q30, i.e., the USA and is a biospecimen collected from the back of the throat by swabbing.
-We can track it back to the original GenBank submission.
+We can track it back to the original GenBank submission using the
+http://identifiers.org/insdc/MT326090.1 link.
 We have also added country and label data to make it a bit easier
-to view/query the database.
+to view/query the database and place the sequence on the map.
Now we know how to get at the origin we can do it the other way round @@ -574,8 +589,8 @@ and fetch all sequences referring to Washington state
-which lists 300 sequences originating from Washington state! Which is almost
-half of the set coming out of GenBank.
+which lists 300 sequences originating from Washington state! Which in
+April was almost half of the set coming out of GenBank.
@@ -591,12 +606,15 @@ entity is Q43: }
+Run query. +
The public sequence uploader collects sequences, raw data and @@ -607,8 +625,8 @@ referenced in publications and origins are citeable.
The overall effort was due to magnificent freely donated input by a @@ -623,7 +641,7 @@ Garrison this initiative would not have existed!