From fe1402f898e3673aba725df62109df5bb2d8eef4 Mon Sep 17 00:00:00 2001
From: Pjotr Prins
Date: Mon, 24 Aug 2020 09:12:31 +0100
Subject: Images

---
 doc/web/about.html    | 153 ++++++++++++++++++++++++++++----------------------
 doc/web/about.org     |  18 ++++--
 doc/web/download.html | 110 +++++++++++++++++++-----------------
 doc/web/download.org  |   5 ++
 4 files changed, 164 insertions(+), 122 deletions(-)

diff --git a/doc/web/about.html b/doc/web/about.html
index aa12851..2d1f51d 100644
--- a/doc/web/about.html
+++ b/doc/web/about.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 About/FAQ
@@ -247,34 +247,35 @@ for the JavaScript code in this tag.

Table of Contents

-
-

1 What is the 'public sequence resource' about?

+
+

1 What is the 'public sequence resource' about?

PubSeq, the public sequence resource, aims to provide a generic and @@ -322,8 +323,8 @@ follow.

-
-

2 Presentations

+
+

2 Presentations

We presented at the BOSC 2020 Have a look at the video (alternative @@ -332,8 +333,8 @@ link) and the -

3 Who created the public sequence resource?

+ -
-

4 How does the public sequence resource compare to other data resources?

+
+

4 How does the public sequence resource compare to other data resources?

The short version is that we use state-of-the-art practices in @@ -378,8 +379,8 @@ such as GISAID.

-
-

5 Why should I upload my data here?

+
+

5 Why should I upload my data here?

  1. We champion truly shareable data without licensing restrictions - with proper @@ -410,8 +411,8 @@ multiple resources.
-
-

6 Why should I not upload by data here?

+
+

6 Why should I not upload by data here?

Funny question. There are only good reasons to upload your data here @@ -433,8 +434,8 @@ for bulk uploads!

-
-

7 How does the public sequence resource work?

+
+

7 How does the public sequence resource work?

On uploading a sequence with metadata it will automatically be @@ -445,8 +446,8 @@ using workflows from the High Performance Open Biology Lab defined

-
-

8 Who uses the public sequence resource?

+
+

8 Who uses the public sequence resource?

The Swiss Institute of Bioinformatics has included this data in @@ -464,8 +465,8 @@ for monitoring, protein prediction and drug development.

-
-

9 How can I contribute?

+
+

9 How can I contribute?

You can contribute by submitting sequences, updating metadata, submit @@ -477,8 +478,8 @@ point.

-
-

10 Is this about open data?

+
+

10 Is this about open data?

All data is published under a Creative Commons 4.0 attribution license @@ -488,8 +489,8 @@ data and store it for further processing.

-
-

11 Is this about free software?

+
+

11 Is this about free software?

Absolutely. Free software allows for fully reproducible pipelines. You @@ -498,8 +499,8 @@ can take our workflows and data and run it elsewhere!

-
-

12 How do I upload raw data?

+
+

12 How do I upload raw data?

We are preparing raw sequence data pipelines (fastq and BAM). The @@ -514,8 +515,8 @@ assembly variations into consideration. This is all work in progress.

-
-

13 How do I change metadata?

+
+

13 How do I change metadata?

-
-

14 How do I change the work flows?

+
+

14 How do I change the work flows?

Workflows are on github and can be modified. See also the BLOG @@ -533,8 +534,8 @@ Workflows are on -

15 How do I change the source code?

+
+

15 How do I change the source code?

Go to our source code repositories, fork/clone the repository, change @@ -544,8 +545,8 @@ many PRs we already merged.

-
-

16 Should I choose CC-BY or CC0?

+
+

16 Should I choose CC-BY or CC0?

Restrictive data licenses are hampering data sharing and reproducible @@ -561,8 +562,8 @@ In all honesty: we prefer both data and software to be free.

-
-

17 Are there also variant in the RDF databases? *

+
+

17 Are there also variant in the RDF databases?

We do output a RDF file with the pangenome built in, and you can parse it because it has variants implicitly. @@ -574,8 +575,8 @@ We are also writing tools to generate VCF files directly from the pangenome.

-
-

18 How do I deal with private data and privacy?

+
+

18 How do I deal with private data and privacy?

A public sequence resource is about public data. Metadata can refer to @@ -586,8 +587,8 @@ plan to combine identifiers with clinical data stored securely at

-
-

19 Do you have any checks or concerns if human sequence accidentally submitted to your service as part of a fastq? *

+
+

19 Do you have any checks or concerns if human sequence accidentally submitted to your service as part of a fastq? *

We are planning to remove reads that match the human reference. @@ -595,8 +596,8 @@ We are planning to remove reads that match the human reference.

-
-

20 Does PubSeq support only SARS-CoV-2 data? *

+
+

20 Does PubSeq support only SARS-CoV-2 data?

To date, PubSeq is a resource specific to SARS-CoV-2, but we are designing it to be able to support other species in the future. @@ -605,8 +606,8 @@ To date, PubSeq is a resource specific to SARS-CoV-2, but we are designing it to

-
-

21 How do I communicate with you?

+
+

21 How do I communicate with you?

We use a gitter channel you can join. See also contact. @@ -614,10 +615,26 @@ We use a -

22 Who are the sponsors?

+ + + +
+

23 Who are the sponsors?

+
+

The main sponsors are listed in the footer. In addition to the time generously donated by many contributors we also acknowledge Amazon AWS for donating COVID-19 related compute time. @@ -626,7 +643,7 @@ for donating COVID-19 related compute time.

-
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-08-23 Sun 04:26
. +
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-08-24 Mon 03:12
.
diff --git a/doc/web/about.org b/doc/web/about.org
index 1b7bda1..d8d7f73 100644
--- a/doc/web/about.org
+++ b/doc/web/about.org
@@ -18,11 +18,12 @@
  - [[#how-do-i-change-the-work-flows][How do I change the work flows?]]
  - [[#how-do-i-change-the-source-code][How do I change the source code?]]
  - [[#should-i-choose-cc-by-or-cc0][Should I choose CC-BY or CC0?]]
- - [[#are-there-also-variant-in-the-rdf-databases-][Are there also variant in the RDF databases? *]]
+ - [[#are-there-also-variant-in-the-rdf-databases][Are there also variant in the RDF databases?]]
  - [[#how-do-i-deal-with-private-data-and-privacy][How do I deal with private data and privacy?]]
  - [[#do-you-have-any-checks-or-concerns-if-human-sequence-accidentally-submitted-to-your-service-as-part-of-a-fastq-][Do you have any checks or concerns if human sequence accidentally submitted to your service as part of a fastq? *]]
- - [[#does-pubseq-support-only-sars-cov-2-data-][Does PubSeq support only SARS-CoV-2 data? *]]
+ - [[#does-pubseq-support-only-sars-cov-2-data][Does PubSeq support only SARS-CoV-2 data?]]
  - [[#how-do-i-communicate-with-you][How do I communicate with you?]]
+ - [[#citing-pubseq][Citing PubSeq]]
  - [[#who-are-the-sponsors][Who are the sponsors?]]
 
 * What is the 'public sequence resource' about?
@@ -209,7 +210,7 @@ because we know people like the attribution clause.
 
 In all honesty: we prefer both data and software to be free.
 
-* Are there also variant in the RDF databases? *
+* Are there also variant in the RDF databases?
 
 We do output a RDF file with the pangenome built in, and you can parse it because it has variants implicitly.
 
@@ -226,7 +227,7 @@ plan to combine identifiers with clinical data stored securely at
 
 We are planning to remove reads that match the human reference.
 
-* Does PubSeq support only SARS-CoV-2 data? *
+* Does PubSeq support only SARS-CoV-2 data?
 
 To date, PubSeq is a resource specific to SARS-CoV-2, but we are designing it to be able to support other species in the future.
 
@@ -235,6 +236,15 @@ To date, PubSeq is a resource specific to SARS-CoV-2, but we are designing it to
 
 We use a [[https://gitter.im/arvados/pubseq?utm_source=share-link&utm_medium=link&utm_campaign=share-link][gitter channel]] you can join. See also [[./contact][contact]].
 
+* Citing PubSeq
+
+We have two publications in the works. Until we have a DOI please cite
+PubSeq in the following way:
+
+We made use of the COVID-19 public sequence (PubSeq) resources hosted
+at http://covid19.genenetwork.org/.
+
+
 * Who are the sponsors?
 
 The main sponsors are listed in the footer. In addition to the time
diff --git a/doc/web/download.html b/doc/web/download.html
index 1d196a0..998c87b 100644
--- a/doc/web/download.html
+++ b/doc/web/download.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 Download
@@ -247,34 +247,35 @@ for the JavaScript code in this tag.

Table of Contents

-
-

1 Workflow runs

+
+

1 Workflow runs

The last runs can be viewed here. If you click on a run you can see @@ -285,8 +286,8 @@ is listed under Data collections. All current data is listed

-
-

2 FASTA files

+
+

2 FASTA files

The public sequence resource provides all uploaded sequences as @@ -296,8 +297,8 @@ also provide a single file -

3 Metadata

+
+

3 Metadata

Metadata can be downloaded as Turtle RDF as a mergedmetadat.ttl which @@ -319,8 +320,8 @@ graph can be downloaded from below Pangenome RDF format.

-
-

4 Pangenome

+
+

4 Pangenome

Pangenome data is made available in multiple guises. Variation graphs @@ -328,8 +329,8 @@ Pangenome data is made available in multiple guises. Variation graphs

-
-

4.1 Pangenome GFA format

+
+

4.1 Pangenome GFA format

GFA is a standard for graphical fragment assembly and consumed @@ -338,8 +339,8 @@ by tools such as vgtools.

-
-

4.2 Pangenome in ODGI format

+
+

4.2 Pangenome in ODGI format

ODGI is a format that supports an optimised dynamic genome/graph @@ -348,8 +349,8 @@ implementation.

-
-

4.3 Pangenome RDF format

+
+

4.3 Pangenome RDF format

An RDF file that includes the sequences themselves in a variation @@ -360,8 +361,8 @@ graph can be downloaded from

-
-

4.4 Pangenome Browser format

+
+

4.4 Pangenome Browser format

The many JSON files that are named as @@ -372,8 +373,8 @@ Pangenome browser.

-
-

5 Log of workflow output

+
+

5 Log of workflow output

Including in below link is a log file of the last workflow runs. @@ -381,8 +382,8 @@ Including in below link is a log file of the last workflow runs.

-
-

6 All files

+
+

6 All files

https://collections.lugli.arvadosapi.com/c=lugli-4zz18-z513nlpqm03hpca/ @@ -390,16 +391,16 @@ Including in below link is a log file of the last workflow runs.

-
-

7 Planned

+
+

7 Planned

We are planning the add the following output (see also

-
-

7.1 Raw sequence data

+
+

7.1 Raw sequence data

See fastq tracker and BAM tracker. @@ -407,8 +408,8 @@ See fastq track

-
-

7.2 Multiple Sequence Alignment (MSA)

+
-
-

7.3 Phylogenetic tree

+
-
-

7.4 Protein prediction

+
+

7.4 Protein prediction

We aim to make protein predictions available. @@ -435,8 +436,8 @@ We aim to make protein predictions available.

-
-
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-06-12 Fri 04:41
. +
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-08-24 Mon 03:07
.
diff --git a/doc/web/download.org b/doc/web/download.org
index 7614c60..a3f1949 100644
--- a/doc/web/download.org
+++ b/doc/web/download.org
@@ -18,6 +18,7 @@
  - [[#phylogenetic-tree][Phylogenetic tree]]
  - [[#protein-prediction][Protein prediction]]
  - [[#source-code][Source code]]
+ - [[#citing-pubseq][Citing PubSeq]]
 
 * Workflow runs
 
@@ -107,3 +108,7 @@ We aim to make protein predictions available.
 
 All source code for this website and tooling is available from
 https://github.com/arvados/bh20-seq-resource
+
+* Citing PubSeq
+
+See the [[./about][FAQ]].
-- 
cgit v1.2.3