From 50a9933a997e468db3343023a580308b28edc653 Mon Sep 17 00:00:00 2001
From: Pjotr Prins
Date: Fri, 29 May 2020 08:29:31 -0500
Subject: Docs: added note about workflow runs

---
doc/web/about.html | 108 ++++++++++++++++++++++---------------------
doc/web/about.org | 2 +
doc/web/download.html | 124 +++++++++++++++++++++++++++-----------------------
doc/web/download.org | 5 ++
4 files changed, 129 insertions(+), 110 deletions(-)

diff --git a/doc/web/about.html b/doc/web/about.html
index bad4bb1..c907e6c 100644
--- a/doc/web/about.html
+++ b/doc/web/about.html
@@ -3,7 +3,7 @@
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- +The public sequence resource aims to provide a generic and useful
@@ -280,8 +280,8 @@ sequence comparison and protein prediction.
The public sequence resource is an initiative by bioinformatics and
@@ -301,8 +301,8 @@ wrangling experts. Thank you everyone!
The short version is that we use state-of-the-art practices in
@@ -321,8 +321,8 @@ public resources, including GISAID.
Funny question. There are only good reasons to upload your data here
@@ -374,8 +376,8 @@ for bulk uploads!
On uploading a sequence with metadata it will automatically be
@@ -386,8 +388,8 @@ using workflows from the High Performance Open Biology Lab defined
The Swiss Institute of Bioinformatics has included this data in
@@ -401,8 +403,8 @@ drug development.
All data is published under a Creative Commons 4.0 attribution license
@@ -412,8 +414,8 @@ data and store it for further processing.
Absolutely. Free software allows for fully reproducible pipelines. You
@@ -422,8 +424,8 @@ can take our workflows and data and run it elsewhere!
We are preparing raw sequence data pipelines (fastq and BAM). The
@@ -438,8 +440,8 @@ assembly variations into consideration. This is all work in progress.
See the http://covid19.genenetwork.org/blog!
@@ -447,8 +449,8 @@ See the http://covid19.genenetwork
See the http://covid19.genenetwork.org/blog!
@@ -456,8 +458,8 @@ See the http://covid19.genenetwork
Go to our source code repositories, fork/clone the repository, change
@@ -467,8 +469,8 @@ many PRs we already merged.
Restrictive data licenses are hampering data sharing and reproducible
@@ -484,8 +486,8 @@ In all honesty: we prefer both data and software to be free.
A public sequence resource is about public data. Metadata can refer to
@@ -496,8 +498,8 @@ plan to combine identifiers with clinical data stored securely at
We use a gitter channel you can join.
@@ -505,8 +507,8 @@ We use a
-
The main sponsors are listed in the footer. In addition to the time
@@ -517,7 +519,7 @@ for donating COVID-19 related compute time.
17 Who are the sponsors?
+17 Who are the sponsors?
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-28 Thu 08:40.
+
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-29 Fri 08:26.
Table of Contents
+The last runs can be viewed here. +
+
The public sequence resource provides all uploaded sequences as
FASTA files. They can be referred to from metadata individually. We
also provide a single file FASTA download.
@@ -282,9 +292,9 @@ also provide a single file
-
Metadata can be downloaded as Turtle RDF as a mergedmetadat.ttl which
can be loaded into any RDF triple-store. We provide a Virtuoso SPARQL
@@ -305,18 +315,18 @@ graph can be downloaded from below Pangenome RDF format.
Pangenome data is made available in multiple guises. Variation graphs
(VG) provide a succinct encoding of the sequences of many genomes.
ODGI is a format that supports an optimised dynamic genome/graph
implementation.
@@ -334,9 +344,9 @@ implementation.
An RDF file that includes the sequences themselves in a variation
graph can be downloaded from
@@ -346,9 +356,9 @@ graph can be downloaded from
The many JSON files that are named as
results/1/chunk001200.bin1.schematic.json are consumed by the
@@ -358,62 +368,62 @@ Pangenome browser.
Included in the link below is a log file of the last workflow runs.
We are planning to add the following output (see also
See fastq tracker and BAM tracker.
See MSA tracker.
See Phylo tracker.
We aim to make protein predictions available.
2 Metadata
-3 Metadata
+3 Pangenome
-4 Pangenome
+3.1 Pangenome GFA format
-4.1 Pangenome GFA format
+
3.2 Pangenome in ODGI format
-4.2 Pangenome in ODGI format
+3.3 Pangenome RDF format
-4.3 Pangenome RDF format
+3.4 Pangenome Browser format
-4.4 Pangenome Browser format
+4 Log of workflow output
-5 Log of workflow output
+5 All files
-6 Planned
-7 Planned
+6.1 Raw sequence data
-7.1 Raw sequence data
+6.2 Multiple Sequence Alignment (MSA)
-7.2 Multiple Sequence Alignment (MSA)
+6.3 Phylogenetic tree
-7.3 Phylogenetic tree
+6.4 Protein prediction
-7.4 Protein prediction
+
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-24 Sun 11:29.
+
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-29 Fri 08:27.