From 50a9933a997e468db3343023a580308b28edc653 Mon Sep 17 00:00:00 2001
From: Pjotr Prins
Date: Fri, 29 May 2020 08:29:31 -0500
Subject: Docs: added note about workflow runs

---
 doc/web/about.html    | 108 ++++++++++++++++++++++---------------------
 doc/web/about.org     |   2 +
 doc/web/download.html | 124 +++++++++++++++++++++++++++-----------------------
 doc/web/download.org  |   5 ++
 4 files changed, 129 insertions(+), 110 deletions(-)

(limited to 'doc')

diff --git a/doc/web/about.html b/doc/web/about.html
index bad4bb1..c907e6c 100644
--- a/doc/web/about.html
+++ b/doc/web/about.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 About/FAQ
@@ -247,29 +247,29 @@ for the JavaScript code in this tag.

Table of Contents

-
-

1 What is the 'public sequence resource' about?

+
+

1 What is the 'public sequence resource' about?

The public sequence resource aims to provide a generic and useful
@@ -280,8 +280,8 @@ sequence comparison and protein prediction.

-
-

2 Who created the public sequence resource?

+
+

2 Who created the public sequence resource?

The public sequence resource is an initiative by bioinformatics and
@@ -301,8 +301,8 @@ wrangling experts. Thank you everyone!

-
-

3 How does the public sequence resource compare to other data resources?

+
+

3 How does the public sequence resource compare to other data resources?

The short version is that we use state-of-the-art practices in
@@ -321,8 +321,8 @@ public resources, including GISAID.

-
-

4 Why should I upload my data here?

+
+

4 Why should I upload my data here?

  1. We champion truly shareable data without licensing restrictions - with proper
@@ -332,6 +332,8 @@ attribution
  2. for bulk uploads
  3. We provide a live SPARQL end-point for all metadata
  4. We provide free data analysis and sequence comparison triggered on data upload
  5. +
  6. We do real work for you, with this link you can see the last +run took 5.5 hours!
  7. We provide free downloads of all computed output
  8. There is no need to set up pipelines and/or compute clusters
  9. All workflows get triggered on uploading a new sequence
  10. @@ -351,8 +353,8 @@ multiple resources.
-
-

5 Why should I not upload by data here?

+
+

5 Why should I not upload by data here?

Funny question. There are only good reasons to upload your data here
@@ -374,8 +376,8 @@ for bulk uploads!

-
-

6 How does the public sequence resource work?

+
+

6 How does the public sequence resource work?

On uploading a sequence with metadata it will automatically be
@@ -386,8 +388,8 @@ using workflows from the High Performance Open Biology Lab defined

-
-

7 Who uses the public sequence resource?

+
+

7 Who uses the public sequence resource?

The Swiss Institute of Bioinformatics has included this data in
@@ -401,8 +403,8 @@ drug development.

-
-

8 Is this about open data?

+
+

8 Is this about open data?

All data is published under a Creative Commons 4.0 attribution license
@@ -412,8 +414,8 @@ data and store it for further processing.

-
-

9 Is this about free software?

+
+

9 Is this about free software?

Absolutely. Free software allows for fully reproducible pipelines. You
@@ -422,8 +424,8 @@ can take our workflows and data and run it elsewhere!

-
-

10 How do I upload raw data?

+
+

10 How do I upload raw data?

We are preparing raw sequence data pipelines (fastq and BAM). The
@@ -438,8 +440,8 @@ assembly variations into consideration. This is all work in progress.

-
-

11 How do I change metadata?

+
+

11 How do I change metadata?

-
-

12 How do I change the work flows?

+
-
-

13 How do I change the source code?

+
+

13 How do I change the source code?

Go to our source code repositories, fork/clone the repository, change
@@ -467,8 +469,8 @@ many PRs we already merged.

-
-

14 Should I choose CC-BY or CC0?

+
+

14 Should I choose CC-BY or CC0?

Restrictive data licenses are hampering data sharing and reproducible
@@ -484,8 +486,8 @@ In all honesty: we prefer both data and software to be free.

-
-

15 How do I deal with private data and privacy?

+
+

15 How do I deal with private data and privacy?

A public sequence resource is about public data. Metadata can refer to
@@ -496,8 +498,8 @@ plan to combine identifiers with clinical data stored securely at

-
-

16 How do I communicate with you?

+
+

16 How do I communicate with you?

We use a gitter channel you can join.
@@ -505,8 +507,8 @@ We use a
-

17 Who are the sponsors?

+
+

17 Who are the sponsors?

The main sponsors are listed in the footer. In addition to the time
@@ -517,7 +519,7 @@ for donating COVID-19 related compute time.

-
Created by
Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-28 Thu 08:40
. +
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-29 Fri 08:26
.
diff --git a/doc/web/about.org b/doc/web/about.org
index 8e535dc..b6387e7 100644
--- a/doc/web/about.org
+++ b/doc/web/about.org
@@ -63,6 +63,8 @@ public resources, including GISAID.
    for bulk uploads
 3. We provide a live SPARQL end-point for all metadata
 2. We provide free data analysis and sequence comparison triggered on data upload
+3. We do real work for you, with this [[https://workbench.lugli.arvadosapi.com/container_requests/lugli-xvhdp-bhhk4nxx1lch5od][link]] you can see the last
+   run took 5.5 hours!
 4. We provide free downloads of all computed output
 3. There is no need to set up pipelines and/or compute clusters
 4. All workflows get triggered on uploading a new sequence
diff --git a/doc/web/download.html b/doc/web/download.html
index 493af11..2fde013 100644
--- a/doc/web/download.html
+++ b/doc/web/download.html
@@ -3,7 +3,7 @@
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-
+
 Download
@@ -247,34 +247,44 @@ for the JavaScript code in this tag.

Table of Contents

-
-

1 FASTA files

+
+

1 Workflow runs

+The last runs can be viewed here. +

+
+
+ +
+

2 FASTA files

+
+

The public sequence resource provides all uploaded sequences as FASTA files. They can be referred to from metadata individually. We also provide a single file FASTA download.
@@ -282,9 +292,9 @@ also provide a single file
-

2 Metadata

-
+
+

3 Metadata

+

Metadata can be downloaded as Turtle RDF as a mergedmetadat.ttl which can be loaded into any RDF triple-store. We provide a Virtuoso SPARQL
@@ -305,18 +315,18 @@ graph can be downloaded from below Pangenome RDF format.

-
-

3 Pangenome

-
+
+

4 Pangenome

+

Pangenome data is made available in multiple guises. Variation graphs (VG) provide a succinct encoding of the sequences of many genomes.

-
-

3.1 Pangenome GFA format

-
+
+

4.1 Pangenome GFA format

+

GFA is a standard for graphical fragment assembly and consumed by tools such as vgtools.
@@ -324,9 +334,9 @@ by tools such as vgtools.

-
-

3.2 Pangenome in ODGI format

-
+
+

4.2 Pangenome in ODGI format

+

ODGI is a format that supports an optimised dynamic genome/graph implementation.
@@ -334,9 +344,9 @@ implementation.

-
-

3.3 Pangenome RDF format

-
+
+

4.3 Pangenome RDF format

+

An RDF file that includes the sequences themselves in a variation graph can be downloaded from
@@ -346,9 +356,9 @@ graph can be downloaded from

-
-

3.4 Pangenome Browser format

-
+
+

4.4 Pangenome Browser format

+

The many JSON files that are named as results/1/chunk001200.bin1.schematic.json are consumed by the
@@ -358,62 +368,62 @@ Pangenome browser.

-
-

4 Log of workflow output

-
+
+

5 Log of workflow output

+

Including in below link is a log file of the last workflow runs.

-
-

5 All files

-
+ -
-

6 Planned

-
+
+

7 Planned

+

We are planning the add the following output (see also

-
-

6.1 Raw sequence data

-
+
+

7.1 Raw sequence data

+
-
-

6.2 Multiple Sequence Alignment (MSA)

-
+
+

7.2 Multiple Sequence Alignment (MSA)

+
-
-

6.3 Phylogenetic tree

-
+
+

7.3 Phylogenetic tree

+
-
-

6.4 Protein prediction

-
+
+

7.4 Protein prediction

+

We aim to make protein predictions available.

@@ -422,7 +432,7 @@ We aim to make protein predictions available.
-
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-24 Sun 11:29
. +
Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-29 Fri 08:27
.
diff --git a/doc/web/download.org b/doc/web/download.org
index 3b4c40a..2781d67 100644
--- a/doc/web/download.org
+++ b/doc/web/download.org
@@ -2,6 +2,7 @@
 #+AUTHOR: Pjotr Prins
 
 * Table of Contents :TOC:noexport:
+ - [[#workflow-runs][Workflow runs]]
  - [[#fasta-files][FASTA files]]
  - [[#metadata][Metadata]]
  - [[#pangenome][Pangenome]]
@@ -17,6 +18,10 @@
  - [[#phylogenetic-tree][Phylogenetic tree]]
  - [[#protein-prediction][Protein prediction]]
 
+* Workflow runs
+
+The last runs can be viewed [[https://workbench.lugli.arvadosapi.com/container_requests/lugli-xvhdp-bhhk4nxx1lch5od][here]].
+
 * FASTA files
 
 The *public sequence resource* provides all uploaded sequences as
-- 
cgit v1.2.3