Download

1 FASTA files

The public sequence resource provides all uploaded sequences as FASTA files. Each file can be referenced individually from the metadata. We also provide a single-file FASTA download.
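
As an illustration of working with the downloaded FASTA data, here is a minimal sketch of a reader that collects records into a dictionary. The function name `read_fasta` and the sample records are hypothetical and not part of the resource; for large downloads a streaming parser from an established library would be a better fit.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary.

    The header is everything after the leading '>' on a record line.
    Sequence lines belonging to one record are concatenated.
    """
    chunks: Dict[str, List[str]] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            chunks[header] = []
        elif header is not None:
            chunks[header].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


# Hypothetical sample data, only to show the call.
sample = [">seq1 example", "ACGT", "ACGT", ">seq2 example", "TTGA"]
for name, sequence in read_fasta(sample).items():
    print(name, len(sequence))
```

To read a downloaded file, pass an open file handle instead of the list, for example `read_fasta(open(path))`, where `path` is wherever you saved the download.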

2 Metadata

Metadata can be downloaded as Turtle RDF in a single file, mergedmetadata.ttl, which can be loaded into any RDF triple store. We also host a Virtuoso SPARQL endpoint, which can be queried at http://sparql.genenetwork.org/sparql/. Query examples can be found in our BLOG.
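
A minimal sketch of querying the endpoint from Python with only the standard library might look as follows. It assumes the endpoint accepts the standard SPARQL protocol `query` parameter over HTTP GET and the `format` parameter that Virtuoso commonly supports for selecting JSON results. The helper names and the generic sample query are illustrative only and say nothing about the vocabulary used in the metadata.

```python
import json
import urllib.parse
import urllib.request

# Endpoint URL as given in the text above.
ENDPOINT = "http://sparql.genenetwork.org/sparql/"

# Generic query that lists a few triples; it assumes nothing about the schema.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_query_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Build a GET URL for a SPARQL query (assumed Virtuoso-style parameters)."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return f"{endpoint}?{params}"


def run_query(query: str, endpoint: str = ENDPOINT, timeout: float = 30.0) -> list:
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_query_url(query, endpoint), timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `run_query(SAMPLE_QUERY)` should return one dictionary per result row, keyed by variable name, provided the endpoint is reachable and responds in the standard SPARQL JSON results format.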

The Swiss Institute of Bioinformatics has included this data in https://covid-19-sparql.expasy.org/ and made it part of UniProt.

An RDF file that includes the sequences themselves in a variation graph can be downloaded from the Pangenome RDF format section below.

3 Pangenome

Pangenome data is made available in several formats. Variation graphs (VG) provide a succinct encoding of the sequences of many genomes.

3.1 Pangenome GFA format

GFA (Graphical Fragment Assembly) is a standard format for sequence graphs and is consumed by tools such as vgtools.
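
To give a feel for the format, the sketch below summarizes a GFA file by counting its record types. It assumes GFA version 1 conventions: tab-separated lines whose first field is the record type (`S` for segments, `L` for links, `P` for paths), with the segment sequence in the third field. The function name `summarize_gfa` and the toy graph are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_gfa(lines: Iterable[str]) -> Dict[str, int]:
    """Count segments, links and paths, and total bases stored in segments."""
    counts: Counter = Counter()
    total_bases = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        counts[fields[0]] += 1
        # Segment lines: S <name> <sequence>; '*' means the sequence is not stored.
        if fields[0] == "S" and len(fields) >= 3 and fields[2] != "*":
            total_bases += len(fields[2])
    return {
        "segments": counts["S"],
        "links": counts["L"],
        "paths": counts["P"],
        "bases": total_bases,
    }


# Toy graph for illustration: two segments joined by one link, with one path.
toy_gfa = [
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tTT",
    "L\t1\t+\t2\t+\t0M",
    "P\tgenomeA\t1+,2+\t*",
]
print(summarize_gfa(toy_gfa))
```

For a downloaded file, pass an open file handle in place of the list.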

3.2 Pangenome in ODGI format

ODGI is the graph format used by the optimized dynamic genome/graph implementation (odgi) toolkit.

3.3 Pangenome RDF format

An RDF file that includes the sequences themselves in a variation graph can be downloaded from relabeledSeqs-dedup-relabeledSeqs-dedup.ttl.xz.
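
Because the file is xz-compressed, it can be inspected without unpacking it to disk first. The following is a minimal sketch using Python's standard `lzma` module. The helper name `head_xz` is hypothetical, and the sketch assumes the decompressed Turtle text is UTF-8 encoded.

```python
import lzma
from itertools import islice
from typing import List


def head_xz(path: str, n: int = 10) -> List[str]:
    """Return the first n lines of an xz-compressed text file.

    The file is decompressed as a stream, so only the part that is
    actually read needs to be decompressed.
    """
    with lzma.open(path, "rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

For example, `head_xz("relabeledSeqs-dedup-relabeledSeqs-dedup.ttl.xz", 20)` would show the first twenty lines of the download, which in Turtle files typically include any prefix declarations.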

3.4 Pangenome Browser format

The many JSON files with names such as results/1/chunk001200.bin1.schematic.json are consumed by the Pangenome browser.

4 Log of workflow output

Included in the link below is a log file of the most recent workflow runs.

5 All files


Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-05-24 Sun 11:11