PubSeq REST API

1 PubSeq REST API

Here we document the public REST API that comes with PubSeq. The examples double as tests that run in Emacs org-babel; see the bottom of this document for how to run them inside Emacs.

1.1 Introduction

We built a REST API for COVID-19 PubSeq. The API source code can be found in api.py. To see if the service is up, try:

curl http://covid19.genenetwork.org/api/version
{
  "service": "PubSeq",
  "version": 0.1
}

The Python3 version is

import requests
baseURL="http://localhost:5067" # for development
# baseURL="http://covid19.genenetwork.org"
response = requests.get(baseURL+"/api/version")
response_body = response.json()
assert response_body["service"] == "PubSeq", "PubSeq API not found"
response_body
service : PubSeq
version : 0.1

1.2 Search for an entry

When you use the search box on PubSeq, it queries the REST endpoint for information on the search terms. For example:

requests.get(baseURL+"/api/search?s=MT533203.1").json()
collection : http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126
fasta : http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126/sequence.fasta
id : MT533203.1
info : http://identifiers.org/insdc/MT533203.1#sequence

where collection is the raw uploaded data. The hash value in c= is computed on the contents of the Arvados keep collection and effectively acts as a deduplication uuid.
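The identifier in c= can be pulled out of such a link with a few lines of Python. This is a minimal sketch and not part of PubSeq itself: the helper name collection_hash is hypothetical, and the part after the + is returned as an opaque suffix because its meaning is not documented here.

```python
from urllib.parse import urlparse

def collection_hash(url):
    """Return (hash, suffix) from a URL of the form .../c=<hash>+<suffix>[/...].

    Hypothetical helper for illustration; the suffix is treated as opaque.
    """
    for segment in urlparse(url).path.split("/"):
        if segment.startswith("c="):
            # Drop the "c=" prefix, then split on the first "+"
            digest, _, suffix = segment[2:].partition("+")
            return digest, suffix
    raise ValueError("no c= segment in " + url)

# FASTA link copied from the search result above
fasta = "http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126/sequence.fasta"
digest, suffix = collection_hash(fasta)
print(digest, suffix)
```

Two entries that return the same hash point at identical uploaded content, so comparing these values is one way to spot duplicates.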

1.3 Fetch metadata

Using the collection link above you can fetch the metadata in JSON as it was originally uploaded from the ShEx expression, e.g. using https://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126/

It is better, however, to use the more advanced sample metadata fetcher, because it does a bit more in terms of expansion:

requests.get(baseURL+"/api/sample/MT533203.1.json").json()
collection : http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126
date : 2020-04-27
fasta : http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126/sequence.fasta
id : MT533203.1
info : http://identifiers.org/insdc/MT533203.1#sequence
mapper : minimap v. 2.17
sequencer : http://www.ebi.ac.uk/efo/EFO_0008632
specimen : http://purl.obolibrary.org/obo/NCIT_C155831
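Several fields in this record are ontology URIs, and the date is an ISO 8601 string. As a rough sketch of how a client might post-process the response, the hypothetical helper summarize_sample below (not part of the PubSeq API) converts the date and shortens the URIs to their term IDs. It assumes the field names shown in the response above.

```python
from datetime import date

def summarize_sample(record):
    """Condense a sample record into a date object and short ontology term IDs."""
    def term_id(uri):
        # Last path segment of an ontology URI, e.g. EFO_0008632
        return uri.rstrip("/").rsplit("/", 1)[-1]

    return {
        "id": record["id"],
        "date": date.fromisoformat(record["date"]),
        "sequencer": term_id(record["sequencer"]),
        "specimen": term_id(record["specimen"]),
    }

# Fields copied from the response shown above
record = {
    "id": "MT533203.1",
    "date": "2020-04-27",
    "mapper": "minimap v. 2.17",
    "sequencer": "http://www.ebi.ac.uk/efo/EFO_0008632",
    "specimen": "http://purl.obolibrary.org/obo/NCIT_C155831",
}
summary = summarize_sample(record)
print(summary["date"].year, summary["sequencer"], summary["specimen"])
```

In practice the record would come straight from requests.get(...).json() rather than a literal dictionary.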

1.4 Fetch EBI XML

PubSeq provides an API that exports data in formats suitable for uploading to EBI/ENA; our EXPORT menu uses it. This is documented here.

requests.get(baseURL+"/api/ebi/sample-MT326090.1.xml").text
<?xml version="1.0" encoding="UTF-8"?>
<SAMPLE_SET>
  <SAMPLE alias="MT326090.1" center_name="COVID-19 PubSeq">
    <TITLE>COVID-19 PubSeq Sample</TITLE>
    <SAMPLE_NAME>
      <TAXON_ID>2697049</TAXON_ID>
      <SCIENTIFIC_NAME>Severe acute respiratory syndrome coronavirus 2</SCIENTIFIC_NAME>
      <COMMON_NAME>SARS-CoV-2</COMMON_NAME>
    </SAMPLE_NAME>
    <SAMPLE_ATTRIBUTES>
      <SAMPLE_ATTRIBUTE>
        <TAG>investigation type</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>sequencing method</TAG>
        <VALUE>http://purl.obolibrary.org/obo/OBI_0000759</VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>collection date</TAG>
        <VALUE>2020-03-21</VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (latitude)</TAG>
        <VALUE></VALUE>
        <UNITS>DD</UNITS>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (longitude)</TAG>
        <VALUE></VALUE>
        <UNITS>DD</UNITS>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (country and/or sea)</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (region and locality)</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>environment (material)</TAG>
        <VALUE>http://purl.obolibrary.org/obo/NCIT_C155831</VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>ENA-CHECKLIST</TAG>
        <VALUE>ERC000011</VALUE>
      </SAMPLE_ATTRIBUTE>
    </SAMPLE_ATTRIBUTES>
  </SAMPLE>
</SAMPLE_SET>
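To check which attributes are filled in before submitting, the XML above can be flattened into a dictionary. The following is a minimal sketch using only the Python standard library; the function name sample_attributes is hypothetical, and the embedded XML is an abbreviated copy of the response shown above.

```python
import xml.etree.ElementTree as ET

def sample_attributes(xml_text):
    """Map each SAMPLE alias to a dict of its TAG -> VALUE attributes."""
    root = ET.fromstring(xml_text)
    samples = {}
    for sample in root.iter("SAMPLE"):
        attrs = {}
        for attr in sample.iter("SAMPLE_ATTRIBUTE"):
            # findtext gives "" for an empty element and None if it is missing
            attrs[attr.findtext("TAG")] = attr.findtext("VALUE") or ""
        samples[sample.get("alias")] = attrs
    return samples

# Abbreviated copy of the export shown above
xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<SAMPLE_SET>
  <SAMPLE alias="MT326090.1" center_name="COVID-19 PubSeq">
    <SAMPLE_ATTRIBUTES>
      <SAMPLE_ATTRIBUTE>
        <TAG>investigation type</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>collection date</TAG>
        <VALUE>2020-03-21</VALUE>
      </SAMPLE_ATTRIBUTE>
    </SAMPLE_ATTRIBUTES>
  </SAMPLE>
</SAMPLE_SET>"""

attributes = sample_attributes(xml_text)
# List the tags that still have an empty value
missing = [tag for tag, value in attributes["MT326090.1"].items() if not value]
print(missing)
```

Against the live service, xml_text would be the .text of the /api/ebi/ response instead of a literal string.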

2 Configure emacs to run tests

Execute a code block with C-c C-c. You may need to set

(org-babel-do-load-languages
 'org-babel-load-languages
 '((python . t)))
(setq org-babel-python-command "python3")
(setq org-babel-eval-verbose t)

To skip confirmations you may also want to set

(setq org-confirm-babel-evaluate nil)

To see the output of the interpreter, open the Python buffer.


Created by Pjotr Prins (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!
Modified 2020-11-05 Thu 05:21