PubSeq REST API
1 PubSeq REST API
Here we document the public REST API that comes with PubSeq. The tests run in the amazing emacs org-babel. See the bottom of this document for running the tests inside emacs.
1.1 Introduction
We built a REST API for COVID-19 PubSeq. The API source code can be found in api.py. To see if the service is up try
curl http://covid19.genenetwork.org/api/version
{
"service": "PubSeq",
"version": 0.1
}
The Python3 version is
import requests

baseURL = "http://localhost:5000"  # for development
# baseURL = "http://covid19.genenetwork.org"
response = requests.get(baseURL + "/api/version")
response_body = response.json()
assert response_body["service"] == "PubSeq", "PubSeq API not found"
response_body
| service | : | PubSeq |
| version | : | 0.1 |
1.2 Search for an entry
When you use the search box on PubSeq, it queries the REST endpoint for information on the search terms. For example
requests.get(baseURL+"/api/search?s=MT533203.1").json()
where collection is the raw uploaded data. The hash value in c= is
computed on the contents of the Arvados keep collection and effectively
acts as a deduplication uuid.
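The `c=` value can be pulled out of a collection link with a few lines of Python. This is a minimal sketch, assuming the link has the form `http://collections.lugli.arvadosapi.com/c=<hash>` used in the examples in this document; `collection_hash` is a hypothetical helper and not part of the PubSeq API.

```python
from urllib.parse import urlparse


def collection_hash(collection_url):
    """Return the value after 'c=' in a collection URL (hypothetical helper)."""
    # The path looks like "/c=<hash>" or "/c=<hash>/sequence.fasta"
    path = urlparse(collection_url).path
    for segment in path.split("/"):
        if segment.startswith("c="):
            return segment[len("c="):]
    raise ValueError("no c= segment found in " + collection_url)


url = "http://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126"
print(collection_hash(url))
# 0015b0d65dfd2e82bb3cee4436bf2893+126
```

Because the hash is computed on the collection contents, two links that yield the same value should point at the same uploaded data.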
1.3 Fetch metadata
Using the collection link above, you can fetch the metadata in JSON as it was originally uploaded from the ShEx expression, e.g. using https://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126/
It is better, however, to use the more advanced sample metadata fetcher, because it does a bit more in terms of expansion
requests.get(baseURL+"/api/sample/MT533203.1.json").json()
| collection | : | http://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126 |
| date | : | 2020-04-27 |
| fasta | : | http://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126/sequence.fasta |
| id | : | MT533203.1 |
| info | : | http://identifiers.org/insdc/MT533203.1#sequence |
| mapper | : | minimap v. 2.17 |
| sequencer | : | http://www.ebi.ac.uk/efo/EFO_0008632 |
| specimen | : | http://purl.obolibrary.org/obo/NCIT_C155831 |
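When calling this endpoint from a script, it may help to wrap the request and check the result before using it. The following is a sketch only: `sample_endpoint`, `fetch_sample` and `missing_keys` are hypothetical helpers, and `EXPECTED_KEYS` is simply the set of fields seen in the example output above, not a documented schema.

```python
# Fields observed in the example response; an assumption, not a formal schema
EXPECTED_KEYS = {"collection", "date", "fasta", "id",
                 "info", "mapper", "sequencer", "specimen"}


def sample_endpoint(base_url, sample_id):
    """Build the sample metadata URL for a given sample id."""
    return f"{base_url.rstrip('/')}/api/sample/{sample_id}.json"


def fetch_sample(base_url, sample_id, timeout=10):
    """Fetch sample metadata as a dict; raises on HTTP errors."""
    import requests  # third-party; imported here so the helpers above work without it
    response = requests.get(sample_endpoint(base_url, sample_id), timeout=timeout)
    response.raise_for_status()
    return response.json()


def missing_keys(metadata):
    """Return the expected fields that are absent from a metadata dict."""
    return sorted(EXPECTED_KEYS - set(metadata))


print(sample_endpoint("http://localhost:5000", "MT533203.1"))
# http://localhost:5000/api/sample/MT533203.1.json
```

A call such as `missing_keys(fetch_sample(baseURL, "MT533203.1"))` would then return an empty list when all of the fields shown above are present.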
1.4 Fetch EBI XML
PubSeq provides an API that is used to export formats that are suitable for uploading data to EBI/ENA from our EXPORT menu. This is documented here.
requests.get(baseURL+"/api/ebi/sample-MT326090.1.xml").text
<?xml version="1.0" encoding="UTF-8"?>
<SAMPLE_SET>
<SAMPLE alias="MT326090.1" center_name="COVID-19 PubSeq">
<TITLE>COVID-19 PubSeq Sample</TITLE>
<SAMPLE_NAME>
<TAXON_ID>2697049</TAXON_ID>
<SCIENTIFIC_NAME>Severe acute respiratory syndrome coronavirus 2</SCIENTIFIC_NAME>
<COMMON_NAME>SARS-CoV-2</COMMON_NAME>
</SAMPLE_NAME>
<SAMPLE_ATTRIBUTES>
<SAMPLE_ATTRIBUTE>
<TAG>investigation type</TAG>
<VALUE></VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>sequencing method</TAG>
<VALUE>http://purl.obolibrary.org/obo/OBI_0000759</VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>collection date</TAG>
<VALUE>2020-03-21</VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>geographic location (latitude)</TAG>
<VALUE></VALUE>
<UNITS>DD</UNITS>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>geographic location (longitude)</TAG>
<VALUE></VALUE>
<UNITS>DD</UNITS>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>geographic location (country and/or sea)</TAG>
<VALUE></VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>geographic location (region and locality)</TAG>
<VALUE></VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>environment (material)</TAG>
<VALUE>http://purl.obolibrary.org/obo/NCIT_C155831</VALUE>
</SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE>
<TAG>ENA-CHECKLIST</TAG>
<VALUE>ERC000011</VALUE>
</SAMPLE_ATTRIBUTE>
</SAMPLE_ATTRIBUTES>
</SAMPLE>
</SAMPLE_SET>
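The returned XML can be inspected with the Python standard library before it is uploaded. The sketch below assumes the response has the `SAMPLE_SET` / `SAMPLE` / `SAMPLE_ATTRIBUTE` layout shown above; `sample_attributes` is a hypothetical helper, and the embedded `example` string is a shortened copy of that output.

```python
import xml.etree.ElementTree as ET


def sample_attributes(xml_text):
    """Map each SAMPLE_ATTRIBUTE TAG to its VALUE (empty string if unset)."""
    root = ET.fromstring(xml_text)
    attributes = {}
    for attribute in root.iter("SAMPLE_ATTRIBUTE"):
        tag = attribute.findtext("TAG")
        value = attribute.findtext("VALUE") or ""
        attributes[tag] = value
    return attributes


# Shortened copy of the example response above
example = """<?xml version="1.0" encoding="UTF-8"?>
<SAMPLE_SET>
<SAMPLE alias="MT326090.1" center_name="COVID-19 PubSeq">
<SAMPLE_ATTRIBUTES>
<SAMPLE_ATTRIBUTE><TAG>investigation type</TAG><VALUE></VALUE></SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE><TAG>collection date</TAG><VALUE>2020-03-21</VALUE></SAMPLE_ATTRIBUTE>
<SAMPLE_ATTRIBUTE><TAG>ENA-CHECKLIST</TAG><VALUE>ERC000011</VALUE></SAMPLE_ATTRIBUTE>
</SAMPLE_ATTRIBUTES>
</SAMPLE>
</SAMPLE_SET>"""

attrs = sample_attributes(example)
print(attrs["collection date"])
# 2020-03-21

# Tags whose VALUE is still empty
print([tag for tag, value in attrs.items() if not value])
# ['investigation type']
```

Listing the empty values this way is a quick check for fields, such as the geographic location entries above, that still need to be filled in.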
2 Configure emacs to run tests
Execute a code block with C-c C-c. You may need to set
(org-babel-do-load-languages 'org-babel-load-languages '((python . t)))
(setq org-babel-python-command "python3")
(setq org-babel-eval-verbose t)
To skip confirmations you may also want to set
(setq org-confirm-babel-evaluate nil)
To see the output of the interpreter, open the Python buffer.