PubSeq REST API
1 PubSeq REST API
Here we document the public REST API that comes with PubSeq. The tests run in Emacs org-babel; see the bottom of this document for how to run them inside Emacs.
1.1 Introduction
We built a REST API for COVID-19 PubSeq. The API source code can be found in api.py. To see if the service is up, try:
curl http://covid19.genenetwork.org/api/version
{ "service": "PubSeq", "version": 0.1 }
The Python3 version is
import requests
baseURL="http://localhost:5067" # for development
# baseURL="http://covid19.genenetwork.org"
response = requests.get(baseURL+"/api/version")
response_body = response.json()
assert response_body["service"] == "PubSeq", "PubSeq API not found"
response_body
| service | : | PubSeq |
| version | : | 0.1 |
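For scripts that run outside org-babel, the same check can be split into a network call and a pure validation step, which makes the validation testable without a running server. This is a minimal sketch using only the standard library; the function names `is_pubseq` and `fetch_version` and the 10 second timeout are illustrative assumptions, not part of api.py.

```python
import json
import urllib.request


def is_pubseq(body):
    """Return True if a decoded /api/version body identifies the PubSeq service."""
    return isinstance(body, dict) and body.get("service") == "PubSeq"


def fetch_version(base_url, timeout=10):
    """Fetch and decode /api/version (hypothetical helper).

    Raises urllib.error.URLError if the service cannot be reached.
    """
    with urllib.request.urlopen(base_url + "/api/version", timeout=timeout) as resp:
        return json.load(resp)
```

A caller would then write something like `is_pubseq(fetch_version("http://localhost:5067"))` against a development server.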
1.2 Search for an entry
When you use the search box on PubSeq, it queries the REST endpoint for information on the search terms. For example:
requests.get(baseURL+"/api/search?s=MT533203.1").json()
In the response, collection is the raw uploaded data. The hash value in c= is computed on the contents of the Arvados Keep collection and effectively acts as a deduplication UUID.
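To illustrate how the c= value can serve as a deduplication key, the sketch below extracts it from a collection URL. It assumes the value has the form of 32 hexadecimal characters followed by `+` and a number, as in the collection links shown in this document; the helper names `collection_hash` and `same_upload` are hypothetical.

```python
import re

# Assumed shape of the c= value: <32 hex chars>+<number>
_COLLECTION_RE = re.compile(r"c=([0-9a-f]{32})\+(\d+)")


def collection_hash(url):
    """Return (hash, size) parsed from a collection URL, or raise ValueError."""
    match = _COLLECTION_RE.search(url)
    if match is None:
        raise ValueError("no c=<hash>+<size> component in %r" % url)
    return match.group(1), int(match.group(2))


def same_upload(url_a, url_b):
    """Treat two collection URLs as duplicates when their hashes are equal."""
    return collection_hash(url_a)[0] == collection_hash(url_b)[0]
```

Comparing the hash rather than the whole URL means that a trailing slash or a path such as `/sequence.fasta` does not affect the comparison.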
1.3 Fetch metadata
Using the collection link above, you can fetch the metadata in JSON as it was originally uploaded from the ShEx expression, e.g. using https://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126/
It is better, however, to use the more advanced sample metadata fetcher, because it does a bit more in terms of expansion:
requests.get(baseURL+"/api/sample/MT533203.1.json").json()
| collection | : | http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126 |
| date | : | 2020-04-27 |
| fasta | : | http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126/sequence.fasta |
| id | : | MT533203.1 |
| info | : | http://identifiers.org/insdc/MT533203.1#sequence |
| mapper | : | minimap v. 2.17 |
| sequencer | : | http://www.ebi.ac.uk/efo/EFO_0008632 |
| specimen | : | http://purl.obolibrary.org/obo/NCIT_C155831 |
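A client will often want a compact view of such a record. The following sketch assumes the record has the keys shown in the result above (`id`, `collection`, `fasta`, `sequencer`, `specimen`); the helpers `ontology_term` and `summarize_sample` are hypothetical and not part of the PubSeq API.

```python
def ontology_term(uri):
    """Return the last path segment of an ontology URI, e.g. 'EFO_0008632'."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def summarize_sample(record):
    """Reduce a sample metadata dict to its id, short ontology terms,
    and a check that the FASTA file lives inside the collection."""
    collection = record["collection"].rstrip("/")
    return {
        "id": record["id"],
        "fasta_in_collection": record["fasta"].startswith(collection + "/"),
        "sequencer": ontology_term(record["sequencer"]),
        "specimen": ontology_term(record["specimen"]),
    }
```

The record passed in would be the dictionary returned by the /api/sample request above.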
1.4 Fetch EBI XML
PubSeq provides an API, used by our EXPORT menu, that exports data in formats suitable for uploading to EBI/ENA. This is documented here.
requests.get(baseURL+"/api/ebi/sample-MT326090.1.xml").text
<?xml version="1.0" encoding="UTF-8"?>
<SAMPLE_SET>
  <SAMPLE alias="MT326090.1" center_name="COVID-19 PubSeq">
    <TITLE>COVID-19 PubSeq Sample</TITLE>
    <SAMPLE_NAME>
      <TAXON_ID>2697049</TAXON_ID>
      <SCIENTIFIC_NAME>Severe acute respiratory syndrome coronavirus 2</SCIENTIFIC_NAME>
      <COMMON_NAME>SARS-CoV-2</COMMON_NAME>
    </SAMPLE_NAME>
    <SAMPLE_ATTRIBUTES>
      <SAMPLE_ATTRIBUTE>
        <TAG>investigation type</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>sequencing method</TAG>
        <VALUE>http://purl.obolibrary.org/obo/OBI_0000759</VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>collection date</TAG>
        <VALUE>2020-03-21</VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (latitude)</TAG>
        <VALUE></VALUE>
        <UNITS>DD</UNITS>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (longitude)</TAG>
        <VALUE></VALUE>
        <UNITS>DD</UNITS>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (country and/or sea)</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>geographic location (region and locality)</TAG>
        <VALUE></VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>environment (material)</TAG>
        <VALUE>http://purl.obolibrary.org/obo/NCIT_C155831</VALUE>
      </SAMPLE_ATTRIBUTE>
      <SAMPLE_ATTRIBUTE>
        <TAG>ENA-CHECKLIST</TAG>
        <VALUE>ERC000011</VALUE>
      </SAMPLE_ATTRIBUTE>
    </SAMPLE_ATTRIBUTES>
  </SAMPLE>
</SAMPLE_SET>
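Before submitting such a document it can be useful to see which attributes are still empty. The sketch below parses the XML with the standard library and collects each TAG/VALUE pair into a dictionary; it assumes the element names shown above, and the function names `sample_attributes` and `missing_attributes` are hypothetical.

```python
import xml.etree.ElementTree as ET


def sample_attributes(xml_text):
    """Map each SAMPLE_ATTRIBUTE TAG to its VALUE ('' when the value is empty)."""
    # Encode to bytes so the encoding declaration in the XML header is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    attributes = {}
    for attribute in root.iter("SAMPLE_ATTRIBUTE"):
        tag = attribute.findtext("TAG")
        attributes[tag] = attribute.findtext("VALUE") or ""
    return attributes


def missing_attributes(xml_text):
    """Return the sorted tags whose VALUE is empty."""
    return sorted(tag for tag, value in sample_attributes(xml_text).items() if not value)
```

Here `xml_text` would be the string returned by the `.text` request above.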
2 Configure emacs to run tests
Execute a code block with C-c C-c. You may need to set
(org-babel-do-load-languages
 'org-babel-load-languages
 '((python . t)))
(setq org-babel-python-command "python3")
(setq org-babel-eval-verbose t)
To skip confirmations you may also want to set
(setq org-confirm-babel-evaluate nil)
To see the output of the interpreter, open the Python buffer.