COVID-19 PubSeq Sample

# C-c C-e h h   publish
# C-c !         insert date (use . for active agenda, C-u C-c ! for date+time, C-u C-c . for time)
# C-c C-t       task rotate
# RSS_IMAGE_URL: http://xxxx.xxxx.free.fr/rss_icon.png
# C-c C-c       to run test blocks
#
# This page runs tests and the HTML export doubles as documentation on
# http://covid19.genenetwork.org/apidoc

#+TITLE: PubSeq REST API
#+AUTHOR: Pjotr Prins
#+HTML_LINK_HOME: http://covid19.genenetwork.org/apidoc
# OPTIONS: section-numbers: nil, with-drawers: t
#+HTML_HEAD:

* PubSeq REST API

Here we document the public REST API that comes with PubSeq. The tests
run in emacs [[https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-python.html][org-babel]]. See the bottom of this document for running
the tests inside emacs.

** Introduction

We built a REST API for COVID-19 PubSeq. The API source code can be
found in [[https://github.com/arvados/bh20-seq-resource/tree/master/bh20simplewebuploader/api.py][api.py]].

To see if the service is up try

#+begin_src sh
curl http://covid19.genenetwork.org/api/version
#+end_src

#+begin_src js
{
  "service": "PubSeq",
  "version": 0.1
}
#+end_src

The current API can fetch data

#+begin_src js
curl "http://covid19.genenetwork.org/api/search?s=MT533203.1"
[
  {
    "collection": "http://covid19.genenetwork.org/resource",
    "fasta": "http://covid19.genenetwork.org/resource/lugli-4zz18-uovend31hdwa5ks",
    "id": "MT533203.1",
    "info": "http://identifiers.org/insdc/MT533203.1#sequence"
  }
]

curl http://covid19.genenetwork.org/api/sample/MT533203.1.json
[
  {
    "collection": "http://covid19.genenetwork.org/resource",
    "date": "2020-04-27",
    "fasta": "http://covid19.genenetwork.org/resource/lugli-4zz18-uovend31hdwa5ks",
    "id": "MT533203.1",
    "info": "http://identifiers.org/insdc/MT533203.1#sequence",
    "mapper": "minimap v. 2.17",
    "sequencer": "http://www.ebi.ac.uk/efo/EFO_0008632",
    "specimen": "http://purl.obolibrary.org/obo/NCIT_C155831"
  }
]
#+end_src

The Python3 version is

#+begin_src python :session :exports both
import requests
baseURL="http://localhost:5067" # for development
# baseURL="http://covid19.genenetwork.org"
response = requests.get(baseURL+"/api/version")
response_body = response.json()
assert response_body["service"] == "PubSeq", "PubSeq API not found"
response_body
#+end_src

#+RESULTS:
| service | : | PubSeq | version | : | 0.1 |

** Search for an entry

When you use the search box on PubSeq it queries the REST endpoint for
information on the search items. For example

#+begin_src python :session :exports both
requests.get(baseURL+"/api/search?s=MT533203.1").json()
#+end_src

#+RESULTS:
| collection | : | http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126 | fasta | : | http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126/sequence.fasta | id | : | MT533203.1 | info | : | http://identifiers.org/insdc/MT533203.1#sequence |

where collection is the raw uploaded data. The hash value in ~c=~ is
computed on the contents of the Arvados keep [[https://doc.arvados.org/v2.0/user/tutorials/tutorial-keep-mount-gnu-linux.html][collection]] and
effectively acts as a deduplication uuid.

** Fetch metadata

Using the collection link above you can fetch the metadata in JSON as
it was originally uploaded from the SHeX expression, e.g. using
https://collections.lugli.arvadosapi.com/c=0015b0d65dfd2e82bb3cee4436bf2893+126/

It is better, however, to use the more advanced sample metadata
fetcher because it does a bit more in terms of expansion

#+begin_src python :session :exports both
requests.get(baseURL+"/api/sample/MT533203.1.json").json()
#+end_src

#+RESULTS:
| collection | : | http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126 | date | : | 2020-04-27 | fasta | : | http://collections.lugli.arvadosapi.com/c=b16901333ea1754a1e0409bf3caf7d22+126/sequence.fasta | id | : | MT533203.1 | info | : | http://identifiers.org/insdc/MT533203.1#sequence | mapper | : | minimap v. 2.17 | sequencer | : | http://www.ebi.ac.uk/efo/EFO_0008632 | specimen | : | http://purl.obolibrary.org/obo/NCIT_C155831 |

** Fetch EBI XML

PubSeq provides an API that is used to export formats that are
suitable for uploading data to EBI/ENA from our [[http://covid19.genenetwork.org/export][EXPORT]] menu. This is
documented [[http://covid19.genenetwork.org/blog?id=using-covid-19-pubseq-part6][here]].

#+begin_src python :session :exports both
requests.get(baseURL+"/api/ebi/sample-MT326090.1.xml").text
#+end_src

#+RESULTS:
#+begin_example
COVID-19 PubSeq Sample 2697049 Severe acute respiratory syndrome coronavirus 2 SARS-CoV-2 investigation type sequencing method http://purl.obolibrary.org/obo/OBI_0000759 collection date 2020-03-21 geographic location (latitude) DD geographic location (longitude) DD geographic location (country and/or sea) geographic location (region and locality) environment (material) http://purl.obolibrary.org/obo/NCIT_C155831 ENA-CHECKLIST ERC000011
#+end_example

* Configure emacs to run tests

Execute a code block with C-c C-c. You may need to set

#+begin_src elisp
(org-babel-do-load-languages
 'org-babel-load-languages
 '((python . t)))
(setq org-babel-python-command "python3")
(setq org-babel-eval-verbose t)
(setq org-confirm-babel-evaluate nil)
#+end_src

#+RESULTS:

To skip confirmations you may also want to set

: (setq org-confirm-babel-evaluate nil)

To see the output of the interpreter open the ~*Python*~ buffer.
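
The same checks can also be scripted outside emacs. The following is a minimal sketch, not part of PubSeq itself: the helper names (~version_url~, ~search_url~, ~sample_url~, ~check_version~, ~fetch_json~, ~main~) are hypothetical, and it uses Python's standard ~urllib~ instead of ~requests~ so that it has no third-party dependency. The endpoint paths and the version assertion are the ones documented on this page.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Public server named in this document; use http://localhost:5067 for development.
BASE_URL = "http://covid19.genenetwork.org"


def version_url(base=BASE_URL):
    """URL of the service check endpoint."""
    return base + "/api/version"


def search_url(accession, base=BASE_URL):
    """URL of the search endpoint; urlencode escapes the query value."""
    return base + "/api/search?" + urlencode({"s": accession})


def sample_url(accession, base=BASE_URL):
    """URL of the sample metadata endpoint (JSON)."""
    return base + "/api/sample/" + quote(accession) + ".json"


def check_version(body):
    """Repeat the assertion from the org-babel block and return the version."""
    assert body["service"] == "PubSeq", "PubSeq API not found"
    return body["version"]


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the JSON response body (needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def main():
    """Run the service check against BASE_URL and print the reported version."""
    print(check_version(fetch_json(version_url())))
```

Calling ~main()~ performs the live service check. The URL builders are kept separate from ~fetch_json~ so that they can be tested without a running server.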