From 8cf4fee8900e7b146f768791fb7909b334737297 Mon Sep 17 00:00:00 2001
From: Pjotr Prins
Date: Tue, 14 Jul 2020 11:29:44 +0100
Subject: Started documenting EBI submission
---
doc/blog/using-covid-19-pubseq-part6.org | 96 ++++++++++++++++++++++++++++++++
1 file changed, 96 insertions(+)
create mode 100644 doc/blog/using-covid-19-pubseq-part6.org
(limited to 'doc')
diff --git a/doc/blog/using-covid-19-pubseq-part6.org b/doc/blog/using-covid-19-pubseq-part6.org
new file mode 100644
index 0000000..2a7c593
--- /dev/null
+++ b/doc/blog/using-covid-19-pubseq-part6.org
@@ -0,0 +1,96 @@
+#+TITLE: COVID-19 PubSeq (part 6)
+#+AUTHOR: Pjotr Prins
+# C-c C-e h h publish
+# C-c ! insert date (use . for active agenda, C-u C-c ! for date, C-u C-c . for time)
+# C-c C-t task rotate
+# RSS_IMAGE_URL: http://xxxx.xxxx.free.fr/rss_icon.png
+
+#+HTML_HEAD:
+
+
+* Table of Contents :TOC:noexport:
+ - [[#generating-output-for-ebi][Generating output for EBI]]
+ - [[#defining-the-ebi-study][Defining the EBI study]]
+ - [[#define-the-ebi-sample][Define the EBI sample]]
+ - [[#define-the-ebi-sequence][Define the EBI sequence]]
+
+* Generating output for EBI
+
+Would it not be great if an uploader to PubSeq could also export samples
+to, say, EBI? That is what we discuss in this section. The submission
+process is somewhat laborious, so once you have submitted to PubSeq,
+why not export the same data to EBI with the least amount of effort?
+
+COVID-19 PubSeq is a data source - both sequence data and metadata -
+that can be used to push data to other repositories, such as EBI. You can
+register [[https://ena-docs.readthedocs.io/en/latest/submit/samples/programmatic.html][samples programmatically]] through an XML-based interface.
+
+EBI sequence resources are presented through the European Nucleotide Archive (ENA), for example
+[[https://www.ebi.ac.uk/ena/browser/view/MT394864][Sequence: MT394864.1]].
+
+EBI defines XML formats for
+
+- SUBMISSION
+- STUDY
+- SAMPLE
+- EXPERIMENT
+- RUN
+- ANALYSIS
+- DAC
+- POLICY
+- DATASET
+- PROJECT
+
+with the schemas listed [[ftp://ftp.ebi.ac.uk/pub/databases/ena/doc/xsd/sra_1_5/][here]]. Since we are submitting sequences we
+should follow the [[https://ena-docs.readthedocs.io/en/latest/submit/assembly.html][full genome assembly guidelines]] and the [[https://ena-docs.readthedocs.io/en/latest/submit/general-guide/programmatic.html][ENA
+guidelines]] for programmatic submission. The first step is to define the study, next the sample and
+finally the sequence (assembly).
+
+* Defining the EBI study
+
+How to define a study is described [[https://ena-docs.readthedocs.io/en/latest/submit/study/programmatic.html][here]]. A study definition looks like
+
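+#+BEGIN_SRC xml
+<PROJECT_SET>
+   <PROJECT alias="...">
+      <TITLE>Sequencing SARS-CoV-2 in the Washington DC area</TITLE>
+      <DESCRIPTION>This study collects samples from COVID-19 patients in the Washington DC area</DESCRIPTION>
+      <SUBMISSION_PROJECT>
+         <SEQUENCING_PROJECT/>
+      </SUBMISSION_PROJECT>
+   </PROJECT>
+</PROJECT_SET>
+#+END_SRC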
+
+A submission 'command' is also required, which looks like
+
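+#+BEGIN_SRC xml
+<?xml version="1.0" encoding="UTF-8"?>
+<SUBMISSION>
+   <ACTIONS>
+      <ACTION>
+         <ADD/>
+      </ACTION>
+      <ACTION>
+         <HOLD HoldUntilDate="..."/>
+      </ACTION>
+   </ACTIONS>
+</SUBMISSION>
+#+END_SRC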
+
+The Webin system (here the wwwdev test server) accepts these files using a command like
+
+: curl -u username:password -F "SUBMISSION=@submission.xml" -F "PROJECT=@project.xml" "https://wwwdev.ebi.ac.uk/ena/submit/drop-box/submit/"
+
+as described [[https://ena-docs.readthedocs.io/en/latest/submit/study/programmatic.html#submit-the-xmls-using-curl][here]].
+
+/work in progress (WIP)/
+
+* Define the EBI sample
+
+
+/work in progress (WIP)/
+
+* Define the EBI sequence
+
+/work in progress (WIP)/
--
cgit v1.2.3
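
The study and submission documents described in the patch above could be generated from PubSeq metadata rather than written by hand. The following is a minimal sketch, not the PubSeq implementation. It assumes the element names from ENA's programmatic study submission guide (PROJECT_SET, PROJECT, SUBMISSION_PROJECT, SEQUENCING_PROJECT, SUBMISSION, ACTIONS, ADD, HOLD), which should be checked against the published schemas. The function names, the alias value and the hold date are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_project_xml(alias: str, title: str, description: str) -> str:
    """Return a PROJECT_SET document holding one sequencing project.

    Element names are assumed from the ENA study submission guide.
    """
    project_set = ET.Element("PROJECT_SET")
    project = ET.SubElement(project_set, "PROJECT", alias=alias)
    ET.SubElement(project, "TITLE").text = title
    ET.SubElement(project, "DESCRIPTION").text = description
    submission_project = ET.SubElement(project, "SUBMISSION_PROJECT")
    ET.SubElement(submission_project, "SEQUENCING_PROJECT")
    return ET.tostring(project_set, encoding="unicode")


def build_submission_xml(hold_until=None) -> str:
    """Return a SUBMISSION document with an ADD action.

    If hold_until (a date string) is given, a HOLD action is appended.
    """
    submission = ET.Element("SUBMISSION")
    actions = ET.SubElement(submission, "ACTIONS")
    add_action = ET.SubElement(actions, "ACTION")
    ET.SubElement(add_action, "ADD")
    if hold_until is not None:
        hold_action = ET.SubElement(actions, "ACTION")
        ET.SubElement(hold_action, "HOLD", HoldUntilDate=hold_until)
    return ET.tostring(submission, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical alias and hold date, for illustration only.
    print(build_project_xml(
        "pubseq-example-study",
        "Sequencing SARS-CoV-2 in the Washington DC area",
        "This study collects samples from COVID-19 patients "
        "in the Washington DC area",
    ))
    print(build_submission_xml(hold_until="2021-01-01"))
```

The two strings can be saved as project.xml and submission.xml and posted with the curl command shown in the patch. Building the tree with ElementTree rather than string formatting means titles and descriptions containing characters such as & or < are escaped automatically.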