author: Pjotr Prins, 2020-07-12 12:25:24 +0100
committer: Pjotr Prins, 2020-07-12 12:25:24 +0100
commit: 3dd94e87c25ff0b2942dc59c919a9e6e45fe45be
tree: e5bc7e6498457efc90668d7673a423e01275c9a0 (/doc/blog)
parent: fba4474b5e2e7c069bb9158089ecb873ff8e6c5c
Docs: started on metadata modification
Diffstat (limited to 'doc/blog'):
-rw-r--r-- doc/blog/using-covid-19-pubseq-part4.html | 44
-rw-r--r-- doc/blog/using-covid-19-pubseq-part4.org | 21
-rw-r--r-- doc/blog/using-covid-19-pubseq-part5.html | 79
-rw-r--r-- doc/blog/using-covid-19-pubseq-part5.org | 39
4 files changed, 141 insertions, 42 deletions
diff --git a/doc/blog/using-covid-19-pubseq-part4.html b/doc/blog/using-covid-19-pubseq-part4.html
index 67d299e..b5a05ca 100644
--- a/doc/blog/using-covid-19-pubseq-part4.html
+++ b/doc/blog/using-covid-19-pubseq-part4.html
@@ -3,10 +3,10 @@
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
-<!-- 2020-05-30 Sat 11:52 -->
+<!-- 2020-07-12 Sun 06:24 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
-<title>&lrm;</title>
+<title>COVID-19 PubSeq (part 4)</title>
<meta name="generator" content="Org mode" />
<meta name="author" content="Pjotr Prins" />
<style type="text/css">
@@ -161,19 +161,6 @@
.footdef { margin-bottom: 1em; }
.figure { padding: 1em; }
.figure p { text-align: center; }
- .equation-container {
- display: table;
- text-align: center;
- width: 100%;
- }
- .equation {
- vertical-align: middle;
- }
- .equation-label {
- display: table-cell;
- text-align: right;
- vertical-align: middle;
- }
.inlinetask {
padding: 10px;
border: 2px solid gray;
@@ -193,12 +180,13 @@
.org-svg { width: 90%; }
/*]]>*/-->
</style>
+<link rel="Blog stylesheet" type="text/css" href="blog.css" />
<script type="text/javascript">
/*
@licstart The following is the entire license notice for the
JavaScript code in this tag.
-Copyright (C) 2012-2020 Free Software Foundation, Inc.
+Copyright (C) 2012-2018 Free Software Foundation, Inc.
The JavaScript code in this tag is free software: you can
redistribute it and/or modify it under the terms of the GNU
@@ -242,25 +230,41 @@ for the JavaScript code in this tag.
</head>
<body>
<div id="content">
+<h1 class="title">COVID-19 PubSeq (part 4)</h1>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
-<li><a href="#orgda6f48c">1. Modify Workflow</a></li>
+<li><a href="#org8f8b64a">1. What does this mean?</a></li>
+<li><a href="#orgcc7a403">2. Modify Workflow</a></li>
</ul>
</div>
</div>
-<div id="outline-container-orgda6f48c" class="outline-2">
-<h2 id="orgda6f48c"><span class="section-number-2">1</span> Modify Workflow</h2>
+
+
+<div id="outline-container-org8f8b64a" class="outline-2">
+<h2 id="org8f8b64a"><span class="section-number-2">1</span> What does this mean?</h2>
<div class="outline-text-2" id="text-1">
<p>
+This means that when someone uploads a SARS-CoV-2 sequence using one
+of our tools (CLI or web-based) they add a sequence and some metadata
+which triggers a rerun of our workflows.
+</p>
+</div>
+</div>
+
+
+<div id="outline-container-orgcc7a403" class="outline-2">
+<h2 id="orgcc7a403"><span class="section-number-2">2</span> Modify Workflow</h2>
+<div class="outline-text-2" id="text-2">
+<p>
<i>Work in progress!</i>
</p>
</div>
</div>
</div>
<div id="postamble" class="status">
-<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-05-30 Sat 11:52</small>.
+<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-07-12 Sun 06:24</small>.
</div>
</body>
</html>
diff --git a/doc/blog/using-covid-19-pubseq-part4.org b/doc/blog/using-covid-19-pubseq-part4.org
index 58a1f56..5fe71d1 100644
--- a/doc/blog/using-covid-19-pubseq-part4.org
+++ b/doc/blog/using-covid-19-pubseq-part4.org
@@ -1,3 +1,24 @@
+#+TITLE: COVID-19 PubSeq (part 4)
+#+AUTHOR: Pjotr Prins
+# C-c C-e h h publish
+# C-c ! insert date (use . for active agenda, C-u C-c ! for date, C-u C-c . for time)
+# C-c C-t task rotate
+# RSS_IMAGE_URL: http://xxxx.xxxx.free.fr/rss_icon.png
+
+#+HTML_HEAD: <link rel="Blog stylesheet" type="text/css" href="blog.css" />
+
+
+* Table of Contents :TOC:noexport:
+ - [[#what-does-this-mean][What does this mean?]]
+ - [[#modify-workflow][Modify Workflow]]
+
+* What does this mean?
+
+This means that when someone uploads a SARS-CoV-2 sequence using one
+of our tools (CLI or web-based) they add a sequence and some metadata
+which triggers a rerun of our workflows.
+
+
* Modify Workflow
/Work in progress!/
diff --git a/doc/blog/using-covid-19-pubseq-part5.html b/doc/blog/using-covid-19-pubseq-part5.html
index 30a3f83..80bf559 100644
--- a/doc/blog/using-covid-19-pubseq-part5.html
+++ b/doc/blog/using-covid-19-pubseq-part5.html
@@ -3,10 +3,10 @@
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
-<!-- 2020-05-30 Sat 11:59 -->
+<!-- 2020-07-12 Sun 06:24 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
-<title>&lrm;</title>
+<title>COVID-19 PubSeq (part 4)</title>
<meta name="generator" content="Org mode" />
<meta name="author" content="Pjotr Prins" />
<style type="text/css">
@@ -161,19 +161,6 @@
.footdef { margin-bottom: 1em; }
.figure { padding: 1em; }
.figure p { text-align: center; }
- .equation-container {
- display: table;
- text-align: center;
- width: 100%;
- }
- .equation {
- vertical-align: middle;
- }
- .equation-label {
- display: table-cell;
- text-align: right;
- vertical-align: middle;
- }
.inlinetask {
padding: 10px;
border: 2px solid gray;
@@ -193,12 +180,13 @@
.org-svg { width: 90%; }
/*]]>*/-->
</style>
+<link rel="Blog stylesheet" type="text/css" href="blog.css" />
<script type="text/javascript">
/*
@licstart The following is the entire license notice for the
JavaScript code in this tag.
-Copyright (C) 2012-2020 Free Software Foundation, Inc.
+Copyright (C) 2012-2018 Free Software Foundation, Inc.
The JavaScript code in this tag is free software: you can
redistribute it and/or modify it under the terms of the GNU
@@ -242,16 +230,22 @@ for the JavaScript code in this tag.
</head>
<body>
<div id="content">
+<h1 class="title">COVID-19 PubSeq (part 4)</h1>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
-<li><a href="#org31c224e">1. Modify Metadata</a></li>
+<li><a href="#org871ad58">1. Modify Metadata</a></li>
+<li><a href="#org07e8755">2. What is the schema?</a></li>
+<li><a href="#org4857280">3. How is the website generated?</a></li>
+<li><a href="#orge709ae2">4. Modifying the schema</a></li>
</ul>
</div>
</div>
-<div id="outline-container-org31c224e" class="outline-2">
-<h2 id="org31c224e"><span class="section-number-2">1</span> Modify Metadata</h2>
+
+
+<div id="outline-container-org871ad58" class="outline-2">
+<h2 id="org871ad58"><span class="section-number-2">1</span> Modify Metadata</h2>
<div class="outline-text-2" id="text-1">
<p>
The public sequence resource uses multiple data formats listed on the
@@ -265,13 +259,56 @@ data are listed <a href="./blog?id=using-covid-19-pubseq-part1">here</a>.
<p>
In this BLOG we are going to look at the metadata entered on the
-<a href="./">COVID-19 PubSeq</a> website (or command line client).
+<a href="./">COVID-19 PubSeq</a> website (or command line client). It is important to
+understand that anyone, including you, can change that information!
+</p>
+</div>
+</div>
+
+<div id="outline-container-org07e8755" class="outline-2">
+<h2 id="org07e8755"><span class="section-number-2">2</span> What is the schema?</h2>
+<div class="outline-text-2" id="text-2">
+<p>
+The default metadata schema is listed <a href="https://github.com/arvados/bh20-seq-resource/blob/master/bh20sequploader/bh20seq-schema.yml">here</a>.
+</p>
+</div>
+</div>
+
+<div id="outline-container-org4857280" class="outline-2">
+<h2 id="org4857280"><span class="section-number-2">3</span> How is the website generated?</h2>
+<div class="outline-text-2" id="text-3">
+<p>
+Using the schema we use <a href="https://pypi.org/project/PyShEx/">pyshex</a> shex expressions and <a href="https://github.com/common-workflow-language/schema_salad">schema salad</a> to
+generate the <a href="https://github.com/arvados/bh20-seq-resource/blob/edb17e7f7caebfa1e76b21006b1772a33f4f7887/bh20simplewebuploader/templates/form.html#L47">input form</a>, <a href="https://github.com/arvados/bh20-seq-resource/blob/edb17e7f7caebfa1e76b21006b1772a33f4f7887/bh20sequploader/qc_metadata.py#L13">validate</a> the user input and to build <a href="https://github.com/arvados/bh20-seq-resource/blob/edb17e7f7caebfa1e76b21006b1772a33f4f7887/workflows/pangenome-generate/merge-metadata.py#L24">RDF</a>!
+All from that one metadata schema.
+</p>
+</div>
+</div>
+
+<div id="outline-container-orge709ae2" class="outline-2">
+<h2 id="orge709ae2"><span class="section-number-2">4</span> Modifying the schema</h2>
+<div class="outline-text-2" id="text-4">
+<p>
+One of the first things we wanted to do was to add a field for the data
+license. Initially we only supported CC BY 4.0 as the default license, but
+now we want to give uploaders the option to choose the even more
+liberal CC0 license. The first step is to find a good ontology term
+for the field. Searching for `creative commons cc0 rdf' turned up this
+useful <a href="https://creativecommons.org/ns">page</a>. We also found an <a href="https://wiki.creativecommons.org/wiki/CC_License_Rdf_Overview">overview</a> where CC0 is represented as the URI
+<a href="https://creativecommons.org/publicdomain/zero/1.0/">https://creativecommons.org/publicdomain/zero/1.0/</a>. Meanwhile the
+attribution license is represented as <a href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</a>.
+According to this <a href="https://wiki.creativecommons.org/images/d/d6/Ccrel-1.0.pdf">document</a> we should really also add fields for
+attributionName and attributionURL.
+</p>
+
+<p>
+<i>Note: work in progress</i>
</p>
</div>
</div>
</div>
<div id="postamble" class="status">
-<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-05-30 Sat 11:59</small>.
+<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-07-12 Sun 06:24</small>.
</div>
</body>
</html>
diff --git a/doc/blog/using-covid-19-pubseq-part5.org b/doc/blog/using-covid-19-pubseq-part5.org
index 8d7504e..fe1908a 100644
--- a/doc/blog/using-covid-19-pubseq-part5.org
+++ b/doc/blog/using-covid-19-pubseq-part5.org
@@ -1,3 +1,19 @@
+#+TITLE: COVID-19 PubSeq (part 4)
+#+AUTHOR: Pjotr Prins
+# C-c C-e h h publish
+# C-c ! insert date (use . for active agenda, C-u C-c ! for date, C-u C-c . for time)
+# C-c C-t task rotate
+# RSS_IMAGE_URL: http://xxxx.xxxx.free.fr/rss_icon.png
+
+#+HTML_HEAD: <link rel="Blog stylesheet" type="text/css" href="blog.css" />
+
+
+* Table of Contents :TOC:noexport:
+ - [[#modify-metadata][Modify Metadata]]
+ - [[#what-is-the-schema][What is the schema?]]
+ - [[#how-is-the-website-generated][How is the website generated?]]
+ - [[#modifying-the-schema][Modifying the schema]]
+
* Modify Metadata
The public sequence resource uses multiple data formats listed on the
@@ -10,8 +26,29 @@ data are listed [[./blog?id=using-covid-19-pubseq-part1][here]].
In this BLOG we are going to look at the metadata entered on the
[[./][COVID-19 PubSeq]] website (or command line client). It is important to
-understand that you and us can change that information.
+understand that anyone, including you, can change that information!
* What is the schema?
+The default metadata schema is listed [[https://github.com/arvados/bh20-seq-resource/blob/master/bh20sequploader/bh20seq-schema.yml][here]].
+
* How is the website generated?
+
+Using the schema we use [[https://pypi.org/project/PyShEx/][pyshex]] shex expressions and [[https://github.com/common-workflow-language/schema_salad][schema salad]] to
+generate the [[https://github.com/arvados/bh20-seq-resource/blob/edb17e7f7caebfa1e76b21006b1772a33f4f7887/bh20simplewebuploader/templates/form.html#L47][input form]], [[https://github.com/arvados/bh20-seq-resource/blob/edb17e7f7caebfa1e76b21006b1772a33f4f7887/bh20sequploader/qc_metadata.py#L13][validate]] the user input and to build [[https://github.com/arvados/bh20-seq-resource/blob/edb17e7f7caebfa1e76b21006b1772a33f4f7887/workflows/pangenome-generate/merge-metadata.py#L24][RDF]]!
+All from that one metadata schema.
+
+* Modifying the schema
+
+One of the first things we wanted to do was to add a field for the data
+license. Initially we only supported CC BY 4.0 as the default license, but
+now we want to give uploaders the option to choose the even more
+liberal CC0 license. The first step is to find a good ontology term
+for the field. Searching for `creative commons cc0 rdf' turned up this
+useful [[https://creativecommons.org/ns][page]]. We also found an [[https://wiki.creativecommons.org/wiki/CC_License_Rdf_Overview][overview]] where CC0 is represented as the URI
+https://creativecommons.org/publicdomain/zero/1.0/. Meanwhile the
+attribution license is represented as https://creativecommons.org/licenses/by/4.0/.
+According to this [[https://wiki.creativecommons.org/images/d/d6/Ccrel-1.0.pdf][document]] we should really also add fields for
+attributionName and attributionURL.
+
+/Note: work in progress/