author | Pjotr Prins | 2020-07-15 11:43:35 +0100 |
---|---|---|
committer | GitHub | 2020-07-15 11:43:35 +0100 |
commit | a0c8ebd57b875f265e8b0efec4abfaf892eb6c45 (patch) | |
tree | b5477179d66540ab25634295112a2df47df30e27 /doc | |
parent | 3dd94e87c25ff0b2942dc59c919a9e6e45fe45be (diff) | |
parent | b5e38b960c380f0f7868d8fc4038ea3c3a0c52ee (diff) | |
Merge pull request #97 from pjotrp/master
Add license metadata to record
Diffstat (limited to 'doc')
-rw-r--r-- | doc/blog/using-covid-19-pubseq-part5.org | 68 |
-rw-r--r-- | doc/web/about.html | 143 |
2 files changed, 135 insertions, 76 deletions
diff --git a/doc/blog/using-covid-19-pubseq-part5.org b/doc/blog/using-covid-19-pubseq-part5.org index fe1908a..4b0ea64 100644 --- a/doc/blog/using-covid-19-pubseq-part5.org +++ b/doc/blog/using-covid-19-pubseq-part5.org @@ -40,7 +40,7 @@ All from that one metadata schema. * Modifying the schema -One of the first things we wanted to do is to add a field for the data +One of the first things we want to do is to add a field for the data license. Initially we only support CC-4.0 as a license by default, but now we want to give uploaders the option to make it an even more liberal CC0 license. The first step is to find a good ontology term @@ -51,4 +51,70 @@ attribution license https://creativecommons.org/licenses/by/4.0/. According to this [[https://wiki.creativecommons.org/images/d/d6/Ccrel-1.0.pdf][document]] we should really also add fields for attributionName and attributionURL. +A minimal triple should be + +: id xhtml:license <http://creativecommons.org/licenses/by/4.0/> . + +Other suggestions are + +: id dc:title "Description" . +: id cc:attributionName "Your Name" . +: id cc:attributionURL <http://resource.org/id> + +and 'dc:source' which indicates the original source of any modified +work, specified as a URI. +The prefix 'cc:' is an abbreviation for http://creativecommons.org/ns#. + +Going back to the schema, where does it fit? Under host, sample, +virus, technology or submitter block? It could fit under sample, but +actually the license concerns the whole metadata block and sequence, +so I think we can fit under its own license tag. For example + + +id: placeholder + +: license: +: license_type: http://creativecommons.org/licenses/by/4.0/ +: attribution_title: "Sample ID" +: attribution_name: "John doe, Joe Boe, Jonny Oe" +: attribution_url: http://covid19.genenetwork.org/id +: attribution_source: https://www.ncbi.nlm.nih.gov/pubmed/323088888 + +So, let's update the example. 
Notice the license info is optional - if it is missing +we just assume the default CC-4.0. + +One thing that is interesting is that in the name space https://creativecommons.org/ns there +is no mention of a title. I think it is useful, however, because we have no such field. +So, we'll add it simply as a title field. Now the draft schema is + +#+BEGIN_SRC js +- name: licenseSchema + type: record + fields: + license_type: + doc: License types as refined in https://wiki.creativecommons.org/images/d/d6/Ccrel-1.0.pdf + type: string? + jsonldPredicate: + _id: https://creativecommons.org/ns#License + title: + doc: Attribution title related to license + type: string? + jsonldPredicate: + _id: http://semanticscience.org/resource/SIO_001167 + attribution_url: + doc: Attribution URL related to license + type: string? + jsonldPredicate: + _id: https://creativecommons.org/ns#Work + attribution_source: + doc: Attribution source URL + type: string? + jsonldPredicate: + _id: https://creativecommons.org/ns#Work +#+END_SRC + +Now, we are no ontology experts, right? So, next we submit a patch to our source tree and +ask for feedback before wiring it up in the data entry form. The pull request was +submitted here FIXME. 
+ /Note: work in progress/ diff --git a/doc/web/about.html b/doc/web/about.html index c907e6c..9b16c92 100644 --- a/doc/web/about.html +++ b/doc/web/about.html @@ -3,7 +3,7 @@ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> <head> -<!-- 2020-05-29 Fri 08:27 --> +<!-- 2020-07-12 Sun 06:29 --> <meta http-equiv="Content-Type" content="text/html;charset=utf-8" /> <meta name="viewport" content="width=device-width, initial-scale=1" /> <title>About/FAQ</title> @@ -161,19 +161,6 @@ .footdef { margin-bottom: 1em; } .figure { padding: 1em; } .figure p { text-align: center; } - .equation-container { - display: table; - text-align: center; - width: 100%; - } - .equation { - vertical-align: middle; - } - .equation-label { - display: table-cell; - text-align: right; - vertical-align: middle; - } .inlinetask { padding: 10px; border: 2px solid gray; @@ -198,7 +185,7 @@ @licstart The following is the entire license notice for the JavaScript code in this tag. -Copyright (C) 2012-2020 Free Software Foundation, Inc. +Copyright (C) 2012-2018 Free Software Foundation, Inc. The JavaScript code in this tag is free software: you can redistribute it and/or modify it under the terms of the GNU @@ -247,29 +234,29 @@ for the JavaScript code in this tag. <h2>Table of Contents</h2> <div id="text-table-of-contents"> <ul> -<li><a href="#org783b5e9">1. What is the 'public sequence resource' about?</a></li> -<li><a href="#org2c0bcfd">2. Who created the public sequence resource?</a></li> -<li><a href="#org34070d3">3. How does the public sequence resource compare to other data resources?</a></li> -<li><a href="#org64a9493">4. Why should I upload my data here?</a></li> -<li><a href="#orgf898e7f">5. Why should I not upload by data here?</a></li> -<li><a href="#org828e164">6. How does the public sequence resource work?</a></li> -<li><a href="#org7b0d03f">7. 
Who uses the public sequence resource?</a></li> -<li><a href="#org31aaf23">8. Is this about open data?</a></li> -<li><a href="#orgb376b6c">9. Is this about free software?</a></li> -<li><a href="#orgf19cd96">10. How do I upload raw data?</a></li> -<li><a href="#orgebfed00">11. How do I change metadata?</a></li> -<li><a href="#orge2aecf8">12. How do I change the work flows?</a></li> -<li><a href="#orgd45b3bc">13. How do I change the source code?</a></li> -<li><a href="#org2bb9455">14. Should I choose CC-BY or CC0?</a></li> -<li><a href="#org62bf23f">15. How do I deal with private data and privacy?</a></li> -<li><a href="#org40c6da0">16. How do I communicate with you?</a></li> -<li><a href="#org1f27c44">17. Who are the sponsors?</a></li> +<li><a href="#orgac6ad8b">1. What is the 'public sequence resource' about?</a></li> +<li><a href="#org0c21c2e">2. Who created the public sequence resource?</a></li> +<li><a href="#org3fb8cb3">3. How does the public sequence resource compare to other data resources?</a></li> +<li><a href="#org6cd9ea2">4. Why should I upload my data here?</a></li> +<li><a href="#org0b6e3fb">5. Why should I not upload by data here?</a></li> +<li><a href="#org3eb3a4e">6. How does the public sequence resource work?</a></li> +<li><a href="#org7a397f5">7. Who uses the public sequence resource?</a></li> +<li><a href="#org92cb008">8. Is this about open data?</a></li> +<li><a href="#org232d6fa">9. Is this about free software?</a></li> +<li><a href="#orgd93869f">10. How do I upload raw data?</a></li> +<li><a href="#org88e8b0a">11. How do I change metadata?</a></li> +<li><a href="#orgd04b8f8">12. How do I change the work flows?</a></li> +<li><a href="#org5d1ee05">13. How do I change the source code?</a></li> +<li><a href="#orgae6461b">14. Should I choose CC-BY or CC0?</a></li> +<li><a href="#org3ea90a9">15. How do I deal with private data and privacy?</a></li> +<li><a href="#org7ff7106">16. How do I communicate with you?</a></li> +<li><a href="#org9566fa7">17. 
Who are the sponsors?</a></li> </ul> </div> </div> -<div id="outline-container-org783b5e9" class="outline-2"> -<h2 id="org783b5e9"><span class="section-number-2">1</span> What is the 'public sequence resource' about?</h2> +<div id="outline-container-orgac6ad8b" class="outline-2"> +<h2 id="orgac6ad8b"><span class="section-number-2">1</span> What is the 'public sequence resource' about?</h2> <div class="outline-text-2" id="text-1"> <p> The <b>public sequence resource</b> aims to provide a generic and useful @@ -280,17 +267,18 @@ sequence comparison and protein prediction. </div> </div> -<div id="outline-container-org2c0bcfd" class="outline-2"> -<h2 id="org2c0bcfd"><span class="section-number-2">2</span> Who created the public sequence resource?</h2> +<div id="outline-container-org0c21c2e" class="outline-2"> +<h2 id="org0c21c2e"><span class="section-number-2">2</span> Who created the public sequence resource?</h2> <div class="outline-text-2" id="text-2"> <p> The <b>public sequence resource</b> is an initiative by <a href="https://github.com/arvados/bh20-seq-resource/graphs/contributors">bioinformatics</a> and ontology experts who want to create something agile and useful for the wider research community. The initiative started at the COVID-19 biohackathon in April 2020 and is ongoing. The main project drivers -are Pjotr Prins (UTHSC), Peter Amstutz (Curii), Michael Crusoe (Common -Workflow Language), Thomas Liener (consultant, formerly EBI) and -Jerven Bolleman (Swiss Institute of Bioinformatics). +are Pjotr Prins (UTHSC), Peter Amstutz (Curii), Andrea Guarracino +(University of Rome Tor Vergata), Michael Crusoe (Common Workflow +Language), Thomas Liener (consultant, formerly EBI), Erik Garrison +(UCSC) and Jerven Bolleman (Swiss Institute of Bioinformatics). </p> <p> @@ -301,8 +289,8 @@ wrangling experts. Thank you everyone! 
</div> </div> -<div id="outline-container-org34070d3" class="outline-2"> -<h2 id="org34070d3"><span class="section-number-2">3</span> How does the public sequence resource compare to other data resources?</h2> +<div id="outline-container-org3fb8cb3" class="outline-2"> +<h2 id="org3fb8cb3"><span class="section-number-2">3</span> How does the public sequence resource compare to other data resources?</h2> <div class="outline-text-2" id="text-3"> <p> The short version is that we use state-of-the-art practices in @@ -312,17 +300,18 @@ to building out this resource! </p> <p> -Importantly: all data is published under the <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons 4.0 -attribution license</a> which means it data can be published and workflows -can run in public environments allowing for improved access for -research and reproducible results. This contrasts with some other -public resources, including GISAID. +Importantly: all data is published under either the <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons +4.0 attribution license</a> or the <a href="https://creativecommons.org/share-your-work/public-domain/cc0/">CC0 “No Rights Reserved” license</a> which +means it data can be published and workflows can run in public +environments allowing for improved access for research and +reproducible results. This contrasts with some other public resources, +including GISAID. </p> </div> </div> -<div id="outline-container-org64a9493" class="outline-2"> -<h2 id="org64a9493"><span class="section-number-2">4</span> Why should I upload my data here?</h2> +<div id="outline-container-org6cd9ea2" class="outline-2"> +<h2 id="org6cd9ea2"><span class="section-number-2">4</span> Why should I upload my data here?</h2> <div class="outline-text-2" id="text-4"> <ol class="org-ol"> <li>We champion truly shareable data without licensing restrictions - with proper @@ -353,8 +342,8 @@ multiple resources. 
</div> </div> -<div id="outline-container-orgf898e7f" class="outline-2"> -<h2 id="orgf898e7f"><span class="section-number-2">5</span> Why should I not upload by data here?</h2> +<div id="outline-container-org0b6e3fb" class="outline-2"> +<h2 id="org0b6e3fb"><span class="section-number-2">5</span> Why should I not upload by data here?</h2> <div class="outline-text-2" id="text-5"> <p> Funny question. There are only good reasons to upload your data here @@ -376,8 +365,8 @@ for bulk uploads! </div> </div> -<div id="outline-container-org828e164" class="outline-2"> -<h2 id="org828e164"><span class="section-number-2">6</span> How does the public sequence resource work?</h2> +<div id="outline-container-org3eb3a4e" class="outline-2"> +<h2 id="org3eb3a4e"><span class="section-number-2">6</span> How does the public sequence resource work?</h2> <div class="outline-text-2" id="text-6"> <p> On uploading a sequence with metadata it will automatically be @@ -388,8 +377,8 @@ using workflows from the High Performance Open Biology Lab defined </div> </div> -<div id="outline-container-org7b0d03f" class="outline-2"> -<h2 id="org7b0d03f"><span class="section-number-2">7</span> Who uses the public sequence resource?</h2> +<div id="outline-container-org7a397f5" class="outline-2"> +<h2 id="org7a397f5"><span class="section-number-2">7</span> Who uses the public sequence resource?</h2> <div class="outline-text-2" id="text-7"> <p> The Swiss Institute of Bioinformatics has included this data in @@ -397,14 +386,18 @@ The Swiss Institute of Bioinformatics has included this data in </p> <p> +The Pantograph <a href="https://graph-genome.github.io/">viewer</a> uses PubSeq data for their visualisations. +</p> + +<p> <a href="https://uthsc.edu">UTHSC</a> and <a href="https://www.ornl.gov/news/ornl-fight-against-covid-19">ORNL</a> use COVID-19 PubSeq data for protein prediction and drug development. 
</p> </div> </div> -<div id="outline-container-org31aaf23" class="outline-2"> -<h2 id="org31aaf23"><span class="section-number-2">8</span> Is this about open data?</h2> +<div id="outline-container-org92cb008" class="outline-2"> +<h2 id="org92cb008"><span class="section-number-2">8</span> Is this about open data?</h2> <div class="outline-text-2" id="text-8"> <p> All data is published under a <a href="https://creativecommons.org/licenses/by/4.0/">Creative Commons 4.0 attribution license</a> @@ -414,8 +407,8 @@ data and store it for further processing. </div> </div> -<div id="outline-container-orgb376b6c" class="outline-2"> -<h2 id="orgb376b6c"><span class="section-number-2">9</span> Is this about free software?</h2> +<div id="outline-container-org232d6fa" class="outline-2"> +<h2 id="org232d6fa"><span class="section-number-2">9</span> Is this about free software?</h2> <div class="outline-text-2" id="text-9"> <p> Absolutely. Free software allows for fully reproducible pipelines. You @@ -424,8 +417,8 @@ can take our workflows and data and run it elsewhere! </div> </div> -<div id="outline-container-orgf19cd96" class="outline-2"> -<h2 id="orgf19cd96"><span class="section-number-2">10</span> How do I upload raw data?</h2> +<div id="outline-container-orgd93869f" class="outline-2"> +<h2 id="orgd93869f"><span class="section-number-2">10</span> How do I upload raw data?</h2> <div class="outline-text-2" id="text-10"> <p> We are preparing raw sequence data pipelines (fastq and BAM). The @@ -440,8 +433,8 @@ assembly variations into consideration. This is all work in progress. 
</div> </div> -<div id="outline-container-orgebfed00" class="outline-2"> -<h2 id="orgebfed00"><span class="section-number-2">11</span> How do I change metadata?</h2> +<div id="outline-container-org88e8b0a" class="outline-2"> +<h2 id="org88e8b0a"><span class="section-number-2">11</span> How do I change metadata?</h2> <div class="outline-text-2" id="text-11"> <p> See the <a href="http://covid19.genenetwork.org/blog">http://covid19.genenetwork.org/blog</a>! @@ -449,8 +442,8 @@ See the <a href="http://covid19.genenetwork.org/blog">http://covid19.genenetwork </div> </div> -<div id="outline-container-orge2aecf8" class="outline-2"> -<h2 id="orge2aecf8"><span class="section-number-2">12</span> How do I change the work flows?</h2> +<div id="outline-container-orgd04b8f8" class="outline-2"> +<h2 id="orgd04b8f8"><span class="section-number-2">12</span> How do I change the work flows?</h2> <div class="outline-text-2" id="text-12"> <p> See the <a href="http://covid19.genenetwork.org/blog">http://covid19.genenetwork.org/blog</a>! @@ -458,8 +451,8 @@ See the <a href="http://covid19.genenetwork.org/blog">http://covid19.genenetwork </div> </div> -<div id="outline-container-orgd45b3bc" class="outline-2"> -<h2 id="orgd45b3bc"><span class="section-number-2">13</span> How do I change the source code?</h2> +<div id="outline-container-org5d1ee05" class="outline-2"> +<h2 id="org5d1ee05"><span class="section-number-2">13</span> How do I change the source code?</h2> <div class="outline-text-2" id="text-13"> <p> Go to our <a href="https://github.com/arvados/bh20-seq-resource">source code repositories</a>, fork/clone the repository, change @@ -469,8 +462,8 @@ many PRs we already merged. 
</div> </div> -<div id="outline-container-org2bb9455" class="outline-2"> -<h2 id="org2bb9455"><span class="section-number-2">14</span> Should I choose CC-BY or CC0?</h2> +<div id="outline-container-orgae6461b" class="outline-2"> +<h2 id="orgae6461b"><span class="section-number-2">14</span> Should I choose CC-BY or CC0?</h2> <div class="outline-text-2" id="text-14"> <p> Restrictive data licenses are hampering data sharing and reproducible @@ -486,8 +479,8 @@ In all honesty: we prefer both data and software to be free. </div> </div> -<div id="outline-container-org62bf23f" class="outline-2"> -<h2 id="org62bf23f"><span class="section-number-2">15</span> How do I deal with private data and privacy?</h2> +<div id="outline-container-org3ea90a9" class="outline-2"> +<h2 id="org3ea90a9"><span class="section-number-2">15</span> How do I deal with private data and privacy?</h2> <div class="outline-text-2" id="text-15"> <p> A public sequence resource is about public data. Metadata can refer to @@ -498,8 +491,8 @@ plan to combine identifiers with clinical data stored securely at </div> </div> -<div id="outline-container-org40c6da0" class="outline-2"> -<h2 id="org40c6da0"><span class="section-number-2">16</span> How do I communicate with you?</h2> +<div id="outline-container-org7ff7106" class="outline-2"> +<h2 id="org7ff7106"><span class="section-number-2">16</span> How do I communicate with you?</h2> <div class="outline-text-2" id="text-16"> <p> We use a <a href="https://gitter.im/arvados/pubseq?utm_source=share-link&utm_medium=link&utm_campaign=share-link">gitter channel</a> you can join. 
@@ -507,8 +500,8 @@ We use a <a href="https://gitter.im/arvados/pubseq?utm_source=share-link&utm </div> </div> -<div id="outline-container-org1f27c44" class="outline-2"> -<h2 id="org1f27c44"><span class="section-number-2">17</span> Who are the sponsors?</h2> +<div id="outline-container-org9566fa7" class="outline-2"> +<h2 id="org9566fa7"><span class="section-number-2">17</span> Who are the sponsors?</h2> <div class="outline-text-2" id="text-17"> <p> The main sponsors are listed in the footer. In addition to the time @@ -519,7 +512,7 @@ for donating COVID-19 related compute time. </div> </div> <div id="postamble" class="status"> -<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-05-29 Fri 08:26</small>. +<hr><small>Created by <a href="http://thebird.nl/">Pjotr Prins</a> (pjotr.public768 at thebird 'dot' nl) using Emacs org-mode and a healthy dose of Lisp!<br />Modified 2020-07-12 Sun 04:54</small>. </div> </body> </html> |
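Note on the license block added to the blog post in this commit: the post lists a minimal `xhtml:license` triple plus optional `dc:title`, `cc:attributionName`, `cc:attributionURL` and `dc:source` triples, and says a missing license falls back to the CC-BY 4.0 default. A minimal sketch of that mapping follows. It is not part of the commit; the helper name `license_triples`, the `PREDICATES` table, and the pairing of the example field names (`attribution_title`, `attribution_name`, `attribution_url`, `attribution_source`) with those predicates are assumptions made for illustration only.

```python
# Hypothetical helper, not code from this commit: it turns the example
# license block from the blog post into the triples the post lists.

# Default named in the post: a missing license is treated as CC-BY 4.0.
DEFAULT_LICENSE = "http://creativecommons.org/licenses/by/4.0/"

# Assumed pairing of the example field names with the predicates in the post.
# "literal" values are emitted in quotes, "uri" values in angle brackets.
PREDICATES = {
    "attribution_title": ("dc:title", "literal"),
    "attribution_name": ("cc:attributionName", "literal"),
    "attribution_url": ("cc:attributionURL", "uri"),
    "attribution_source": ("dc:source", "uri"),
}


def license_triples(subject, license_block=None):
    """Return a list of triple strings for one record's license block."""
    block = license_block or {}
    license_uri = block.get("license_type", DEFAULT_LICENSE)
    triples = [f"{subject} xhtml:license <{license_uri}> ."]
    for field, (predicate, kind) in PREDICATES.items():
        value = block.get(field)
        if value is None:
            continue  # every attribution field is optional
        if kind == "uri":
            obj = f"<{value}>"
        else:
            # Escape backslashes and double quotes inside literals.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        triples.append(f"{subject} {predicate} {obj} .")
    return triples
```

Calling `license_triples("id")` with no license block yields only the default `xhtml:license` triple, which mirrors the "license info is optional" remark in the post. Whether the draft `licenseSchema` in the diff ends up using these predicates is left open there, so treat the mapping as a placeholder.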