path: root/workflows
Age  Commit message  Author
2021-01-03  genbank: more or less complete. Need to add collection method  Pjotr Prins
2021-01-03  genbank: deal with host, sex and age  Pjotr Prins
2021-01-03  genbank: technology parsing  Pjotr Prins
2021-01-03  genbank: submitter info  Pjotr Prins
2021-01-03  genbank: get authors  Pjotr Prins
2021-01-03  Move reference code to different file so it does not break python  Pjotr Prins
2021-01-02  GenBank date parsing  Pjotr Prins
2021-01-02  transform-genbank-xml2yamlfa.py refactoring  Pjotr Prins
2021-01-02  transform-genbank-xml2yamlfa.py rewrite  Pjotr Prins
2021-01-01  genbank: minor fixes  Pjotr Prins
2021-01-01  gzip output  Pjotr Prins
2021-01-01  update-from-genbank.py  Pjotr Prins
2021-01-01  genbank-fetch-ids.py  Pjotr Prins
2021-01-01  genbank-fetch-ids  Pjotr Prins
2021-01-01  genbank: cleaning up  Pjotr Prins
2021-01-01  genbank-fetch-ids simple call  Pjotr Prins
2021-01-01  sparql: make use of pattern matching  Pjotr Prins
2020-12-31  Add comment  Pjotr Prins
2020-12-31  Improve SPARQL query and comments  Pjotr Prins
2020-12-31  genbank: sparql-fetch-ids  Pjotr Prins
2020-12-31  sparql: rename file  Pjotr Prins
2020-12-31  genbank: started on SPARQL fetcher  Pjotr Prins
2020-12-31  genbank: pseudo workflow  Pjotr Prins
2020-12-31  genbank: header  Pjotr Prins
2020-12-31  genbank: split script  Pjotr Prins
2020-12-31  genbank: moving script into workflow space  Pjotr Prins
2020-11-21  abPOA works better starting from shorter sequences [pangenome_workflow_abpoa]  AndreaGuarracino
2020-11-21  added abPOA workflow; typos  AndreaGuarracino
2020-11-21  added reversed_sorting parameter; typos  AndreaGuarracino
2020-11-21  generalized spoa workflow  AndreaGuarracino
2020-11-18  Give from_sparql more keep cache.  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-18  Fix typo. Give from_sparql more RAM.  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-18  Add query-to-gfa workflow  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-11  Make collect-seqs skip bad inputs.  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-10  Use arvados uuids for RDF subjects. [uuid-for-resource]  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-09  Make resource link work for both portable data hashes and sample id  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-09  Make it so "pangenome analysis" only runs collect-seqs.  Peter Amstutz
            Will ensure that metadata is kept up to date. GFA isn't being generated. Will introduce new workflow that uses from_sparql to analyze a subset.
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-09  Rename schema param to metadataSchema  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-11-09  Extract subset of the all-sequences fasta by running a sparql query.  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-09-26  script for processing the metadata of the ESR samples; moved delete_entries_on_arvados script in scripts directory  AndreaGuarracino
2020-09-05  increased the quality filter threshold  AndreaGuarracino
2020-08-28  added script to remove entries on Arvados  AndreaGuarracino
2020-08-26  Increase RAM for odgi-build-from-spoa-gfa  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-08-25  Increase RAM requirement for sort_fasta_by_quality_and_len  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-08-19  Fix output parameters  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-08-19  Scaling pangenome generation  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-08-19  Consolidate steps to scale graph generation workflow  Peter Amstutz
            Arvados-DCO-1.1-Signed-off-by: Peter Amstutz <peter.amstutz@curii.com>
2020-08-19  used builtin hashlib md5 for the deduplication step  AndreaGuarracino
2020-08-19  integrated the deduplication step in the sorting by quality and length script  AndreaGuarracino
2020-07-27  added workflow to sort a multifasta by quality and length, and added the overall new pangenome generation workflow with SPOA  AndreaGuarracino