pggb builds pangenome variation graphs from a set of input sequences. pggb.cwl is a port of pggb to the Common Workflow Language (CWL).
Features
pggb.cwl offers:
- better parallelization, especially of the wfmash all-to-all alignment step and the various visualization steps
- more readable code
- reproducibility
- portability across computing environments
How to use
First, compile the ccwl (Concise Common Workflow Language) sources to a CWL workflow:
ccwl compile -o pggb.cwl pggb.scm
Now, run the compiled CWL workflow using your preferred CWL implementation. For cwltool, the reference CWL implementation:
cwltool pggb.cwl inputs.yaml
ravanan is a CWL implementation that uses Guix to provide robust reproducibility guarantees. To run pggb.cwl using ravanan:
ravanan --guix-channels=channels.scm --store=store pggb.cwl inputs.yaml
You may need to pass additional options, depending on the specifics of your computing environment.
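Both commands above read the workflow inputs from inputs.yaml, a standard CWL job file. As a minimal sketch, such a file could be written as follows. The input names (sequences, number_of_haplotypes), the file path, and the value are assumptions for illustration only; the real names are whatever pggb.cwl declares in its inputs section.

```shell
# Write a hypothetical CWL job file for pggb.cwl.
# NOTE: the input names below are assumptions, not taken from pggb.cwl.
# Check the "inputs" section of the compiled pggb.cwl for the real names.
cat > inputs.yaml <<'EOF'
sequences:                 # hypothetical input name
  class: File              # CWL File object
  path: input.fa.gz        # placeholder path to your sequences
number_of_haplotypes: 4    # hypothetical input name, placeholder value
EOF
```

If you use cwltool, running cwltool --make-template pggb.cwl should print a template job file with the actual input names, which is a safer starting point than guessing them.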
Differences
pggb.cwl differs from pggb in the following ways:
- The number of haplotypes is always required. In pggb, it is optional if the sequences follow the PanSN-spec.
- External mappers are not supported.
- --vcf-spec is not implemented.
License
pggb.cwl is free software released under the terms of the GNU General Public License, either version 3 of the License, or (at your option) any later version.
