[`pggb`](https://github.com/pangenome/pggb/) builds pangenome variation graphs from a set of input sequences. `pggb.cwl` is a port of `pggb` to the [Common Workflow Language](https://www.commonwl.org/) (CWL).

# Features

`pggb.cwl` offers:
- better parallelization, especially of the wfmash all-to-all alignment step and the various visualization steps
- more readable code
- reproducibility
- portability across computing environments

# How to use

First, compile the [ccwl](https://ccwl.systemreboot.net/) sources to a CWL workflow.
```
ccwl compile -o pggb.cwl pggb.scm
```
Now, run the compiled CWL workflow using your preferred CWL implementation. For example, with cwltool, the reference CWL implementation:
```
cwltool pggb.cwl inputs.yaml
```
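The compile and run steps above can be chained so that a missing tool or an invalid workflow is caught before the run starts. This is a minimal sketch, not part of the project: `require_tool` and `build_and_run` are hypothetical helper names, and it assumes `pggb.scm` and `inputs.yaml` are in the current directory. `cwltool --validate` checks the document without executing it.

```shell
# Hypothetical helper: fail early if a required tool is missing from PATH.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "error: '$1' not found in PATH" >&2
        return 1
    }
}

# Hypothetical helper: compile, validate, then run the workflow.
build_and_run() {
    require_tool ccwl || return 1
    require_tool cwltool || return 1
    # Compile the ccwl sources to CWL (same command as above).
    ccwl compile -o pggb.cwl pggb.scm || return 1
    # Validate the compiled document without running it.
    cwltool --validate pggb.cwl || return 1
    # Run the workflow with the job file.
    cwltool pggb.cwl inputs.yaml
}
```

Call `build_and_run` from the repository directory. If you are unsure which keys `inputs.yaml` needs, `cwltool pggb.cwl --help` should list the workflow's input parameters.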
[ravanan](https://forge.systemreboot.net/ravanan/) is a CWL implementation that uses [Guix](https://guix.gnu.org/) to provide robust reproducibility guarantees. To run `pggb.cwl` using ravanan:
```
ravanan --guix-channels=channels.scm --store=store pggb.cwl inputs.yaml
```
You may need to pass in more options based on the specifics of your computing environment.

# Differences

`pggb.cwl` differs from `pggb` in the following ways:
- The number of haplotypes is always required. In contrast, `pggb` treats it as optional if the sequence names follow the [PanSN-spec](https://github.com/pangenome/PanSN-spec).
- External mappers are not supported.
- `--vcf-spec` is not implemented.
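
Because the number of haplotypes must always be supplied, it can help to compute it from the input FASTA when the sequence names follow PanSN-spec. The following is a rough sketch under that assumption: it treats `#` as the delimiter (`sample#haplotype#contig`) and counts distinct `sample#haplotype` prefixes. `count_haplotypes` is a hypothetical helper, not part of `pggb.cwl`, and how the resulting number is passed to the workflow depends on the input names the workflow defines.

```shell
# Hypothetical helper: count distinct sample#haplotype prefixes in a FASTA
# file whose sequence names follow PanSN-spec with '#' as the delimiter.
count_haplotypes() {
    awk '/^>/ {
        sub(/^>/, "", $1)        # drop the leading ">" from the sequence name
        split($1, parts, "#")    # sample, haplotype, contig
        print parts[1] "#" parts[2]
    }' "$1" | sort -u | wc -l | tr -d ' '
}
```

For example, a file with sequences named `HG002#1#chr1`, `HG002#1#chr2`, `HG002#2#chr1` and `CHM13#1#chr1` contains three haplotypes: `HG002#1`, `HG002#2` and `CHM13#1`.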

# License

`pggb.cwl` is free software released under the terms of the [GNU General Public License](https://www.gnu.org/licenses/gpl.html), either version 3 of the License, or (at your option) any later version.