From d270d35cbfe9bc94d1bef16a63e3ca89e87e739e Mon Sep 17 00:00:00 2001
From: Arun Isaac
Date: Thu, 17 Jul 2025 17:59:17 +0100
Subject: Document usage instructions and workflow.

* doc/workflow.uml, doc/workflow.png, doc/generate-images.sh: New files.
* README.md (How to use): New section.
---
 README.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 1396f0f..1130c7a 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,28 @@ Install the development version of pyhegp into the virtual environment.
 pip install git+https://github.com/encryption4genetics/pyhegp
 ```
 
+# How to use
+
+![Workflow](doc/workflow.png)
+
+Data owners generate summary statistics for their data.
+```
+pyhegp summary genotype.csv -o summary.txt
+```
+They share this with the data broker who pools it to compute the summary statistics of the complete dataset.
+```
+pyhegp pool -o complete-summary.txt summary1.txt summary2.txt ...
+```
+The data broker shares these summary statistics with the data owners. The data owners standardize their data using these summary statistics, and encrypt their data using a random key.
+```
+pyhegp encrypt -s complete-summary.txt -o encrypted-genotype.csv genotype.csv
+```
+Finally, the data owners share the encrypted data with the broker who concatenates it and shares it with all parties.
+```
+pyhegp cat -o complete-encrypted-genotype.csv encrypted-genotype1.csv encrypted-genotype2.csv ...
+```
+Note that all data sharing is carried out-of-band and is outside the scope of `pyhegp`.
+
 # Run tests
 
 Run the test suite using
-- 
cgit v1.2.3
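
The four steps this patch adds to the README can be sketched as a small Python driver that builds the documented `pyhegp` invocations in order. It is only an illustration for a local dry run: in a real deployment each data owner runs their own steps on their own machine, and files are exchanged out-of-band. The helper names `workflow_commands` and `run_workflow` and the numbered file names are assumptions made for this sketch. Only the subcommands and the `-o` and `-s` options are taken from the README text.

```python
import subprocess
from pathlib import Path


def workflow_commands(genotypes, workdir="."):
    """Return the pyhegp invocations for the whole workflow, in order.

    genotypes: one plaintext genotype CSV path per data owner.
    Output file names are hypothetical and chosen for illustration.
    """
    workdir = Path(workdir)
    owners = range(1, len(genotypes) + 1)
    summaries = [str(workdir / f"summary{i}.txt") for i in owners]
    encrypted = [str(workdir / f"encrypted-genotype{i}.csv") for i in owners]
    pooled = str(workdir / "complete-summary.txt")
    combined = str(workdir / "complete-encrypted-genotype.csv")

    commands = []
    # 1. Each data owner generates summary statistics for their own data.
    for genotype, summary in zip(genotypes, summaries):
        commands.append(["pyhegp", "summary", str(genotype), "-o", summary])
    # 2. The data broker pools the per-owner summaries.
    commands.append(["pyhegp", "pool", "-o", pooled, *summaries])
    # 3. Each data owner standardizes and encrypts using the pooled summary.
    for genotype, out in zip(genotypes, encrypted):
        commands.append(
            ["pyhegp", "encrypt", "-s", pooled, "-o", out, str(genotype)]
        )
    # 4. The data broker concatenates the encrypted datasets.
    commands.append(["pyhegp", "cat", "-o", combined, *encrypted])
    return commands


def run_workflow(genotypes, workdir="."):
    """Run every step on one machine; requires pyhegp on PATH."""
    for command in workflow_commands(genotypes, workdir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them.
    for command in workflow_commands(["genotype1.csv", "genotype2.csv"]):
        print(" ".join(command))
```

Building the argument lists separately from running them keeps the ordering of the steps easy to inspect, and the lists can be printed or logged without `pyhegp` being installed.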