Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/README.md b/README.md
index 399b39a..c54de0e 100644
--- a/README.md
+++ b/README.md
@@ -61,7 +61,7 @@ pyhegp --help
 # How to use
 ## Simple data sharing
-
+
 In this simple scenario, there is only one data owner and they wish to share their encrypted data with a researcher. The data owner encrypts their genotype and phenotype data with:
 ```
@@ -71,17 +71,17 @@ They then send the encrypted `genotype.tsv.hegp` and `phenotype.tsv.hegp` to the
 ## Joint/federated analysis with many data owners
-
+
 Data owners generate summary statistics for their data.
 ```
 pyhegp summary genotype.tsv -o summary
 ```
-They share this with the data broker who pools it to compute the summary statistics of the complete dataset.
+They share this with the data broker who pools it to compute the summary statistics of the complete dataset. Any SNPs not common to all summaries will be dropped.
 ```
 pyhegp pool -o complete-summary summary1 summary2 ...
 ```
-The data broker shares these summary statistics with the data owners. The data owners standardize their data using these summary statistics, and encrypt their genotype and phenotype data using a random key.
+The data broker shares these summary statistics with the data owners. The data owners standardize their data using these summary statistics, and encrypt their genotype and phenotype data using a random key. Any SNPs not in `complete-summary` or have a zero standard deviation are dropped. SNPs with a zero standard deviation have no discriminatory power in the analysis.
 ```
 pyhegp encrypt -s complete-summary genotype.tsv phenotype.tsv
 ```
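
The two dropping rules this change documents can be illustrated with a small Python sketch. It is not pyhegp's implementation. The summary layout (a dict holding a sample count `n` and a per-SNP `(mean, std)` pair), the function names `pool_summaries` and `usable_snps`, and the use of the population standard deviation are all assumptions made for illustration.

```python
import math


def pool_summaries(summaries):
    """Pool per-owner summaries into one summary of the complete dataset.

    Hypothetical layout: {"n": sample_count, "snps": {snp_id: (mean, std)}},
    where std is assumed to be the population standard deviation (ddof=0).
    """
    # Keep only SNPs present in every summary; all others are dropped.
    common = set.intersection(*(set(s["snps"]) for s in summaries))
    total_n = sum(s["n"] for s in summaries)

    pooled = {}
    for snp in sorted(common):
        # Sample-weighted mean across owners.
        mean = sum(s["n"] * s["snps"][snp][0] for s in summaries) / total_n
        # Mean of squares of the pooled data, rebuilt from each owner's
        # mean and std via E[x^2] = std^2 + mean^2.
        second_moment = sum(
            s["n"] * (s["snps"][snp][1] ** 2 + s["snps"][snp][0] ** 2)
            for s in summaries
        ) / total_n
        # Clamp tiny negative values caused by floating-point round-off.
        variance = max(second_moment - mean ** 2, 0.0)
        pooled[snp] = (mean, math.sqrt(variance))
    return {"n": total_n, "snps": pooled}


def usable_snps(pooled):
    """Return SNP ids with a non-zero pooled standard deviation."""
    return [snp for snp, (_, std) in pooled["snps"].items() if std > 0]
```

In this sketch, a SNP that appears in only one owner's summary disappears at pooling time, and a SNP whose pooled standard deviation is zero is filtered out afterwards. A zero standard deviation means every sample has the same value, so standardizing that SNP would divide by zero and it could not distinguish samples anyway.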