-rw-r--r-- .guix/hsmice-test.scm 4
-rw-r--r-- README.md 15
-rw-r--r-- e2e-tests/hsmice/check-qtl.py 2
3 files changed, 12 insertions, 9 deletions
diff --git a/.guix/hsmice-test.scm b/.guix/hsmice-test.scm
index ee8ecbd..9137dc6 100644
--- a/.guix/hsmice-test.scm
+++ b/.guix/hsmice-test.scm
@@ -49,7 +49,7 @@
 (define-public r-mixed-model-gwas
   (package
    (name "r-mixed-model-gwas")
-   (version "1.3")
+   (version "1.3.1")
    (source (origin
             (method git-fetch)
             (uri (git-reference
@@ -58,7 +58,7 @@
             (file-name (git-file-name name version))
             (sha256
              (base32
-              "0yv86mw9m981vzl80j100lg05kc6jm5ijhq9b8zcd8f2lr3115db"))))
+              "0vll55v8wjc0179n5q9ch9ah3dvgymc374wlbz33yzyi35yr8ds2"))))
    (build-system r-build-system)
    (home-page "https://github.com/encryption4genetics/mixed-model-gwas")
    (synopsis "R mixed model GWAS")
diff --git a/README.md b/README.md
index 2b3ab5d..de93fd2 100644
--- a/README.md
+++ b/README.md
@@ -63,11 +63,11 @@ pyhegp --help
 
 ![Simple data sharing workflow](doc/simple-workflow.png)
 
-In this simple scenario, there is only one data owner and they wish to share their encrypted data with a researcher. The data owner encrypts their data with:
+In this simple scenario, there is only one data owner and they wish to share their encrypted data with a researcher. The data owner encrypts their genotype and phenotype data with:
 ```
-pyhegp encrypt genotype.tsv
+pyhegp encrypt genotype.tsv phenotype.tsv
 ```
-They then send the encrypted data `genotype.tsv.hegp` to the researcher. Note that data sharing is carried out-of-band and is outside the scope of `pyhegp`.
+They then send the encrypted `genotype.tsv.hegp` and `phenotype.tsv.hegp` to the researcher. Note that data sharing is carried out-of-band and is outside the scope of `pyhegp`.
 
 ## Joint/federated analysis with many data owners
 
@@ -77,17 +77,18 @@ Data owners generate summary statistics for their data.
 ```
 pyhegp summary genotype.tsv -o summary
 ```
-They share this with the data broker who pools it to compute the summary statistics of the complete dataset.
+They share this with the data broker who pools it to compute the summary statistics of the complete dataset. Any SNPs not common to all summaries will be dropped.
 ```
 pyhegp pool -o complete-summary summary1 summary2 ...
 ```
-The data broker shares these summary statistics with the data owners. The data owners standardize their data using these summary statistics, and encrypt their data using a random key.
+The data broker shares these summary statistics with the data owners. The data owners standardize their data using these summary statistics, and encrypt their genotype and phenotype data using a random key. Any SNPs that are not in `complete-summary` or that have a zero standard deviation are dropped. SNPs with a zero standard deviation have no discriminatory power in the analysis.
 ```
-pyhegp encrypt -s complete-summary genotype.tsv
+pyhegp encrypt -s complete-summary genotype.tsv phenotype.tsv
 ```
-Finally, the data owners share the encrypted data `genotype.tsv.hegp` with the broker who concatenates it and shares it with all parties.
+Finally, the data owners share the encrypted `genotype.tsv.hegp` and `phenotype.tsv.hegp` with the broker who concatenates it and shares it with all parties.
 ```
 pyhegp cat-genotype -o complete-genotype.tsv.hegp genotype1.tsv.hegp genotype2.tsv.hegp ...
+pyhegp cat-phenotype -o complete-phenotype.tsv.hegp phenotype1.tsv.hegp phenotype2.tsv.hegp ...
 ```
 Note that all data sharing is carried out-of-band and is outside the scope of `pyhegp`.
 
diff --git a/e2e-tests/hsmice/check-qtl.py b/e2e-tests/hsmice/check-qtl.py
index feae361..6e342b1 100644
--- a/e2e-tests/hsmice/check-qtl.py
+++ b/e2e-tests/hsmice/check-qtl.py
@@ -23,5 +23,7 @@ import pandas as pd
 if __name__ == "__main__":
     df = pd.read_csv(sys.argv[1], sep="\t")
     qtl = df.query("p < 1e-10")
+    # Assert that the QTL is on chromosome 4.
     assert (qtl.chromosome == 4).all()
+    # Assert that the QTL is within 2 Mb of the expected position.
     assert ((qtl.position - 137715608).abs() < 2*10**6).all()
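
The two assertions in `check-qtl.py` can also be read as a small reusable check. The following is only a sketch, not code from this commit: the function name `check_qtl` and the constant names are hypothetical. It uses the standard library instead of pandas so that it stays self-contained, and it assumes the same column names (`chromosome`, `position`, `p`) as the script. It also reports how many rows passed the significance threshold, because the original assertions pass vacuously when no row has `p < 1e-10`.

```python
import csv
import io

# Values taken from check-qtl.py; the names themselves are hypothetical.
EXPECTED_CHROMOSOME = "4"
EXPECTED_POSITION = 137715608
MAX_DISTANCE = 2 * 10**6  # 2 Mb
P_THRESHOLD = 1e-10


def check_qtl(tsv_text):
    """Return (number of significant rows, list of failure messages).

    A row is significant when p < P_THRESHOLD. Each significant row must
    lie on the expected chromosome and within MAX_DISTANCE of the
    expected position, mirroring the two assertions in check-qtl.py.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    n_significant = 0
    failures = []
    for row in reader:
        if float(row["p"]) >= P_THRESHOLD:
            continue  # not significant, so not part of the QTL
        n_significant += 1
        if row["chromosome"].strip() != EXPECTED_CHROMOSOME:
            failures.append(f"unexpected chromosome {row['chromosome']}")
        elif abs(int(row["position"]) - EXPECTED_POSITION) >= MAX_DISTANCE:
            failures.append(f"position {row['position']} is too far away")
    return n_significant, failures


# Made-up rows for illustration only; not real test data.
sample = (
    "chromosome\tposition\tp\n"
    "4\t137000000\t1e-12\n"
    "4\t140000000\t1e-11\n"
    "5\t100\t0.5\n"
)
n_significant, failures = check_qtl(sample)
```

A caller that wants to guard against the vacuous case could additionally require `n_significant > 0` before treating an empty `failures` list as success. Whether that stricter behaviour is desirable for this end-to-end test is a design choice the commit does not address.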