| Age | Commit message | Author |
|---|---|---|
|  | read_csv can incorrectly infer that the string "00" is the integer 0.
To avoid this ambiguity, pass the correct dtype to read_csv. | 
|  | Remove comments mentioning logging.
Command-line error messages have their own place; they are not the
same as logging. | 
|  | We were testing for zero exit status. Now, in addition, we test for
the existence of output files. This is slightly more robust. | 
|  | End users who install pyhegp via pip cannot run the test suite.
Clarify this in the README. Perhaps, in the future, we should move
these developer-oriented instructions to a separate document. | 
|  | If not separated, GitHub combines the table of contents with the list
of papers in the introduction. | 
|  | A table of contents gives people a brief overview of what's in the
README, and allows them to jump to the section they are interested in. | 
|  | Readers are more likely to follow through to the file formats
documentation if there is a link. | 
|  | Reducing precision lowers the file size and makes the files more
human-comprehensible. | 
|  | Not everyone may want to create a virtual environment. For example, on
some HPC machines, creating a virtual environment is complicated or
does not work. | 
|  | We have not exposed a Python library interface, and it is not clear if
we need to. We can revisit this decision later, if need be. | 
|  | * pyhegp/pyhegp.py: Import reduce from functools.
(pool_summaries, encrypt_genotype): New functions.
(pool): Use pool_summaries.
(encrypt): Use encrypt_genotype.
* tests/test_pyhegp.py: Import pandas; Summary, read_summary and
read_genotype from pyhegp.serialization.
(test_pool, test_encrypt): New tests.
* test-data/encrypt-test-encrypted-genotype.tsv,
test-data/encrypt-test-genotype.tsv, test-data/encrypt-test-key,
test-data/encrypt-test-summary, test-data/pool-test-complete-summary,
test-data/pool-test-summary1, test-data/pool-test-summary2: New files. | 
|  | * doc/file-formats.md (File formats)[key file]: New section.
* pyhegp/serialization.py: Import numpy.
(read_key, write_key): New functions.
* pyhegp/pyhegp.py: Import write_key from pyhegp.serialization.
(encrypt): Use write_key.
* tests/test_serialization.py: Import arrays and array_shapes from
hypothesis.extra.numpy; approx from pytest; read_key and write_key
from pyhegp.serialization.
(test_read_write_key_are_inverses): New test. | 
|  | * pyhegp/pyhegp.py (genotype_summary): New function.
(summary): Use genotype_summary.
(encrypt): Compute summary if not provided.
* tests/test_pyhegp.py (test_simple_workflow): Remove xfail mark. | 
|  | * README.md (How to use): Indent down into "Joint/federated analysis
with many data owners" section.
[Simple data sharing]: New section.
* doc/generate-images.sh: Add simple workflow.
* doc/workflow.png: Rename to doc/joint-workflow.png.
* doc/workflow.uml: Rename to doc/joint-workflow.uml.
* doc/simple-workflow.png, doc/simple-workflow.uml: New files.
* tests/test_pyhegp.py: Import pytest.
(test_simple_workflow): New test.
* test-data/genotype.tsv: New file. | 
|  | * tests/test_pyhegp.py: Import CliRunner from click.testing, and main
from pyhegp.pyhegp.
(test_joint_workflow): New test.
* test-data/genotype0.tsv, test-data/genotype1.tsv,
test-data/genotype2.tsv, test-data/genotype3.tsv: New files. | 
|  | * pyhegp/pyhegp.py: Import pandas.
(summary, pool, encrypt, cat): Use pandas data frames and new data
format.
* pyhegp/serialization.py: Import csv and pandas.
(Summary)[mean, std]: Delete fields.
[data]: New field.
(read_summary, write_summary, read_genotype, write_genotype): Use
pandas data frames and new data format.
* tests/test_serialization.py: Import column, columns and data_frames
from hypothesis.extra.pandas; pandas; negate from pyhegp.utils. Do not
import hypothesis.extra.numpy and approx from pytest.
(tabless_printable_ascii_text, chromosome_column, position_column,
reference_column, sample_names): New variables.
(summaries, genotype_reserved_column_name_p, genotype_frames): New
functions.
(test_read_write_summary_are_inverses): Use pandas data frames and new
data format.
(test_read_write_genotype_are_inverses): Use pandas for testing.
* doc/file-formats.md (File formats)[summary file]: Describe new
standard.
[genotype file]: New section.
* .guix/pyhegp-package.scm (pyhegp-package): Import python-pandas
from (gnu packages python-science).
(python-pyhegp)[propagated-inputs]: Add python-pandas.
* pyproject.toml (dependencies): Add pandas. | 
|  | * tests/test_pyhegp.py (negate): Move to pyhegp.utils.
Import negate from pyhegp.utils.
* pyhegp/utils.py: New file. | 
|  | * .gitignore: New file. | 
|  | * tests/test_pyhegp.py (test_pool_stats): Set relative tolerance to
1e-6. | 
|  | * pyhegp/serialization.py (read_genotype): Rename genotype_file
argument to file. | 
|  | * tests/test_serialization.py: Import read_genotype and write_genotype
from pyhegp.serialization.
(test_read_write_genotype_are_inverses): New test. | 
|  | * pyhegp/serialization.py (read_genotype): Ensure 2 dimensions. | 
|  | * pyhegp/serialization.py (write_genotype): Write with format %.8g. | 
|  | * pyhegp/serialization.py (read_summary, write_summary): Use tab as
the delimiter.
* doc/file-formats.md (File formats)[summary file]: Update
documentation. | 
|  | * pyhegp/serialization.py (write_genotype): New function.
* pyhegp/pyhegp.py: Import write_genotype from pyhegp.serialization.
(encrypt, cat): Use write_genotype. | 
|  | * tests/test_pyhegp.py: Import math.
(square_matrices, negate, is_singular): New functions.
(test_conservation_of_solutions): New test. | 
|  | * pyhegp/pyhegp.py (hegp_encrypt, hegp_decrypt): Do not standardize or
unstandardize.
(encrypt): Standardize before calling hegp_encrypt.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses):
Do not pass mean and standard deviation for standardization and
unstandardization. | 
|  | * tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses):
Do not test encryption on order 1 matrices. | 
|  | * README.md: Mention TianjingZhao2023 paper. | 
|  | * README.md: Add CI badge. | 
|  | * doc/workflow.uml, doc/workflow.png, doc/generate-images.sh: New
files.
* README.md (How to use): New section. | 
|  | * README.md (Install development version): New section. | 
|  | * pyhegp/pyhegp.py (cat): New function. | 
|  | * pyhegp/pyhegp.py (encrypt): Only output key to file optionally. | 
|  | * pyhegp/pyhegp.py (encrypt): Use File instead of Path for options. | 
|  | Prefixed options are easier to follow than the order of positional
arguments.
* pyhegp/pyhegp.py (encrypt): Turn summary, key and ciphertext
arguments into options. | 
|  | * pyhegp/pyhegp.py: Import read_genotype from pyhegp.serialization.
(read_genotype): Move to pyhegp.serialization. | 
|  | * pyhegp/pyhegp.py (hegp_encrypt): Standardize before encryption.
(hegp_decrypt): Unstandardize after decryption.
(encrypt): Pass in mean and standard deviation from summary file to
hegp_encrypt.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses):
Pass in mean and standard deviation to hegp_encrypt. | 
|  | * pyhegp/pyhegp.py (standardize): Standardize using mean and standard
deviation, instead of the minor allele frequency.
(unstandardize): New function.
* tests/test_pyhegp.py: Import standardize and unstandardize from
pyhegp.pyhegp.
(no_column_zero_standard_deviation): New function.
(test_standardize_unstandardize_are_inverses): New test. | 
|  | * pyhegp/pyhegp.py: Import namedtuple from collections, and
read_summary from pyhegp.serialization.
(Stats): New type.
(pool_stats, pool): New functions.
* tests/test_pyhegp.py: Import Stats and pool_stats from
pyhegp.pyhegp.
(test_pool_stats): New test. |
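
Two entries in this log concern numeric precision: `write_genotype` writes values with the format `%.8g`, and reducing precision is said to lower the file size and make files more human-comprehensible. The following is a minimal sketch of that effect using only the Python standard library. The helper `format_value` and the sample values are hypothetical and are not taken from pyhegp; only the `%.8g` format string comes from the log.

```python
import math


def format_value(x: float) -> str:
    """Format a float with at most 8 significant digits.

    Hypothetical helper for illustration; only the "%.8g" format
    string is taken from the commit log.
    """
    return "%.8g" % x


# Made-up sample values standing in for one row of a genotype file.
values = [1 / 3, 2.0, 0.1234567890123]

# Full-precision row, using Python's shortest round-trip representation.
full = "\t".join(repr(v) for v in values)

# Reduced-precision row, using 8 significant digits.
reduced = "\t".join(format_value(v) for v in values)

print(full)
print(reduced)

# The reduced row is shorter, and each value still round-trips to
# within a small relative error.
assert len(reduced) < len(full)
assert all(
    math.isclose(float(format_value(v)), v, rel_tol=1e-7) for v in values
)
```

Because values written this way no longer round-trip exactly, tests that compare written and re-read values need a tolerance rather than exact equality, which fits the entry that sets a relative tolerance of 1e-6 in `test_pool_stats`.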