Age | Commit message | Author
5 days | Add instructions to install via Guix. | Arun Isaac
5 days | Mark virtual environment creation as optional. | Arun Isaac
Not everyone may want to create a virtual environment. For example, on some HPC machines, creating a virtual environment is complicated or does not work.
5 days | Package as a CLI utility only, not a Python library. | Arun Isaac
We have not exposed a Python library interface, and it is not clear if we need to. We can revisit this decision later, if need be.
7 days | Subset to common SNPs. | Arun Isaac
* pyhegp/pyhegp.py: Import reduce from functools. (pool_summaries, encrypt_genotype): New functions. (pool): Use pool_summaries. (encrypt): Use encrypt_genotype.
* tests/test_pyhegp.py: Import pandas; Summary, read_summary and read_genotype from pyhegp.serialization. (test_pool, test_encrypt): New tests.
* test-data/encrypt-test-encrypted-genotype.tsv, test-data/encrypt-test-genotype.tsv, test-data/encrypt-test-key, test-data/encrypt-test-summary, test-data/pool-test-complete-summary, test-data/pool-test-summary1, test-data/pool-test-summary2: New files.
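The commit above subsets data to the SNPs shared by all inputs and imports reduce from functools to do so. A minimal sketch of that idea follows. It is not the pyhegp implementation: the helper name common_snps and the (chromosome, position) tuples are hypothetical, and plain Python sets stand in for the pandas data frames the project uses.

```python
from functools import reduce


def common_snps(snp_lists):
    """Return the SNPs present in every list, in the order of the first list.

    Hypothetical helper for illustration; not part of pyhegp.
    """
    if not snp_lists:
        return []
    # Fold set intersection over all inputs to find the shared SNPs.
    shared = reduce(set.intersection, (set(snps) for snps in snp_lists))
    return [snp for snp in snp_lists[0] if snp in shared]


# Each SNP is identified here by an assumed (chromosome, position) pair.
site1 = [("chr1", 100), ("chr1", 200), ("chr2", 50)]
site2 = [("chr1", 200), ("chr2", 50), ("chr2", 75)]
site3 = [("chr2", 50), ("chr1", 200)]

print(common_snps([site1, site2, site3]))
```

Keeping the order of the first input makes the result deterministic, which may matter when rows of different files have to line up.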
7 days | Standardize key files. | Arun Isaac
* doc/file-formats.md (File formats)[key file]: New section.
* pyhegp/serialization.py: Import numpy. (read_key, write_key): New functions.
* pyhegp/pyhegp.py: Import write_key from pyhegp.serialization. (encrypt): Use write_key.
* tests/test_serialization.py: Import arrays and array_shapes from hypothesis.extra.numpy; approx from pytest; read_key and write_key from pyhegp.serialization. (test_read_write_key_are_inverses): New test.
7 days | Compute summary on encryption if not provided. | Arun Isaac
* pyhegp/pyhegp.py (genotype_summary): New function. (summary): Use genotype_summary. (encrypt): Compute summary if not provided.
* tests/test_pyhegp.py (test_simple_workflow): Remove xfail mark.
7 days | Add simple workflow. | Arun Isaac
* README.md (How to use): Indent down into "Joint/federated analysis with many data owners" section. [Simple data sharing]: New section.
* doc/generate-images.sh: Add simple workflow.
* doc/workflow.png: Rename to doc/joint-workflow.png.
* doc/workflow.uml: Rename to doc/joint-workflow.uml.
* doc/simple-workflow.png, doc/simple-workflow.uml: New files.
* tests/test_pyhegp.py: Import pytest. (test_simple_workflow): New test.
* test-data/genotype.tsv: New file.
7 days | Test joint workflow CLI. | Arun Isaac
* tests/test_pyhegp.py: Import CliRunner from click.testing, and main from pyhegp.pyhegp. (test_joint_workflow): New test.
* test-data/genotype0.tsv, test-data/genotype1.tsv, test-data/genotype2.tsv, test-data/genotype3.tsv: New files.
7 days | Standardize file formats in the likeness of plink files. | Arun Isaac
* pyhegp/pyhegp.py: Import pandas. (summary, pool, encrypt, cat): Use pandas data frames and new data format.
* pyhegp/serialization.py: Import csv and pandas. (Summary)[mean, std]: Delete fields. [data]: New field. (read_summary, write_summary, read_genotype, write_genotype): Use pandas data frames and new data format.
* tests/test_serialization.py: Import column, columns and data_frames from hypothesis.extra.pandas; pandas; negate from pyhegp.utils. Do not import hypothesis.extra.numpy and approx from pytest. (tabless_printable_ascii_text, chromosome_column, position_column, reference_column, sample_names): New variables. (summaries, genotype_reserved_column_name_p, genotype_frames): New functions. (test_read_write_summary_are_inverses): Use pandas data frames and new data format. (test_read_write_genotype_are_inverses): Use pandas for testing.
* doc/file-formats.md (File formats)[summary file]: Describe new standard. [genotype file]: New section.
* .guix/pyhegp-package.scm (pyhegp-package): Import python-pandas from (gnu packages python-science). (python-pyhegp)[propagated-inputs]: Add python-pandas.
* pyproject.toml (dependencies): Add pandas.
7 days | Move negate to pyhegp.utils. | Arun Isaac
* tests/test_pyhegp.py (negate): Move to pyhegp.utils. Import negate from pyhegp.utils.
* pyhegp/utils.py: New file.
7 days | Add gitignore. | Arun Isaac
* .gitignore: New file.
7 days | Loosen relative tolerance in test_pool_stats. | Arun Isaac
* tests/test_pyhegp.py (test_pool_stats): Set relative tolerance to 1e-6.
12 days | Rename genotype_file argument in read_genotype. | Arun Isaac
* pyhegp/serialization.py (read_genotype): Rename genotype_file argument to file.
12 days | Test that read_genotype and write_genotype are inverses. | Arun Isaac
* tests/test_serialization.py: Import read_genotype and write_genotype from pyhegp.serialization. (test_read_write_genotype_are_inverses): New test.
12 days | Ensure that read genotype matrices have 2 dimensions. | Arun Isaac
* pyhegp/serialization.py (read_genotype): Ensure 2 dimensions.
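Why two dimensions need to be enforced can be illustrated with NumPy: a text file holding a single row is read back as a 1-D array unless told otherwise. The sketch below assumes a whitespace-separated numeric file read with numpy.loadtxt and its ndmin parameter; how read_genotype actually enforces this is not stated in the commit message.

```python
import io

import numpy as np

# A genotype file with a single row (assumed layout, for illustration only).
single_row = "0 1 2\n"

# Without ndmin, a one-row file collapses to a 1-D array.
flat = np.loadtxt(io.StringIO(single_row))

# ndmin=2 keeps the result a matrix regardless of the number of rows.
matrix = np.loadtxt(io.StringIO(single_row), ndmin=2)

print(flat.shape)    # (3,)
print(matrix.shape)  # (1, 3)
```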
12 days | Write genotype matrix with increased precision. | Arun Isaac
* pyhegp/serialization.py (write_genotype): Write with format %.8g.
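The %.8g format keeps eight significant digits and drops trailing zeros. A small sketch of its effect, assuming the matrix is written with numpy.savetxt and a tab delimiter (the commit message only names the format string, so the call itself is an assumption):

```python
import io

import numpy as np

matrix = np.array([[1 / 3, 2.0], [0.123456789, 1e-9]])

buffer = io.StringIO()
# fmt="%.8g": eight significant digits, no padding with trailing zeros.
np.savetxt(buffer, matrix, fmt="%.8g", delimiter="\t")
text = buffer.getvalue()

print(text)

# Reading the text back recovers the matrix to about eight significant digits.
restored = np.loadtxt(io.StringIO(text), delimiter="\t", ndmin=2)
print(np.allclose(restored, matrix, rtol=1e-7, atol=0))
```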
12 days | Tab-separate data section of summary files. | Arun Isaac
* pyhegp/serialization.py (read_summary, write_summary): Use tab as the delimiter.
* doc/file-formats.md (File formats)[summary file]: Update documentation.
12 days | Abstract out write_genotype. | Arun Isaac
* pyhegp/serialization.py (write_genotype): New function.
* pyhegp/pyhegp.py: Import write_genotype from pyhegp.serialization. (encrypt, cat): Use write_genotype.
13 days | Test solution of linear system after encryption. | Arun Isaac
* tests/test_pyhegp.py: Import math. (square_matrices, negate, is_singular): New functions. (test_conservation_of_solutions): New test.
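The property this test checks can be sketched as follows, assuming that encryption amounts to left-multiplying by a random orthogonal key matrix. Since such a key is invertible, the system A x = b and the encrypted system (K A) x = K b have the same solution. The key construction via QR decomposition and all variable names are assumptions for illustration, not pyhegp code.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical key: the Q factor of a QR decomposition is an orthogonal matrix.
key, _ = np.linalg.qr(rng.standard_normal((n, n)))

# A square system A x = b; the shifted diagonal keeps A away from singularity.
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

x_plain = np.linalg.solve(A, b)
# Encrypt both sides with the same key, then solve again.
x_encrypted = np.linalg.solve(key @ A, key @ b)

print(np.allclose(x_plain, x_encrypted))
```

Singular matrices have no unique solution to compare, which is presumably why the test introduces an is_singular helper.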
13 days | Separate standardization from encryption. | Arun Isaac
* pyhegp/pyhegp.py (hegp_encrypt, hegp_decrypt): Do not standardize or unstandardize. (encrypt): Standardize before calling hegp_encrypt.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Do not pass mean and standard deviation for standardization and unstandardization.
13 days | Do not test encryption on order 1 matrices. | Arun Isaac
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Do not test encryption on order 1 matrices.
13 days | Mention TianjingZhao2023 paper in README. | Arun Isaac
* README.md: Mention TianjingZhao2023 paper.
2025-07-18 | Add CI badge to README. | Arun Isaac
* README.md: Add CI badge.
2025-07-17 | Document usage instructions and workflow. | Arun Isaac
* doc/workflow.uml, doc/workflow.png, doc/generate-images.sh: New files.
* README.md (How to use): New section.
2025-07-17 | Add development version installation instructions. | Arun Isaac
* README.md (Install development version): New section.
2025-07-17 | Add cat subcommand. | Arun Isaac
* pyhegp/pyhegp.py (cat): New function.
2025-07-17 | Only output key optionally. | Arun Isaac
* pyhegp/pyhegp.py (encrypt): Only output key to file optionally.
2025-07-17 | Use File instead of Path for encrypt subcommand options. | Arun Isaac
* pyhegp/pyhegp.py (encrypt): Use File instead of Path for options.
2025-07-17 | Turn arguments of the encrypt subcommand into options. | Arun Isaac
Prefixed options are easier to follow than the order of positional arguments.
* pyhegp/pyhegp.py (encrypt): Turn summary, key and ciphertext arguments into options.
2025-07-17 | Move read_genotype to pyhegp.serialization. | Arun Isaac
* pyhegp/pyhegp.py: Import read_genotype from pyhegp.serialization. (read_genotype): Move to pyhegp.serialization.
2025-07-17 | Standardize before encryption. | Arun Isaac
* pyhegp/pyhegp.py (hegp_encrypt): Standardize before encryption. (hegp_decrypt): Unstandardize after decryption. (encrypt): Pass in mean and standard deviation from summary file to hegp_encrypt.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Pass in mean and standard deviation to hegp_encrypt.
2025-07-17 | Add standardization. | Arun Isaac
* pyhegp/pyhegp.py (standardize): Standardize using mean and standard deviation, instead of the minor allele frequency. (unstandardize): New function.
* tests/test_pyhegp.py: Import standardize and unstandardize from pyhegp.pyhegp. (no_column_zero_standard_deviation): New function. (test_standardize_unstandardize_are_inverses): New test.
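Standardization by mean and standard deviation, and its inverse, might look like the sketch below. The function names follow the commit message, but the bodies, the argument order, and the samples-by-SNPs matrix layout are assumptions rather than the code in pyhegp/pyhegp.py.

```python
import numpy as np


def standardize(matrix, mean, std):
    """Center each column on its mean and scale it by its standard deviation."""
    return (matrix - mean) / std


def unstandardize(matrix, mean, std):
    """Undo standardize: scale back by std, then shift by the mean."""
    return matrix * std + mean


# Assumed layout: rows are samples, columns are SNPs.
genotype = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 0.0]])
mean = genotype.mean(axis=0)
std = genotype.std(axis=0)

z = standardize(genotype, mean, std)
print(np.allclose(unstandardize(z, mean, std), genotype))
```

A column with zero standard deviation would cause a division by zero here, which is presumably why the test excludes such columns with no_column_zero_standard_deviation.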
2025-07-17 | Add pool subcommand. | Arun Isaac
* pyhegp/pyhegp.py: Import namedtuple from collections, and read_summary from pyhegp.serialization. (Stats): New type. (pool_stats, pool): New functions.
* tests/test_pyhegp.py: Import Stats and pool_stats from pyhegp.pyhegp. (test_pool_stats): New test.
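Pooling lets several data owners combine per-site summaries without sharing genotypes. One way to pool sample counts, means and standard deviations is sketched below. The names Stats and pool_stats come from the commit message, but the fields (n, mean, std), the use of population standard deviations (ddof=0), and the formula are assumptions, not the pyhegp implementation.

```python
from collections import namedtuple

import numpy as np

# Assumed fields; the real Stats type may differ.
Stats = namedtuple("Stats", ["n", "mean", "std"])


def pool_stats(stats_list):
    """Combine per-site statistics into statistics of the pooled data."""
    n = sum(s.n for s in stats_list)
    mean = sum(s.n * s.mean for s in stats_list) / n
    # Pooled variance = weighted within-site variance + between-site variance.
    variance = sum(s.n * (s.std**2 + (s.mean - mean) ** 2)
                   for s in stats_list) / n
    return Stats(n, mean, np.sqrt(variance))


rng = np.random.default_rng(0)
data = rng.standard_normal((10, 3))
parts = [data[:4], data[4:]]

pooled = pool_stats([Stats(len(p), p.mean(axis=0), p.std(axis=0))
                     for p in parts])
print(np.allclose(pooled.mean, data.mean(axis=0)))
print(np.allclose(pooled.std, data.std(axis=0)))
```

Comparing the pooled result against statistics computed on the concatenated data is a natural check, and floating-point rounding in such a comparison is the kind of thing a relative tolerance accounts for.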
2025-07-17 | Add summary subcommand. | Arun Isaac
* pyhegp/pyhegp.py: Import Summary and write_summary from pyhegp.serialization. (summary): New function.
2025-07-17 | Implement the summary file format. | Arun Isaac
* doc/file-formats.md, pyhegp/serialization.py, tests/test_serialization.py: New files.
2025-07-17 | Remove decrypt subcommand. | Arun Isaac
Decryption does not make much sense with HEGP. And, the added complexity of standardization makes it even less attractive.
* pyhegp/pyhegp.py (decrypt): Delete function.
2025-07-17 | Use python-pytest built with python-hypothesis-next. | Arun Isaac
* .guix/pyhegp-package.scm: Import python-pytest with guix: prefix. (python-pytest): New variable.
2025-07-17 | Use default array shapes testing encryption/decryption. | Arun Isaac
It may be better to sample a smaller set of matrices finely than a large set of matrices coarsely.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Use default array shapes testing encryption/decryption.
2025-07-17 | Reduce maximum matrix size testing encryption/decryption. | Arun Isaac
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Reduce maximum matrix size to 100.
2025-07-17 | Organize source into directory structure. | Arun Isaac
* pyhegp/__init__.py: New file.
* pyhegp.py: Move to pyhegp/pyhegp.py.
* test_pyhegp.py: Move to tests/test_pyhegp.py. Import from pyhegp.pyhegp instead of from pyhegp.
* pyproject.toml (project.scripts)[pyhegp]: Switch to pyhegp.pyhegp:main.
2025-07-17 | Use python-hypothesis-next. | Arun Isaac
* guix.scm: Import python-hypothesis-next instead of python-hypothesis. (python-pyhegp)[native-inputs]: Replace python-hypothesis with python-hypothesis-next.
2025-07-08 | Correct symlink to guix package file. | Arun Isaac
* guix.scm: Link to .guix/pyhegp-package.scm instead of .guix/pyhegp-project.scm.
2025-07-08 | Remove obsolete commented out code. | Arun Isaac
* pyhegp.py (read_genotype): Remove obsolete commented out code.
2025-07-07 | Make repo a guix channel. | Arun Isaac
* .guix-channel: New file.
* guix.scm: Move to ...
* .guix/pyhegp-package.scm: ... here as its own module.
* guix.scm: Link to .guix/pyhegp-package.scm.
2025-07-07 | Add tests. | Arun Isaac
* test_pyhegp.py: New file.
* README.md (Run tests): New section.
* guix.scm: Import python-hypothesis from (gnu packages check). (python-pyhegp)[arguments]: Enable tests. [native-inputs]: Add python-hypothesis.
2025-06-27 | README: Add missing preposition "in". | Arun Isaac
* README.md: Add missing preposition "in".
2025-06-27 | Initial commit | Arun Isaac