path: root/tests
Age | Commit message | Author
6 days | Add is_genotype_metadata_column. | Arun Isaac
Promote genotype_reserved_column_name_p from helpers.strategies to is_genotype_metadata_column in pyhegp.serialization, and use it everywhere.
6 days | Drop duplicates in generated test genotype frames. | Arun Isaac
6 days | Move hypothesis strategies to separate file. | Arun Isaac
These strategies may be used by other test modules as well.
7 days | Suffix CLI subcommand functions with _command. | Arun Isaac
We distinguish CLI subcommand functions using the _command suffix. This way, we don't have to concoct weird names for the actual workhorse functions. To remain consistent, we also suffix _command to the command testing functions.
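The naming convention can be illustrated with a small sketch. The names `word_count` and `word_count_command` are hypothetical and are not taken from pyhegp; the sketch also leaves out any CLI framework, so the wrapper simply takes a list of arguments.

```python
def word_count(text: str) -> int:
    """Workhorse function: keeps the plain, obvious name."""
    return len(text.split())


def word_count_command(argv: list[str]) -> int:
    """CLI-facing wrapper, distinguished by the _command suffix.

    Hypothetical example: it joins its arguments, delegates to the
    workhorse, prints the result and returns an exit status.
    """
    print(word_count(" ".join(argv)))
    return 0


def test_word_count_command() -> None:
    # The test of the CLI wrapper carries the same suffix for consistency.
    assert word_count_command(["a", "b c"]) == 0
```

With this split, the workhorse can be imported and tested directly, while the suffixed function is the only one that deals with arguments and output.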
7 days | Do not require output ciphertext file path. | Arun Isaac
Make output ciphertext file path implicit; infer it by appending ".hegp" to the plaintext file. We take inspiration from GnuPG.
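The path inference described above can be sketched with pathlib. This is a minimal illustration and not pyhegp's actual code; the helper name `default_ciphertext_path` is hypothetical.

```python
from pathlib import Path


def default_ciphertext_path(plaintext_path: Path) -> Path:
    """Return the plaintext path with ".hegp" appended to its file name.

    Hypothetical helper for illustration only; the real function name
    and its location in pyhegp are not shown in this log.
    """
    # Append to the full file name instead of replacing the suffix, so
    # that "genotype.tsv" becomes "genotype.tsv.hegp".
    return plaintext_path.with_name(plaintext_path.name + ".hegp")


print(default_ciphertext_path(Path("data/genotype.tsv")))
```

Appending rather than using `with_suffix` keeps the original extension visible, which mirrors how GnuPG appends `.gpg` to the input file name.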
7 days | Use open method of Path object, rather than the open function. | Arun Isaac
7 days | Test for existence of output files. | Arun Isaac
We were testing for zero exit status. Now, in addition, we test for the existence of output files. This is slightly more robust.
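The combined check might look like the sketch below. It is an assumption-laden stand-in: the project's tests import CliRunner from click.testing, whereas this sketch runs a child Python process from the standard library so that it stays self-contained. The function name `run_and_check` and the one-line child script are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_check(output_name: str) -> bool:
    """Run a stand-in command and check both exit status and output file."""
    with tempfile.TemporaryDirectory() as tmp:
        output = Path(tmp) / output_name
        # Stand-in for a CLI invocation: a child process that writes one file
        # to the path given as its first argument.
        result = subprocess.run(
            [sys.executable, "-c",
             "import sys; open(sys.argv[1], 'w').write('ok')",
             str(output)],
            check=False,
        )
        # A zero exit status alone does not prove the file was written,
        # so the existence of the output is checked as well.
        return result.returncode == 0 and output.exists()
```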
7 days | Add phenotype file format and serialization functions. | Arun Isaac
2025-08-06 | Subset to common SNPs. | Arun Isaac
* pyhegp/pyhegp.py: Import reduce from functools. (pool_summaries, encrypt_genotype): New functions. (pool): Use pool_summaries. (encrypt): Use encrypt_genotype.
* tests/test_pyhegp.py: Import pandas; Summary, read_summary and read_genotype from pyhegp.serialization. (test_pool, test_encrypt): New tests.
* test-data/encrypt-test-encrypted-genotype.tsv, test-data/encrypt-test-genotype.tsv, test-data/encrypt-test-key, test-data/encrypt-test-summary, test-data/pool-test-complete-summary, test-data/pool-test-summary1, test-data/pool-test-summary2: New files.
2025-08-06 | Standardize key files. | Arun Isaac
* doc/file-formats.md (File formats)[key file]: New section.
* pyhegp/serialization.py: Import numpy. (read_key, write_key): New functions.
* pyhegp/pyhegp.py: Import write_key from pyhegp.serialization. (encrypt): Use write_key.
* tests/test_serialization.py: Import arrays and array_shapes from hypothesis.extra.numpy; approx from pytest; read_key and write_key from pyhegp.serialization. (test_read_write_key_are_inverses): New test.
2025-08-06 | Compute summary on encryption if not provided. | Arun Isaac
* pyhegp/pyhegp.py (genotype_summary): New function. (summary): Use genotype_summary. (encrypt): Compute summary if not provided.
* tests/test_pyhegp.py (test_simple_workflow): Remove xfail mark.
2025-08-06 | Add simple workflow. | Arun Isaac
* README.md (How to use): Indent down into "Joint/federated analysis with many data owners" section. [Simple data sharing]: New section.
* doc/generate-images.sh: Add simple workflow.
* doc/workflow.png: Rename to doc/joint-workflow.png.
* doc/workflow.uml: Rename to doc/joint-workflow.uml.
* doc/simple-workflow.png, doc/simple-workflow.uml: New files.
* tests/test_pyhegp.py: Import pytest. (test_simple_workflow): New test.
* test-data/genotype.tsv: New file.
2025-08-06 | Test joint workflow CLI. | Arun Isaac
* tests/test_pyhegp.py: Import CliRunner from click.testing, and main from pyhegp.pyhegp. (test_joint_workflow): New test.
* test-data/genotype0.tsv, test-data/genotype1.tsv, test-data/genotype2.tsv, test-data/genotype3.tsv: New files.
2025-08-06 | Standardize file formats in the likeness of plink files. | Arun Isaac
* pyhegp/pyhegp.py: Import pandas. (summary, pool, encrypt, cat): Use pandas data frames and new data format.
* pyhegp/serialization.py: Import csv and pandas. (Summary)[mean, std]: Delete fields. [data]: New field. (read_summary, write_summary, read_genotype, write_genotype): Use pandas data frames and new data format.
* tests/test_serialization.py: Import column, columns and data_frames from hypothesis.extra.pandas; pandas; negate from pyhegp.utils. Do not import hypothesis.extra.numpy and approx from pytest. (tabless_printable_ascii_text, chromosome_column, position_column, reference_column, sample_names): New variables. (summaries, genotype_reserved_column_name_p, genotype_frames): New functions. (test_read_write_summary_are_inverses): Use pandas data frames and new data format. (test_read_write_genotype_are_inverses): Use pandas for testing.
* doc/file-formats.md (File formats)[summary file]: Describe new standard. [genotype file]: New section.
* .guix/pyhegp-package.scm (pyhegp-package): Import python-pandas from (gnu packages python-science). (python-pyhegp)[propagated-inputs]: Add python-pandas.
* pyproject.toml (dependencies): Add pandas.
2025-08-06 | Move negate to pyhegp.utils. | Arun Isaac
* tests/test_pyhegp.py (negate): Move to pyhegp.utils. Import negate from pyhegp.utils.
* pyhegp/utils.py: New file.
2025-08-06 | Loosen relative tolerance in test_pool_stats. | Arun Isaac
* tests/test_pyhegp.py (test_pool_stats): Set relative tolerance to 1e-6.
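To show what a relative tolerance of 1e-6 permits, here is a small sketch using `math.isclose` from the standard library. It is a stand-in: whether test_pool_stats compares values with `pytest.approx` or some other helper is not shown in this log, and the numbers below are made up for illustration.

```python
import math

# Illustrative values only; they are not taken from test_pool_stats.
expected = 1.0
computed = 1.0 + 5e-7  # differs from expected by about 5 parts in 10 million

# Accepted at a relative tolerance of 1e-6 ...
assert math.isclose(computed, expected, rel_tol=1e-6)

# ... but rejected at math.isclose's default relative tolerance of 1e-9.
assert not math.isclose(computed, expected, rel_tol=1e-9)
```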
2025-08-01 | Test that read_genotype and write_genotype are inverses. | Arun Isaac
* tests/test_serialization.py: Import read_genotype and write_genotype from pyhegp.serialization. (test_read_write_genotype_are_inverses): New test.
2025-08-01 | Test solution of linear system after encryption. | Arun Isaac
* tests/test_pyhegp.py: Import math. (square_matrices, negate, is_singular): New functions. (test_conservation_of_solutions): New test.
2025-08-01 | Separate standardization from encryption. | Arun Isaac
* pyhegp/pyhegp.py (hegp_encrypt, hegp_decrypt): Do not standardize or unstandardize. (encrypt): Standardize before calling hegp_encrypt.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Do not pass mean and standard deviation for standardization and unstandardization.
2025-08-01 | Do not test encryption on order 1 matrices. | Arun Isaac
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Do not test encryption on order 1 matrices.
2025-07-17 | Standardize before encryption. | Arun Isaac
* pyhegp/pyhegp.py (hegp_encrypt): Standardize before encryption. (hegp_decrypt): Unstandardize after decryption. (encrypt): Pass in mean and standard deviation from summary file to hegp_encrypt.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Pass in mean and standard deviation to hegp_encrypt.
2025-07-17 | Add standardization. | Arun Isaac
* pyhegp/pyhegp.py (standardize): Standardize using mean and standard deviation, instead of the minor allele frequency. (unstandardize): New function.
* tests/test_pyhegp.py: Import standardize and unstandardize from pyhegp.pyhegp. (no_column_zero_standard_deviation): New function. (test_standardize_unstandardize_are_inverses): New test.
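The inverse relationship that the new test checks can be sketched as follows. This is only an illustration under assumptions: the function names come from the commit message, but the signatures, the per-column layout and the use of plain lists instead of arrays are guesses, not the code in pyhegp/pyhegp.py.

```python
import math
import statistics


def standardize(matrix, mean, std):
    """Subtract each column's mean and divide by its standard deviation."""
    return [[(x - m) / s for x, m, s in zip(row, mean, std)]
            for row in matrix]


def unstandardize(matrix, mean, std):
    """Undo standardize: multiply by the standard deviation, add the mean."""
    return [[x * s + m for x, m, s in zip(row, mean, std)]
            for row in matrix]


# Made-up 3 x 2 matrix; no column has zero standard deviation, since
# dividing by zero would make standardization undefined.
genotype = [[0.0, 1.0], [1.0, 2.0], [2.0, 0.0]]
columns = list(zip(*genotype))
mean = [statistics.fmean(column) for column in columns]
std = [statistics.stdev(column) for column in columns]

roundtrip = unstandardize(standardize(genotype, mean, std), mean, std)
assert all(math.isclose(a, b, abs_tol=1e-12)
           for row_a, row_b in zip(roundtrip, genotype)
           for a, b in zip(row_a, row_b))
```

The zero standard deviation caveat is presumably why the test module adds a `no_column_zero_standard_deviation` helper, though its exact behaviour is not shown here.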
2025-07-17 | Add pool subcommand. | Arun Isaac
* pyhegp/pyhegp.py: Import namedtuple from collections, and read_summary from pyhegp.serialization. (Stats): New type. (pool_stats, pool): New functions.
* tests/test_pyhegp.py: Import Stats and pool_stats from pyhegp.pyhegp. (test_pool_stats): New test.
2025-07-17 | Implement the summary file format. | Arun Isaac
* doc/file-formats.md, pyhegp/serialization.py, tests/test_serialization.py: New files.
2025-07-17 | Use default array shapes testing encryption/decryption. | Arun Isaac
It may be better to sample a smaller set of matrices finely than a large set of matrices coarsely.
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Use default array shapes testing encryption/decryption.
2025-07-17 | Reduce maximum matrix size testing encryption/decryption. | Arun Isaac
* tests/test_pyhegp.py (test_hegp_encryption_decryption_are_inverses): Reduce maximum matrix size to 100.
2025-07-17 | Organize source into directory structure. | Arun Isaac
* pyhegp/__init__.py: New file.
* pyhegp.py: Move to pyhegp/pyhegp.py.
* test_pyhegp.py: Move to tests/test_pyhegp.py. Import from pyhegp.pyhegp instead of from pyhegp.
* pyproject.toml (project.scripts)[pyhegp]: Switch to pyhegp.pyhegp:main.