Age | Commit message | Author | |
---|---|---|---|
4 days | Deduplicate genotype frame metadata generation. | Arun Isaac | |
Abstract out generation of genotype frame metadata (namely chromosome, position and reference) from summaries and genotype_frames into a new helper function genotype_metadata. | |||
4 days | Drop SNPs with a zero standard deviation. | Arun Isaac | |
4 days | Fix typo in comment: tha->that. | Arun Isaac | |
5 days | Avoid wildcard import from helpers.strategies. | Arun Isaac | |
5 days | Limit values in genotype and phenotype strategies. | Arun Isaac | |
5 days | Test that ciphertext does not contain NA values. | Arun Isaac | |
5 days | Parameterize number of samples in phenotype frame strategy. | Arun Isaac | |
5 days | Parameterize number of samples in genotype frame strategy. | Arun Isaac | |
5 days | Parameterize presence of reference column in genotype frame strategy. | Arun Isaac | |
5 days | Add keys strategy. | Arun Isaac | |
Add keys strategy, and use it. | |||
5 days | Raise exception if data frame to be written has NA values. | Arun Isaac | |
This should never happen, but bugs in the code could cause it; we wish to protect against that. | |||
5 days | Add --force flag to encrypt subcommand permitting file overwriting. | Arun Isaac | |
6 days | Support encrypting phenotypes. | Arun Isaac | |
6 days | Compare complete frame in test_cat_*. | Arun Isaac | |
It is much simpler and more robust to compare the expected and actual data frames directly. | |||
6 days | Do not import unused settings from hypothesis. | Arun Isaac | |
6 days | Test cat_phenotype. | Arun Isaac | |
6 days | Add cat-phenotype subcommand. | Arun Isaac | |
7 days | Rename cat subcommand to cat-genotype. | Arun Isaac | |
A cat-phenotype subcommand is coming. Hence rename this. | |||
7 days | Add is_phenotype_metadata_column. | Arun Isaac | |
Promote phenotype_reserved_column_name_p from helpers.strategies to is_phenotype_metadata_column in pyhegp.serialization. | |||
7 days | Drop duplicates in generated test phenotype frames. | Arun Isaac | |
7 days | Set CI environment variable when building Guix package. | Arun Isaac | |
7 days | Merge, not concat, genotype frames. | Arun Isaac | |
pd.concat duplicates the metadata columns, and is generally the wrong approach to the problem. | |||
7 days | Test cat_genotype. | Arun Isaac | |
Test cat_genotype extensively using hypothesis. | |||
7 days | Add is_genotype_metadata_column. | Arun Isaac | |
Promote genotype_reserved_column_name_p from helpers.strategies to is_genotype_metadata_column in pyhegp.serialization, and use it everywhere. | |||
7 days | Drop duplicates in generated test genotype frames. | Arun Isaac | |
7 days | Catenate an empty list of genotypes. | Arun Isaac | |
We handle this as a special case. | |||
7 days | Move hypothesis strategies to separate file. | Arun Isaac | |
These strategies may be used by other test modules as well. | |||
7 days | Add cat_genotype workhorse function. | Arun Isaac | |
Move workhorse logic of the cat command to a separate function. This will make it easy to test the logic without having to invoke the command itself. | |||
7 days | Suffix CLI subcommand functions with _command. | Arun Isaac | |
We distinguish CLI subcommand functions using the _command suffix. This way, we don't have to concoct weird names for the actual workhorse functions. To remain consistent, we also suffix _command to the command testing functions. | |||
8 days | Do not require output ciphertext file path. | Arun Isaac | |
Make output ciphertext file path implicit; infer it by appending ".hegp" to the plaintext file. We take inspiration from GnuPG. | |||
8 days | Pass dtype to read_csv. | Arun Isaac | |
read_csv can incorrectly infer that the string "00" is the integer 0. To avoid this ambiguity, pass the correct dtype to read_csv. | |||
8 days | Use open method of Path object, rather than the open function. | Arun Isaac | |
8 days | Do not skip blank lines when reading TSV files. | Arun Isaac | |
8 days | Decide to not use logging. | Arun Isaac | |
Remove comments mentioning logging. Command-line error messages have their own place; they are not the same as logging. | |||
8 days | Test for existence of output files. | Arun Isaac | |
We were testing for zero exit status. Now, in addition, we test for the existence of output files. This is slightly more robust. | |||
8 days | Title case sentence. | Arun Isaac | |
8 days | Add phenotype file format and serialization functions. | Arun Isaac | |
2025-08-08 | Clarify that the test suite is not for end users. | Arun Isaac | |
End users who install pyhegp via pip cannot run the test suite. Clarify this in the README. Perhaps, in the future, we should move these developer-oriented instructions to a separate document. | |||
2025-08-08 | Separate table of contents from introduction. | Arun Isaac | |
If not separated, GitHub combines the table of contents with the list of papers in the introduction. | |||
2025-08-08 | Add table of contents to README. | Arun Isaac | |
A table of contents gives people a brief overview of what's in the README, and allows them to jump to the section they are interested in. | |||
2025-08-08 | Replace csv extension with tsv extension on genotype files. | Arun Isaac | |
2025-08-08 | Remove txt extension from summary files. | Arun Isaac | |
2025-08-08 | Link to file formats documentation from README. | Arun Isaac | |
Readers are more likely to follow through to the file formats documentation if there is a link. | |||
2025-08-08 | Add example key file. | Arun Isaac | |
2025-08-08 | Add example genotype file. | Arun Isaac | |
2025-08-08 | Add example summary file. | Arun Isaac | |
2025-08-08 | Reduce precision in test data files. | Arun Isaac | |
Reducing precision lowers the file size and makes the files more human-comprehensible. | |||
2025-08-08 | Add instructions to install via Guix. | Arun Isaac | |
2025-08-08 | Mark virtual environment creation as optional. | Arun Isaac | |
Not everyone may want to create a virtual environment. For example, on some HPC machines, creating a virtual environment is complicated or does not work. | |||
2025-08-08 | Package as a CLI utility only, not a Python library. | Arun Isaac | |
We have not exposed a Python library interface, and it is not clear if we need to. We can revisit this decision later, if need be. |
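
Two of the commits above describe output file handling: the ciphertext path is inferred by appending ".hegp" to the plaintext file name, and the `--force` flag permits overwriting an existing file. The following is a minimal sketch of how that behaviour might look, not the actual pyhegp implementation. The function names `ciphertext_path` and `check_writable` are hypothetical, as is the choice of `FileExistsError`; only the ".hegp" suffix and the `--force` flag come from the commit messages.

```python
from pathlib import Path


def ciphertext_path(plaintext: Path) -> Path:
    """Return the plaintext path with ".hegp" appended to its file name.

    Hypothetical helper, not taken from the pyhegp source. The suffix is
    appended to the whole name, so any existing extension is kept.
    """
    return plaintext.with_name(plaintext.name + ".hegp")


def check_writable(output: Path, force: bool = False) -> Path:
    """Refuse to overwrite an existing file unless force is set.

    Hypothetical helper; the exception type is an assumption.
    """
    if output.exists() and not force:
        raise FileExistsError(f"{output} exists; pass --force to overwrite")
    return output


print(ciphertext_path(Path("genotype.tsv")))  # genotype.tsv.hegp
```

Appending to the full file name rather than replacing the extension mirrors the GnuPG convention the commit message cites, and it keeps the original extension recoverable from the ciphertext file name.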