about summary refs log tree commit diff
path: root/doc
AgeCommit message (Collapse)Author
2025-09-08Add @startuml and @enduml tags.Arun Isaac
2025-09-01Title case sentence.Arun Isaac
2025-09-01Add phenotype file format and serialization functions.Arun Isaac
2025-08-08Add example key file.Arun Isaac
2025-08-08Add example genotype file.Arun Isaac
2025-08-08Add example summary file.Arun Isaac
2025-08-06Standardize key files.Arun Isaac
* doc/file-formats.md (File formats)[key file]: New section. * pyhegp/serialization.py: Import numpy. (read_key, write_key): New functions. * pyhegp/pyhegp.py: Import write_key from pyhegp.serialization. (encrypt): Use write_key. * tests/test_serialization.py: Import arrays and array_shapes from hypothesis.extra.numpy; approx from pytest; read_key and write_key from pyhegp.serialization. (test_read_write_key_are_inverses): New test.
2025-08-06Add simple workflow.Arun Isaac
* README.md (How to use): Indent down into "Joint/federated analysis with many data owners" section. [Simple data sharing]: New section. * doc/generate-images.sh: Add simple workflow. * doc/workflow.png: Rename to doc/joint-workflow.png. * doc/workflow.uml: Rename to doc/joint-workflow.uml. * doc/simple-workflow.png, doc/simple-workflow.uml: New files. * tests/test_pyhegp.py: Import pytest. (test_simple_workflow): New test. * test-data/genotype.tsv: New file.
2025-08-06Standardize file formats in the likeness of plink files.Arun Isaac
* pyhegp/pyhegp.py: Import pandas. (summary, pool, encrypt, cat): Use pandas data frames and new data format. * pyhegp/serialization.py: Import csv and pandas. (Summary)[mean, std]: Delete fields. [data]: New field. (read_summary, write_summary, read_genotype, write_genotype): Use pandas data frames and new data format. * tests/test_serialization.py: Import column, columns and data_frames from hypothesis.extra.pandas; pandas; negate from pyhegp.utils. Do not import hypothesis.extra.numpy and approx from pytest. (tabless_printable_ascii_text, chromosome_column, position_column, reference_column, sample_names): New variables. (summaries, genotype_reserved_column_name_p, genotype_frames): New functions. (test_read_write_summary_are_inverses): Use pandas data frames and new data format. (test_read_write_genotype_are_inverses): Use pandas for testing. * doc/file-formats.md (File formats)[summary file]: Describe new standard. [genotype file]: New section. * .guix/pyhegp-package.scm (pyhegp-package): Import python-pandas from (gnu packages python-science). (python-pyhegp)[propagated-inputs]: Add python-pandas. * pyproject.toml (dependencies): Add pandas.
2025-08-01Tab-separate data section of summary files.Arun Isaac
* pyhegp/serialization.py (read_summary, write_summary): Use tab as the delimiter. * doc/file-formats.md (File formats)[summary file]: Update documentation.
2025-07-17Document usage instructions and workflow.Arun Isaac
* doc/workflow.uml, doc/workflow.png, doc/generate-images.sh: New files. * README.md (How to use): New section.
2025-07-17Implement the summary file format.Arun Isaac
* doc/file-formats.md, pyhegp/serialization.py, tests/test_serialization.py: New files.