Age | Commit message | Author |
|
|
Add keys strategy, and use it.
|
|
This should never occur, but bugs in the code could cause it; we
wish to protect against that.
|
|
It is much simpler and more robust to compare the expected and
actual data frames directly.
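A minimal sketch of such a comparison, assuming pandas data frames and
pandas.testing.assert_frame_equal. The frames and column names are
illustrative and are not taken from the pyhegp test suite:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical expected result of the operation under test.
expected = pd.DataFrame({"chromosome": ["chr1", "chr2"],
                         "position": [100, 200],
                         "sample1": [0.0, 1.0]})

# Hypothetical actual result; built by hand here purely for illustration.
actual = pd.DataFrame({"chromosome": ["chr1", "chr2"],
                       "position": [100, 200],
                       "sample1": [0.0, 1.0]})

# Raises an AssertionError describing the mismatch if the frames differ
# in values, dtypes, column labels or index.
assert_frame_equal(expected, actual)
```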
|
|
A cat-phenotype subcommand is coming. Hence rename this.
|
|
Promote phenotype_reserved_column_name_p from helpers.strategies to
is_phenotype_metadata_column in pyhegp.serialization.
|
|
pd.concat duplicates the metadata columns, and is generally the wrong
approach to the problem.
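A small sketch of the duplication, assuming column-wise concatenation of
two frames that share metadata columns. The column names are
hypothetical, and pd.merge is shown only as one possible alternative,
not necessarily what this commit uses:

```python
import pandas as pd

# Two hypothetical genotype frames sharing the same metadata columns.
a = pd.DataFrame({"chromosome": ["chr1", "chr1"],
                  "position": [100, 200],
                  "sample1": [0.0, 1.0]})
b = pd.DataFrame({"chromosome": ["chr1", "chr1"],
                  "position": [100, 200],
                  "sample2": [2.0, 1.0]})

# Column-wise concat keeps both copies of the metadata columns.
concatenated = pd.concat([a, b], axis=1)

# Joining on the metadata columns keeps a single copy of each.
merged = pd.merge(a, b, on=["chromosome", "position"])

print(list(concatenated.columns))
print(list(merged.columns))
```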
|
|
Test cat_genotype extensively using hypothesis.
|
|
Promote genotype_reserved_column_name_p from helpers.strategies to
is_genotype_metadata_column in pyhegp.serialization, and use it
everywhere.
|
|
We handle this as a special case.
|
|
These strategies may be used by other test modules as well.
|
|
Move workhorse logic of the cat command to a separate function. This
will make it easy to test the logic without having to invoke the
command itself.
|
|
We distinguish CLI subcommand functions using the _command suffix.
This way, we don't have to concoct weird names for the actual
workhorse functions.
To remain consistent, we also suffix _command to the command testing
functions.
|
|
Make output ciphertext file path implicit; infer it by appending
".hegp" to the plaintext file. We take inspiration from GnuPG.
|
|
read_csv can incorrectly infer that the string "00" is the integer 0.
To avoid this ambiguity, pass the correct dtype to read_csv.
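A minimal sketch of the ambiguity, assuming a tab-separated file with a
hypothetical "sample-id" column (the real column names and dtypes used
by pyhegp may differ):

```python
import io
import pandas as pd

# Hypothetical input in which identifiers have leading zeros.
tsv = "sample-id\tvalue\n00\t1.5\n01\t2.5\n"

# Without a dtype, read_csv infers integers and the leading zeros are lost.
inferred = pd.read_csv(io.StringIO(tsv), sep="\t")

# With an explicit dtype, the identifiers are kept as strings.
explicit = pd.read_csv(io.StringIO(tsv), sep="\t",
                       dtype={"sample-id": str})

print(inferred["sample-id"].tolist())
print(explicit["sample-id"].tolist())
```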
|
|
Remove comments mentioning logging.
Command-line error messages have their own place; they are not the
same as logging.
|
|
We were testing for zero exit status. Now, in addition, we test for
the existence of output files. This is slightly more robust.
|
|
End users who install pyhegp via pip cannot run the test suite.
Clarify this in the README. Perhaps, in the future, we should move
these developer-oriented instructions to a separate document.
|
|
If not separated, GitHub combines the table of contents with the list
of papers in the introduction.
|
|
A table of contents gives people a brief overview of what's in the
README, and allows them to jump to the section they are interested in.
|
|
Readers are more likely to follow through to the file formats
documentation if there is a link.
|
|
Reducing precision lowers the file size and makes the files more
human-comprehensible.
|
|
Not everyone may want to create a virtual environment. For example, on
some HPC machines, creating a virtual environment is complicated or
does not work.
|
|
We have not exposed a Python library interface, and it is not clear if
we need to. We can revisit this decision later, if need be.
|
|
* pyhegp/pyhegp.py: Import reduce from functools.
(pool_summaries, encrypt_genotype): New functions.
(pool): Use pool_summaries.
(encrypt): Use encrypt_genotype.
* tests/test_pyhegp.py: Import pandas; Summary, read_summary and
read_genotype from pyhegp.serialization.
(test_pool, test_encrypt): New tests.
* test-data/encrypt-test-encrypted-genotype.tsv,
test-data/encrypt-test-genotype.tsv, test-data/encrypt-test-key,
test-data/encrypt-test-summary, test-data/pool-test-complete-summary,
test-data/pool-test-summary1, test-data/pool-test-summary2: New files.
|
|
* doc/file-formats.md (File formats)[key file]: New section.
* pyhegp/serialization.py: Import numpy.
(read_key, write_key): New functions.
* pyhegp/pyhegp.py: Import write_key from pyhegp.serialization.
(encrypt): Use write_key.
* tests/test_serialization.py: Import arrays and array_shapes from
hypothesis.extra.numpy; approx from pytest; read_key and write_key
from pyhegp.serialization.
(test_read_write_key_are_inverses): New test.
|
|
* pyhegp/pyhegp.py (genotype_summary): New function.
(summary): Use genotype_summary.
(encrypt): Compute summary if not provided.
* tests/test_pyhegp.py (test_simple_workflow): Remove xfail mark.
|