From 1ed0e16a4707222e07a68f57d231af1cd00fea73 Mon Sep 17 00:00:00 2001
From: Arun Isaac
Date: Mon, 14 Jul 2025 14:25:27 +0100
Subject: Implement the summary file format.

* doc/file-formats.md, pyhegp/serialization.py, tests/test_serialization.py: New files.
---
 doc/file-formats.md | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 doc/file-formats.md

(limited to 'doc')

diff --git a/doc/file-formats.md b/doc/file-formats.md
new file mode 100644
index 0000000..27dfe2a
--- /dev/null
+++ b/doc/file-formats.md
@@ -0,0 +1,11 @@
+# File formats
+## summary file
+
+The summary file is ASCII encoded. It consists of two sections—the header and the data. Lines MUST be terminated in the Unix style with a new line (aka line feed) character. Lines in the header section MUST be prefixed with `#`.
+
+The first line of the header section MUST be `# pyhegp summary file version 1`. Subsequent lines of the header section are a list of key-value pairs. Each line MUST be `#`, optional whitespace, the key, a single space character and then the value. The key MUST NOT contain whitespace or control characters, and MUST NOT begin with a `#` character. The value MAY contain whitespace characters, but MUST NOT contain control characters.
+
+The data section is a space separated table of numbers. The first line of the data section is a vector of means—one for each SNP. The second line is a vector of standard deviations—one for each SNP.
+
+Here is an example summary file.
+`TODO: Add example.`
-- 
cgit v1.2.3
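
The format rules in this patch can be read as a small parser. The following is a minimal sketch written from the prose of `doc/file-formats.md` only. It is not the contents of `pyhegp/serialization.py`, which this path-limited view does not show. The names `parse_summary`, `VERSION_LINE` and `HEADER_LINE` are hypothetical, as is the `number-of-samples` key in the sample text. The sketch also assumes that a header value may be empty, that "whitespace" before the key means spaces or tabs, and that the data section holds exactly two lines.

```python
import re

VERSION_LINE = "# pyhegp summary file version 1"

# '#', optional whitespace, a key of printable non-space ASCII that does not
# start with '#', a single space, then a value of printable ASCII and spaces.
# Assumption: the value may be empty; the spec text does not say either way.
HEADER_LINE = re.compile(r'#[ \t]*(?P<key>[!-"$-~][!-~]*) (?P<value>[ -~]*)')


def parse_summary(text):
    """Parse summary-file text into (headers, means, standard_deviations)."""
    if not text.isascii():
        raise ValueError("summary file must be ASCII encoded")
    if "\r" in text or not text.endswith("\n"):
        raise ValueError("every line must end with a single line feed")
    lines = text[:-1].split("\n")
    if lines[0] != VERSION_LINE:
        raise ValueError("missing or unsupported version line")

    # Header section: every following line that starts with '#'.
    headers = {}
    index = 1
    while index < len(lines) and lines[index].startswith("#"):
        match = HEADER_LINE.fullmatch(lines[index])
        if match is None:
            raise ValueError(f"malformed header line: {lines[index]!r}")
        headers[match["key"]] = match["value"]
        index += 1

    # Data section: one line of means, one line of standard deviations.
    data = lines[index:]
    if len(data) != 2:
        raise ValueError("data section must have exactly two lines")
    means = [float(field) for field in data[0].split(" ")]
    stds = [float(field) for field in data[1].split(" ")]
    if len(means) != len(stds):
        raise ValueError("means and standard deviations differ in length")
    return headers, means, stds


# Hypothetical sample text for three SNPs; not an official example.
SAMPLE = (
    "# pyhegp summary file version 1\n"
    "# number-of-samples 100\n"
    "0.5 1.25 0.75\n"
    "0.1 0.2 0.3\n"
)
headers, means, stds = parse_summary(SAMPLE)
```

Splitting on a single space rather than on arbitrary whitespace keeps the sketch strict: doubled spaces or tabs in the data section yield an empty or non-numeric field and fail in `float`, which matches the "space separated" wording.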