path: root/email
Age | Commit message | Author
2023-09-03email: Tolerate parentheses in display names.•••* email/email.scm (define-atom-pattern): Support customization of the atext pattern as well. (define-phrase-pattern): New macro. (obs-phrase): Define using define-phrase-pattern. (liberal-atext, liberal-cfws-captured-atom, liberal-cfws-captured-word, liberal-phrase): New patterns. (display-name): Use liberal-phrase instead of phrase. * tests/email.scm ("tolerate email addresses with parentheses in name"): New test. Arun Isaac
2023-01-06email: Support Date fields with missing seconds.•••* email/email.scm (parse-email-headers): Extend the date-time parser to match when seconds are missing, defaulting to "0". * tests/email.scm ("parse Date", "parse Date without seconds"): New tests. Signed-off-by: Arun Isaac <arunisaac@systemreboot.net> Andrew Whatson
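To illustrate the entry above: RFC 5322 makes the seconds part of `time-of-day` optional, so a tolerant parser can treat a missing value as zero. The following is a minimal sketch in Python, not the project's Guile Scheme code; `parse_time_of_day` and `TIME_OF_DAY` are hypothetical names introduced only for this example.

```python
import re

# hh:mm with an optional :ss suffix (hypothetical pattern for illustration).
TIME_OF_DAY = re.compile(r"(\d{2}):(\d{2})(?::(\d{2}))?")


def parse_time_of_day(text):
    """Return (hours, minutes, seconds), defaulting seconds to 0 when absent."""
    match = TIME_OF_DAY.fullmatch(text)
    if match is None:
        raise ValueError(f"not a time of day: {text!r}")
    # groups(default="0") substitutes "0" for the seconds group when it did
    # not take part in the match, mirroring the default named in the entry.
    hours, minutes, seconds = match.groups(default="0")
    return int(hours), int(minutes), int(seconds)
```

Under these assumptions, `parse_time_of_day("21:36")` and `parse_time_of_day("21:36:00")` return the same tuple.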
2023-01-03email: Support quoted-printable CR LF sequences.•••* email/quoted-printable.scm (quoted-printable-decode): Ignore "=\r\n" sequences in the input. * tests/quoted-printable.scm ("quoted-printable decoding of soft line breaks (=\\n)", "quoted-printable decoding of soft line breaks (=\\r\\n)"): New tests. Signed-off-by: Arun Isaac <arunisaac@systemreboot.net> Andrew Whatson
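The soft-line-break rule in the entry above can be sketched as follows. This is Python rather than the project's Guile Scheme, and `decode_qp` is a hypothetical helper, not guile-email's `quoted-printable-decode`. It assumes only the RFC 2045 rule that an `=` immediately followed by a line ending is dropped from the decoded output.

```python
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def decode_qp(data: bytes) -> bytes:
    """Decode quoted-printable bytes, accepting "=\\n" and "=\\r\\n" soft breaks."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte != ord("="):
            out.append(byte)              # ordinary byte, copy through
            i += 1
        elif data[i + 1:i + 3] == b"\r\n":
            i += 3                        # soft line break with CR LF
        elif data[i + 1:i + 2] == b"\n":
            i += 2                        # soft line break with bare LF
        else:
            pair = data[i + 1:i + 3]
            if len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
                out.append(int(pair, 16))  # "=XX" escape
                i += 3
            else:
                out.append(byte)          # malformed escape: keep "=" literally
                i += 1
    return bytes(out)
```

Keeping a malformed `=` literally is a design choice of this sketch. The log does not say how the library treats that case.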
2021-10-24email: Handle Received header with two tokens but no timestamp.•••* email/email.scm (parse-email-headers): Match Received header with timestamp more precisely. * tests/email.scm ("Parse Received header with two tokens but no timestamp"): New test. Arun Isaac
2021-10-24email: Handle unrecognized Content-Transfer-Encoding headers.•••* email/email.scm (handle-invalid-headers): New function. (parse-email-headers): Handle invalid headers. * tests/email.scm ("Assume application/octet-stream Content-Type if Content-Transfer-Encoding is unrecognized"): New test. Arun Isaac
2021-10-02email: Do not use an empty bytevector to test the charset.•••Using an empty bytevector no longer throws an exception since Guile commit 5ea8c69e9153a970952bf6f0b32c4fad6a28e839. * email/email.scm (post-process-content-transfer-encoding): Use a bytevector of unit length to test the charset validity. Signed-off-by: Arun Isaac <arunisaac@systemreboot.net> Mathieu Othacehe
2021-03-15email: Use only cfws-captured-words in obs-phrase.•••* email/email.scm (obs-phrase): Replace word with cfws-captured-word. * tests/email.scm ("Parse names with more than two words"): New test. Arun Isaac
2020-12-05email: Indent better.•••* email/email.scm (define-cfws-pattern): Indent better. Arun Isaac
2020-12-05email: Give higher precedence to obsolete id-left, id-right patterns.•••* email/email.scm (id-left, id-right): Give higher precedence to obsolete patterns. Arun Isaac
2020-12-05email: Support remaining obsolete specification.•••* email/email.scm (obs-phrase-list, obs-utext, obs-unstruct, obs-optional): New macros. (unstructured, in-reply-to, references, keywords, optional-field): Include obsolete patterns. Arun Isaac
2020-12-05email: Support obsolete Received header.•••* email/email.scm (received): Include obsolete pattern. (parse-mime-entity): Post process obsolete received forms. Arun Isaac
2020-12-05email: Do not capture cfws in atoms and dot-atoms.•••* email/email.scm (define-atom-pattern): Do not capture cfws unless specified. (atom): Do not specify cfws. (define-dot-atom-pattern): Do not capture cfws. (define-word-pattern): New macro. (cfws-captured-atom, cfws-captured-word): New patterns. (obs-phrase): Use cfws-captured-word. (received-token): Capture all. (parse-mime-entity): Post process received and received-token. * tests/email.scm ("parse email headers"): Fix test. Arun Isaac
2020-12-05email: Support obsolete date and time.•••* email/email.scm (obs-day-of-week, obs-day, obs-year, obs-hour, obs-minute, obs-second, obs-zone): New macros. (day-of-week, day, year, hours, minutes, seconds, zone): Include obsolete pattern. (parse-email-headers): Handle obsolete two and three digit years, and alphabetic time zone specifiers. * tests/email.scm ("RFC5322 A.6.2. Obsolete dates"): New test. Arun Isaac
2020-12-05email: Support obsolete addressing.•••* email/email.scm (obs-qp, obs-fws, obs-no-ws-ctl, obs-ctext, obs-qtext, obs-phrase, obs-local-part, obs-dtext, obs-domain, obs-domain-list, obs-route, obs-angle-addr, captured-atom, captured-obs-domain, captured-domain, obs-mbox-list, obs-group-list, obs-addr-list, obs-id-left, obs-id-right): New patterns. (quoted-pair, fws, ctext, qtext, phrase, dtext, define-angle-addr-pattern, mailbox-list, group-list, address-list, define-field-pattern, from, sender, bcc, id-left, id-right, resent-from, resent-sender, resent-bcc, obs-resent-rply): Include obsolete pattern. (define-printable-ascii-character-pattern-with-obsolete, define-atom-pattern, define-obs-domain-pattern): New macros. (define-domain-pattern): Accept obs-domain as a new argument. (fields): Include obs-resent-rply. * tests/email.scm ("RFC5322 A.6.1. Obsolete addressing"): New test. ("parse email addresses with period in name"): Mark as passing. Arun Isaac
2020-05-25tests: Test inputs of different lengths.•••* tests/base64.scm ("base64 random bytevector: base64-encode and base64-decode are inverses of each other", "base64 random bytevector: encoded output should not be more than 76 columns wide", "base64 random bytevector: encoded output must only consist of characters from the base64 alphabet"): Test inputs of different lengths. * tests/quoted-printable.scm ("quoted-printable random bytevector: quoted-printable-encode and quoted-printable-decode are inverses of each other", "quoted-printable random bytevector: encoded output should not be more than 76 columns wide", "quoted-printable random bytevector: encoded output must only consist of printable ASCII characters", "q-encoding random bytevector: q-encoding-encode and q-encoding-decode are inverses of each other"): Test inputs of different lengths. Arun Isaac
2020-05-25email: Decode base64 bytevector without converting to string.•••The new base64 decoder can directly operate on bytevectors in addition to strings. This feature may not remain forever, but it greatly improves performance. So, it stays for now. * email/email.scm (decode-body): Decode base64 encoded body directly without converting to an intermediate string. Arun Isaac
2020-05-25email: Do not filter base64 encoded bytes before decoding.•••The new base64 decoder skips invalid characters safely. * email/email.scm (decode-body): Do not filter base64 encoded body to remove invalid base64 characters. Arun Isaac
2020-05-25base64: Reimplement from scratch.•••* email/base64.scm: Replace file. Arun Isaac
2020-05-25utils: Do not match sequence byte by byte in read-bytes-till.•••* email/utils.scm (bytevector-match, bytevector-overlap, lookahead-bytevector-n): New functions. (read-bytes-till): Do not match sequence byte by byte. Process blocks of bytes at a time. Arun Isaac
2020-05-25utils: Introduce the not-end-let utility.•••* email/utils.scm (not-end-let): New macro. * .dir-locals.el (scheme-mode): Indent not-end-let correctly. Arun Isaac
2020-05-25utils: Do not return eof if matched at beginning.•••* email/utils.scm (read-while, read-bytes-till): Do not return eof if matched at beginning. Return empty string or bytevector respectively. * tests/utils.scm ("read-bytes-till returns empty bytevector on match at beginning", "read-while returns empty string on match at beginning"): New tests. Arun Isaac
2019-12-16base64: Import only the required rnrs modules.•••* email/base64.scm: Import (rnrs arithmetic bitwise), (rnrs arithmetic fixnums), (rnrs base), (rnrs bytevectors) and (rnrs io ports), not all of (rnrs). Arun Isaac
2019-12-04email: Handle blank Subject headers.•••* email/email.scm (post-process-fields): Treat blank Subject headers as having the null string as value. * tests/email.scm ("blank Subject header must be treated as having the null string as value"): New test. Reported-by: Ricardo Wurmus <rekado@elephly.net> Arun Isaac
2019-10-09email: Return keywords header as a list.•••* email/email.scm (parse-email-headers): Return keywords header as a list of strings. * tests/email.scm ("keywords header must be a list"): New test. Arun Isaac
2019-10-08Reindent calls to call-with-port.•••* email/email.scm (body->mime-entities, email->headers+body): Reindent calls to call-with-port. * email/quoted-printable.scm (quoted-printable-encode, q-encoding-encode): Reindent calls to call-with-port. * tests/utils.scm ("read-bytes-till returns eof-object on end of file"): Reindent call to call-with-port. Arun Isaac
2019-10-08email: Override invalid charset more strongly.•••* email/email.scm (post-process-content-type): Use alist-combine to override charset more strongly than just appending to the alist. * tests/email.scm ("tolerate invalid charset"): Update test. Arun Isaac
2019-10-08email: Introduce alist union utility.•••* email/utils.scm (alist-combine): New function. (alist-delete*): Delete function. * email/email.scm (add-default-headers, add-default-mime-entity-headers): Use alist-combine. Arun Isaac
2019-10-08email: Deduplicate post processing of header fields.•••* email/email.scm (post-process-fields): New function. (parse-mime-entity, decode-body): Invoke post-process-fields. Arun Isaac
2019-10-02email: Tolerate decoding errors in body.•••* email/email.scm (decode-body): Tolerate decoding errors in the body using the substitute conversion strategy. * tests/email.scm ("tolerate decoding errors in body"): New test. Arun Isaac
2019-10-01email: Tolerate invalid charset.•••* email/email.scm (post-process-content-type): If charset is invalid, assume default UTF-8 as charset. * tests/email.scm ("tolerate invalid charset"): New test. Reported-by: Ricardo Wurmus <rekado@elephly.net> Arun Isaac
2019-09-28email: Tolerate decoding errors in MIME encoded words.•••* email/email.scm (decode-mime-encoded-word): Tolerate decoding errors in MIME encoded words using the substitute conversion strategy. * tests/email.scm ("tolerate decoding errors in MIME encoded words"): New test. Reported-by: Christopher Baines <mail@cbaines.net> Arun Isaac
2019-09-28email: Remove duplicate unbracketed-angle-addr definition.•••* email/email.scm (unbracketed-angle-addr): Delete duplicate definition. Arun Isaac
2019-09-23email: Update mbox->emails docstring.•••The earlier docstring was one meant for read-next-email-in-mbox. * email/email.scm (mbox->emails): Update docstring. Arun Isaac
2019-09-23email: Add read-next-email-in-mbox docstring.•••* email/email.scm (read-next-email-in-mbox): Add docstring. Arun Isaac
2019-09-23email: Tolerate non-ASCII non-UTF-8 characters in headers.•••* email/email.scm (email->headers+body): If non-ASCII non-UTF-8 characters occur in the headers, do not raise a decoding error. Work around using the substitute conversion strategy. * tests/email.scm ("tolerate non-ASCII characters in headers"): Rename to "decode utf-8 characters in headers". ("tolerate non-ascii non-utf-8 characters in headers"): New test. Reported-by: Christopher Baines <mail@cbaines.net> Arun Isaac
2019-09-17email: Tolerate non-ASCII characters in headers.•••We tolerate non-ASCII characters in headers in order to support Emacs message mode parens style addresses. * email/email.scm (email->headers+body): Read headers as UTF-8 characters. * tests/email.scm ("tolerate non-ascii characters in headers"): New tests. Reported-by: Christopher Baines <mail@cbaines.net> Arun Isaac
2019-08-07doc: Document mbox->emails.•••* doc/guile-email.texi (Reading Email): New chapter. * email/email.scm (mbox->emails): Add docstring. Arun Isaac
2019-08-07utils: Clarify read-while docstring.•••* email/utils.scm (read-while): Clarify docstring. Arun Isaac
2019-07-28email: Improve comment about default charset.•••* email/email.scm (post-process-content-type): Mention that RFC6657 specifies UTF-8 as the default charset only for text/* media types. Arun Isaac
2019-07-28email: Read mboxes as bytevectors.•••* email/email.scm (read-next-email-in-mbox): Read bytes from mboxes, not characters. Arun Isaac
2019-07-28utils: Return eof-object from read-bytes-till on end of file.•••* email/utils.scm (read-bytes-till): Return eof-object, not #vu8(), on end of file. * tests/utils.scm: New file. * Makefile.am (SCM_TESTS): Register it. Arun Isaac
2019-07-28email: Decode MIME entities without headers.•••* email/email.scm (email->headers+body): If there are no headers, return "" as headers not an eof-object. (parse-email-body): Pass headers of parent entity or email to parse-mime-entity. (add-default-mime-entity-headers): New function. (parse-mime-entity): Use add-default-mime-entity-headers instead of add-default-headers. Handle MIME entities without headers. * tests/email.scm ("decode MIME entity without headers"): New test. Arun Isaac
2019-07-28email: Support email with mixed encoding of characters.•••Prior to this, parse-email would accept email in the form of a string. A string is constrained to use the same encoding for all its characters whereas an email can have characters encoded using different encoding schemes. Therefore, it is more correct that parse-email deals with bytevectors instead of strings. * email/utils.scm (read-bytes-till): New function. * email/email.scm (body->mime-entities, email->headers+body, decode-body): Deal with emails as bytevectors instead of strings. (parse-mime-entity): Rename text argument to bv. (parse-email, parse-email-body): Overload to handle input in the form of a string or bytevector. * doc/guile-email.texi (Parsing e-mail): Document overloading of parse-email and parse-email-body. * tests/email.scm ("handle truncated multipart message gracefully"): Deal in bytevectors instead of strings. ("email with 8 bit encoding and non UTF-8 charset", "multipart email with a 8 bit encoding and non UTF-8 charset part"): New tests. * tests/email-with-8bit-encoding-and-non-utf8-charset, tests/multipart-email-with-a-8bit-encoding-and-non-utf8-charset-part: New files. Reported-by: Jack Hill <jackhill@jackhill.us> Arun Isaac
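The reasoning in the entry above, that one message can mix character encodings and so has to be handled as bytes, can be shown with a small sketch. It is written in Python, not the project's Guile Scheme. The sample message and the helper `split_and_decode` are made up for illustration and do not reflect how `parse-email` is implemented.

```python
# A made-up message: the header is UTF-8, the body is ISO-8859-1.
RAW = (
    b"Subject: caf\xc3\xa9\r\n"
    b"Content-Type: text/plain; charset=ISO-8859-1\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)


def split_and_decode(message: bytes, body_charset: str):
    """Split on the blank line at the byte level, then decode each part
    separately. The caller supplies the body charset here; a real parser
    would read it from the Content-Type header."""
    headers, _, body = message.partition(b"\r\n\r\n")
    return headers.decode("utf-8"), body.decode(body_charset)


headers, body = split_and_decode(RAW, "iso-8859-1")
```

Decoding `RAW` as a single UTF-8 string raises `UnicodeDecodeError`, because the body byte `\xe9` is not valid UTF-8 in that position. Splitting first and decoding each part with its own charset succeeds.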
2019-07-26email: Match mime-entity-fields only against headers.•••* email/email.scm (parse-mime-entity): Match mime-entity-fields only against the headers, not the whole email. Arun Isaac
2019-07-26email: Import all of (email utils).•••* email/email.scm: Import all of (email utils), not a subset of the exported functions. Arun Isaac
2019-07-21email: Decode MIME encoded words in Subject header.•••Prior to this, MIME encoded words in the Subject header were not decoded. * email/email.scm (parse-email-headers): Decode MIME encoded words in Subject header. * tests/email.scm ("decode MIME encoded words in Subject header"): New test. Reported-by: Ricardo Wurmus <rekado@elephly.net> Arun Isaac
2019-06-25email: Fix typo in docstring of parse-mime-entity.•••* email/email.scm (parse-mime-entity): Replace "a" with "an" in docstring. Arun Isaac
2018-11-13email: Support emacs message mode parens style addresses.•••* email/email.scm (define-comment-pattern, define-cfws-pattern, define-dot-atom-pattern, define-domain-pattern, define-addr-spec-pattern): New macros. (captured-comment, captured-cfws, captured-dot-atom, captured-domain, captured-addr-spec): New patterns. (mailbox): Use captured-addr-spec instead of addr-spec. (post-process-mailbox): Handle emacs message mode parens style addresses. Arun Isaac
2018-11-13email: Discard angle brackets in address fields only.•••* email/email.scm (define-angle-addr): New macro. (unbracketed-angle-addr): New pattern. (name-addr): Use unbracketed-angle-addr instead of angle-addr. (post-process-mailbox): Do not trim angle brackets from address. That is now handled by the grammar itself. Arun Isaac
2018-11-13email: Deduplicate email address parsing.•••* email/email.scm (post-process-mailbox): New function. (parse-email-address): Call post-process-mailbox instead of reimplementing address parsing using regular expressions. (parse-email-headers): Call post-process-mailbox. Arun Isaac