* email/email.scm (handle-invalid-headers): New function.
(parse-email-headers): Handle invalid headers.
* tests/email.scm ("Assume application/octet-stream Content-Type if
Content-Transfer-Encoding is unrecognized"): New test.
|
|
Using an empty bytevector no longer throws an exception since Guile
commit 5ea8c69e9153a970952bf6f0b32c4fad6a28e839.
* email/email.scm (post-process-content-transfer-encoding): Use a
bytevector of unit length to test the charset validity.
Signed-off-by: Arun Isaac <arunisaac@systemreboot.net>
|
|
* email/email.scm (obs-phrase): Replace word with cfws-captured-word.
* tests/email.scm ("Parse names with more than two words"): New test.
|
|
* email/email.scm (define-cfws-pattern): Indent better.
|
|
* email/email.scm (id-left, id-right): Give higher precedence to
obsolete patterns.
|
|
* email/email.scm (obs-phrase-list, obs-utext, obs-unstruct,
obs-optional): New macros.
(unstructured, in-reply-to, references, keywords, optional-field):
Include obsolete patterns.
|
|
* email/email.scm (received): Include obsolete pattern.
(parse-mime-entity): Post process obsolete received forms.
|
|
* email/email.scm (define-atom-pattern): Do not capture cfws unless
specified.
(atom): Do not specify cfws.
(define-dot-atom-pattern): Do not capture cfws.
(define-word-pattern): New macro.
(cfws-captured-atom, cfws-captured-word): New patterns.
(obs-phrase): Use cfws-captured-word.
(received-token): Capture all.
(parse-mime-entity): Post process received and received-token.
* tests/email.scm ("parse email headers"): Fix test.
|
|
* email/email.scm (obs-day-of-week, obs-day, obs-year, obs-hour,
obs-minute, obs-second, obs-zone): New macros.
(day-of-week, day, year, hours, minutes, seconds, zone): Include
obsolete pattern.
(parse-email-headers): Handle obsolete two- and three-digit years, and
alphabetic time zone specifiers.
* tests/email.scm ("RFC5322 A.6.2. Obsolete dates"): New test.
|
|
* email/email.scm (obs-qp, obs-fws, obs-no-ws-ctl, obs-ctext,
obs-qtext, obs-phrase, obs-local-part, obs-dtext, obs-domain,
obs-domain-list, obs-route, obs-angle-addr, captured-atom,
captured-obs-domain, captured-domain, obs-mbox-list, obs-group-list,
obs-addr-list, obs-id-left, obs-id-right): New patterns.
(quoted-pair, fws, ctext, qtext, phrase, dtext,
define-angle-addr-pattern, mailbox-list, group-list, address-list,
define-field-pattern, from, sender, bcc, id-left, id-right,
resent-from, resent-sender, resent-bcc, obs-resent-rply): Include
obsolete pattern.
(define-printable-ascii-character-pattern-with-obsolete,
define-atom-pattern, define-obs-domain-pattern): New macros.
(define-domain-pattern): Accept obs-domain as a new argument.
(fields): Include obs-resent-rply.
* tests/email.scm ("RFC5322 A.6.1. Obsolete addressing"): New test.
("parse email addresses with period in name"): Mark as passing.
|
|
* tests/base64.scm ("base64 random bytevector: base64-encode and
base64-decode are inverses of each other", "base64 random
bytevector: encoded output should not be more than 76 columns wide",
"base64 random bytevector: encoded output must only consist of
characters from the base64 alphabet"): Test inputs of different lengths.
* tests/quoted-printable.scm ("quoted-printable random bytevector:
quoted-printable-encode and quoted-printable-decode are inverses of
each other", "quoted-printable random bytevector: encoded output
should not be more than 76 columns wide", "quoted-printable random
bytevector: encoded output must only consist of printable ASCII
characters", "q-encoding random bytevector: q-encoding-encode and
q-encoding-decode are inverses of each other"): Test inputs of
different lengths.
|
|
The new base64 decoder can directly operate on bytevectors in addition
to strings. This feature may not remain forever, but it greatly
improves performance. So, it stays for now.
* email/email.scm (decode-body): Decode base64 encoded body directly
without converting to an intermediate string.
|
|
The new base64 decoder skips invalid characters safely.
* email/email.scm (decode-body): Do not filter base64 encoded body to
remove invalid base64 characters.
|
|
* email/base64.scm: Replace file.
|
|
* email/utils.scm (bytevector-match, bytevector-overlap,
lookahead-bytevector-n): New functions.
(read-bytes-till): Do not match sequence byte by byte. Process blocks
of bytes at a time.
|
|
* email/utils.scm (not-end-let): New macro.
* .dir-locals.el (scheme-mode): Indent not-end-let correctly.
|
|
* email/utils.scm (read-while, read-bytes-till): Do not return eof if
matched at beginning. Return empty string or bytevector respectively.
* tests/utils.scm ("read-bytes-till returns empty bytevector on match
at beginning", "read-while returns empty string on match at
beginning"): New tests.
|
|
* email/base64.scm: Import (rnrs arithmetic bitwise), (rnrs arithmetic
fixnums), (rnrs base), (rnrs bytevectors) and (rnrs io ports), not all
of (rnrs).
|
|
* email/email.scm (post-process-fields): Treat blank Subject headers
as having the null string as value.
* tests/email.scm ("blank Subject header must be treated as having the
null string as value"): New test.
Reported-by: Ricardo Wurmus <rekado@elephly.net>
|
|
* email/email.scm (parse-email-headers): Return keywords header as a
list of strings.
* tests/email.scm ("keywords header must be a list"): New test.
|
|
* email/email.scm (body->mime-entities, email->headers+body): Reindent
calls to call-with-port.
* email/quoted-printable.scm (quoted-printable-encode,
q-encoding-encode): Reindent calls to call-with-port.
* tests/utils.scm ("read-bytes-till returns eof-object on end of
file"): Reindent call to call-with-port.
|
|
* email/email.scm (post-process-content-type): Use alist-combine to
override charset more strongly than just appending to the alist.
* tests/email.scm ("tolerate invalid charset"): Update test.
|
|
* email/utils.scm (alist-combine): New function.
(alist-delete*): Delete function.
* email/email.scm (add-default-headers,
add-default-mime-entity-headers): Use alist-combine.
|
|
* email/email.scm (post-process-fields): New function.
(parse-mime-entity, decode-body): Invoke post-process-fields.
|
|
* email/email.scm (decode-body): Tolerate decoding errors in the body
using the substitute conversion strategy.
* tests/email.scm ("tolerate decoding errors in body"): New test.
|
|
* email/email.scm (post-process-content-type): If charset is invalid,
assume default UTF-8 as charset.
* tests/email.scm ("tolerate invalid charset"): New test.
Reported-by: Ricardo Wurmus <rekado@elephly.net>
|
|
* email/email.scm (decode-mime-encoded-word): Tolerate decoding errors
in MIME encoded words using the substitute conversion strategy.
* tests/email.scm ("tolerate decoding errors in MIME encoded words"):
New test.
Reported-by: Christopher Baines <mail@cbaines.net>
|
|
* email/email.scm (unbracketed-angle-addr): Delete duplicate
definition.
|
|
The earlier docstring was one meant for read-next-email-in-mbox.
* email/email.scm (mbox->emails): Update docstring.
|
|
* email/email.scm (read-next-email-in-mbox): Add docstring.
|
|
* email/email.scm (email->headers+body): If non-ASCII non-UTF-8
characters occur in the headers, do not raise a decoding error. Work
around them using the substitute conversion strategy.
* tests/email.scm ("tolerate non-ASCII characters in headers"): Rename
to "decode utf-8 characters in headers".
("tolerate non-ascii non-utf-8 characters in headers"): New test.
Reported-by: Christopher Baines <mail@cbaines.net>
|
|
We tolerate non-ASCII characters in headers in order to support Emacs
message mode parens style addresses.
* email/email.scm (email->headers+body): Read headers as UTF-8
characters.
* tests/email.scm ("tolerate non-ascii characters in headers"): New
tests.
Reported-by: Christopher Baines <mail@cbaines.net>
|
|
* doc/guile-email.texi (Reading Email): New chapter.
* email/email.scm (mbox->emails): Add docstring.
|
|
* email/utils.scm (read-while): Clarify docstring.
|
|
* email/email.scm (post-process-content-type): Mention that RFC6657
specifies UTF-8 as the default charset only for text/* media types.
|
|
* email/email.scm (read-next-email-in-mbox): Read bytes from mboxes,
not characters.
|
|
* email/utils.scm (read-bytes-till): Return eof-object, not #vu8(), on
end of file.
* tests/utils.scm: New file.
* Makefile.am (SCM_TESTS): Register it.
|
|
* email/email.scm (email->headers+body): If there are no headers,
return "" as headers, not an eof-object.
(parse-email-body): Pass headers of parent entity or email to
parse-mime-entity.
(add-default-mime-entity-headers): New function.
(parse-mime-entity): Use add-default-mime-entity-headers instead of
add-default-headers. Handle MIME entities without headers.
* tests/email.scm ("decode MIME entity without headers"): New test.
|
|
Prior to this, parse-email would accept email in the form of a
string. A string is constrained to use the same encoding for all its
characters whereas an email can have characters encoded using
different encoding schemes. Therefore, it is more correct that
parse-email deals with bytevectors instead of strings.
* email/utils.scm (read-bytes-till): New function.
* email/email.scm (body->mime-entities, email->headers+body,
decode-body): Deal with emails as bytevectors instead of strings.
(parse-mime-entity): Rename text argument to bv.
(parse-email, parse-email-body): Overload to handle input in the form
of a string or bytevector.
* doc/guile-email.texi (Parsing e-mail): Document overloading of
parse-email and parse-email-body.
* tests/email.scm ("handle truncated multipart message gracefully"):
Deal in bytevectors instead of strings.
("email with 8 bit encoding and non UTF-8 charset", "multipart email
with a 8 bit encoding and non UTF-8 charset part"): New tests.
* tests/email-with-8bit-encoding-and-non-utf8-charset,
tests/multipart-email-with-a-8bit-encoding-and-non-utf8-charset-part:
New files.
Reported-by: Jack Hill <jackhill@jackhill.us>
|
|
* email/email.scm (parse-mime-entity): Match mime-entity-fields only
against the headers, not the whole email.
|
|
* email/email.scm: Import all of (email utils), not a subset of the
exported functions.
|
|
Prior to this, MIME encoded words in the Subject header were not
decoded.
* email/email.scm (parse-email-headers): Decode MIME encoded words in
Subject header.
* tests/email.scm ("decode MIME encoded words in Subject header"): New
test.
Reported-by: Ricardo Wurmus <rekado@elephly.net>
|
|
* email/email.scm (parse-mime-entity): Replace "a" with "an" in
docstring.
|
|
* email/email.scm (define-comment-pattern, define-cfws-pattern,
define-dot-atom-pattern, define-domain-pattern,
define-addr-spec-pattern): New macros.
(captured-comment, captured-cfws, captured-dot-atom, captured-domain,
captured-addr-spec): New patterns.
(mailbox): Use captured-addr-spec instead of addr-spec.
(post-process-mailbox): Handle Emacs message mode parens style addresses.
|
|
* email/email.scm (define-angle-addr): New macro.
(unbracketed-angle-addr): New pattern.
(name-addr): Use unbracketed-angle-addr instead of angle-addr.
(post-process-mailbox): Do not trim angle brackets from address. That
is now handled by the grammar itself.
|
|
* email/email.scm (post-process-mailbox): New function.
(parse-email-address): Call post-process-mailbox instead of
reimplementing address parsing using regular expressions.
(parse-email-headers): Call post-process-mailbox.
|
|
* email/email.scm (parse-email-address): Fix typo in examples in
parse-email-address docstring. The returned value must be an
association list of pairs, not of lists.
|
|
* email/utils.scm (read-while)[read-while-loop]: Use else, instead
of #t, for the default cond clause.
|
|
* email/email.scm (angle-addr): Capture "<" and ">".
(parse-email-headers): Do not discard trace fields. Trim "<" and ">"
from angle-addr in mailbox, but not from trace fields.
|
|
* email/email.scm (body->mime-entities)[read-mime-entity]: Check for
eof-object so that truncated messages are handled gracefully without
raising an error.
|