by (910 points)

Hi,

we're having problems parsing a message that contains comments (or illegally indented lines, however you wish to see it) in the "Date:" header field:

Date: 20 Apr 2022 10:03:29 +0200
 To: abc@def.com
 From: <ghi@jkl.com>

Rebex Secure Mail claims there's an illegal character at position 29, whereas Outlook, Windows Mail and Thunderbird all parse the message "correctly", omitting the "To:" and "From:" headers as they can be interpreted as comments in CFWS syntax according to RFC 2822.

How can I make the parser ignore such (malformed) comments in header fields without subscribing to the UnparsableHeader event and doing everything myself?

Thanks,
Olaf

Applies to: Rebex Secure Mail

1 Answer

by (148k points)

Hi,

those are not comments. According to the RFC 2822 CFWS syntax, comments must be enclosed in parentheses, which is not the case here. A message with properly parenthesized comments would be parsed without issues:

Date: 20 Apr 2022 10:03:29 +0200
 (To: abc@def.com)
 (From: <ghi@jkl.com>)

Indenting a line actually makes it a continuation of the previous line - therefore, the indented 'To' and 'From' lines are actually part of the 'Date' header, so equivalent to this:

Date: 20 Apr 2022 10:03:29 +0200 To: abc@def.com From: <ghi@jkl.com>

Those strings are not comments, they are a part of the Date header value, and that's why our parser fails.
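To illustrate the unfolding rule described above, here is a minimal, library-independent sketch in Python. It is not how Rebex implements it; the function name `unfold_headers` is made up for this example, and collapsing each fold into a single space is a simplification of the RFC rule (which only removes the line break).

```python
def unfold_headers(raw: str) -> list[str]:
    """Join continuation lines (starting with space or tab) onto the previous header line."""
    headers: list[str] = []
    for line in raw.splitlines():
        if not line:
            break  # an empty line ends the header section
        if line[0] in " \t" and headers:
            # Continuation line: it belongs to the previous header field.
            # Simplification: the fold is collapsed into a single space.
            headers[-1] += " " + line.strip()
        else:
            headers.append(line)
    return headers


raw = (
    "Date: 20 Apr 2022 10:03:29 +0200\r\n"
    " To: abc@def.com\r\n"
    " From: <ghi@jkl.com>\r\n"
    "\r\n"
    "body\r\n"
)
unfolded = unfold_headers(raw)
print(unfolded)
```

The result is a single 'Date' field whose value contains the 'To' and 'From' text, which is what a strict date parser then rejects.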

Outlook, Windows Mail and Thunderbird are notably very benevolent about what they parse - for example, when I download a PNG image and save it as 'message.eml', Outlook does not complain at all and pretends it's a message with a body text of "Ê#d|eftxè‘oùÓyÓ! I¼" - we don't think emulating that in our mail library would be a good idea.

However, you can make our parser more benevolent by enabling the MailMessage.Settings.IgnoreUnparsableHeaders option - this will result in the parser ignoring the wrong 'Date' header (along with the embedded 'To' and 'From' strings).

Unlike Rebex Mail with IgnoreUnparsableHeaders, Outlook's 'Date' header parser actually discards the extra text and parses the beginning. We'll try making our 'Date' parser a bit more benevolent as well for the next release.
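The "parse the beginning, discard the rest" behaviour can be sketched roughly as follows. This is only an illustration in Python, not Outlook's or Rebex's actual logic; `parse_leading_date` is a hypothetical helper, and the pattern assumes a date without a day-of-week and with a numeric time zone, as in the example message.

```python
import re
from datetime import datetime
from typing import Optional

# Matches a leading "day month year hh:mm:ss zone" date such as
# "20 Apr 2022 10:03:29 +0200". Assumption: no day-of-week prefix and a
# numeric zone; dates found in real mail vary more than this.
_LEADING_DATE = re.compile(
    r"\s*(\d{1,2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2} [+-]\d{4})"
)


def parse_leading_date(value: str) -> Optional[datetime]:
    """Parse the date at the start of the value and ignore any trailing text."""
    match = _LEADING_DATE.match(value)
    if match is None:
        return None
    # %b relies on English month names, which is the default (C) locale.
    return datetime.strptime(match.group(1), "%d %b %Y %H:%M:%S %z")


value = "20 Apr 2022 10:03:29 +0200 To: abc@def.com From: <ghi@jkl.com>"
parsed = parse_leading_date(value)
print(parsed.isoformat())
```

Note that this silently drops the trailing 'To' and 'From' text, so anything embedded there is lost unless it is recovered separately.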

by (910 points)
I played around with IgnoreUnparsableHeaders a little, and came up with a solution for this specific case. However, the headers seem to be parsed on-demand or later, not when calling mail.Load. Is there a way to force the parsing of all headers? Or is it enough to access mail.Date for example?
by (148k points)
Try enabling MailMessage.Settings.ProcessAllHeaders to parse all headers. However, we have already made the Date parser more benevolent, and this will be part of the next release soon. If you would like to try a preview build, please let me know.
by (910 points)
When I set ProcessAllHeaders to true, the UnparsableHeader event is not called at all. Right now I just do "var date = mail.Date;" to force the parsing, but I was wondering if there is a better way. And it's not just about the date stuff, I also try to parse the indented lines from the example to see if they are valid headers and add them if they are.
by (148k points)
Oops, sorry! Apparently, the "Date" header (along with "Resent-Date" and "Delivery-Date") does not raise the UnparsableHeader event when ProcessAllHeaders is enabled, which seems to be an unwanted side-effect of a 13-year-old workaround that no one noticed until now. We will fix that as well for the next release. In the meantime, consider using the ParsingHeader event - this one is called for headers prior to parsing, making it possible to fix the raw data of "Date" (or other) headers.
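As a rough idea of what such a raw-data fix could look like for the example from the question, here is a language-neutral sketch in Python. It does not use the Rebex API or the ParsingHeader event arguments; `promote_indented_fields` and the `KNOWN_FIELDS` allow-list are assumptions made up for this example. Indented lines are only turned back into separate headers when they start with a well-known field name, so legitimately folded lines are left alone.

```python
import re

# Hypothetical allow-list: only these names are treated as wrongly indented headers.
KNOWN_FIELDS = {"to", "from", "cc", "bcc", "subject", "reply-to", "message-id"}

# An indented line of the form "<whitespace>Name: ..."
_INDENTED_FIELD = re.compile(r"[ \t]+([A-Za-z][A-Za-z0-9-]*):")


def promote_indented_fields(raw_headers: str) -> str:
    """Remove the indentation of lines that look like known header fields."""
    fixed = []
    for line in raw_headers.splitlines():
        match = _INDENTED_FIELD.match(line)
        if match and match.group(1).lower() in KNOWN_FIELDS:
            fixed.append(line.lstrip(" \t"))  # now a header field of its own
        else:
            fixed.append(line)  # regular header or genuine continuation line
    return "\r\n".join(fixed) + "\r\n"


raw = (
    "Date: 20 Apr 2022 10:03:29 +0200\r\n"
    " To: abc@def.com\r\n"
    " From: <ghi@jkl.com>\r\n"
)
fixed = promote_indented_fields(raw)
print(fixed)
```

A heuristic like this can misfire, for example when a folded 'Subject' line happens to start with "To:", so it is best applied only to messages from a source known to produce this kind of indentation.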
...