Memory consumption when parsing mails with large attachments

+1 vote
asked Mar 14, 2014 by obartelt (820 points)
edited Mar 21, 2014

When parsing a mail that has large attachments (for example 100 MB), memory consumption is quite high, probably because you (need to) load all message data into memory. Is there a way to reduce the memory consumption by not keeping the attachment data in memory, while still being able to save an attachment to a file/stream, or to save all attachments to a specified directory while parsing, if necessary? I tried MimeOptions.DoNotPreloadAttachments, but that only seems to work with SMTP, if I understand correctly.

This would be really important to us, because we have customers sending attachments of several hundred MB in size (not a good idea, I know, but you can't tell them ;-)) and we are currently looking for a library that can handle sending, receiving, and parsing such large mails reliably.

Sincerely, Olaf Bartelt, CTO, AMTANGEE AG

Applies to: Rebex Secure Mail

6 Answers

+1 vote
answered Mar 14, 2014 by Tomas Knopp (58,580 points)
edited Mar 14, 2014

Hello Olaf,

with the current version of Rebex Secure Mail, it is unfortunately not possible to reduce memory consumption by keeping the attachment data out of memory and working directly with files on disk, because the component loads everything, including the attachments, into memory. Adding full support for this to Rebex Secure Mail would require quite a lot of work, and we are not sure it would eventually pay off.

On the other hand, it should be relatively easy to enhance Rebex Secure Mail to include temporary on-disk storage for large attachments. When parsing a mail, the large attachments would be stored in the temporary on-disk storage (instead of RAM), and the storage would be deleted after the MailMessage has been disposed.

Please let me know whether the suggested solution would suit your needs as well.
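The temporary on-disk storage idea above can be illustrated outside of Rebex. The following is a minimal Python sketch using only the standard library; it is not Rebex code, and the `spool_attachment` helper and the 1 MiB `SPOOL_LIMIT` threshold are assumptions made up for the example. `tempfile.SpooledTemporaryFile` keeps data in memory until it exceeds `max_size` and then moves it to a temporary file.

```python
import tempfile

# Hypothetical threshold: attachments larger than this leave RAM for a temp file.
SPOOL_LIMIT = 1024 * 1024  # 1 MiB


def spool_attachment(chunks, limit=SPOOL_LIMIT):
    """Collect decoded attachment chunks in a buffer that rolls over to disk."""
    buf = tempfile.SpooledTemporaryFile(max_size=limit)
    for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)  # rewind so the caller can read from the start
    return buf


# A small attachment stays in memory; a 2 MiB one is moved to a temp file.
small = spool_attachment([b"a" * 10])
large = spool_attachment(b"x" * 65536 for _ in range(32))

first_bytes = large.read(4)
large.seek(0, 2)           # jump to the end of the buffer
large_size = large.tell()  # total number of bytes written
```

Closing such a buffer removes the temporary file behind it, which roughly corresponds to deleting the storage when the message is disposed.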

0 votes
answered Mar 15, 2014 by obartelt (820 points)
edited Mar 15, 2014

That would suit our needs perfectly, in fact that's kind of the way we do it right now with the components we are currently using (parse the message and let it save the individual attachments to a specified directory, then import them into our database and delete them/the temp directory).

If you could implement that, that would be awesome! :-)

0 votes
answered Mar 17, 2014 by Tomas Knopp (58,580 points)
edited Mar 17, 2014

We thought about the actual implementation once more today, looking for a solution that would be feasible for us and at the same time cover your need to save the attachments to disk instead of keeping them in memory.

We have come to the conclusion that a simple MimeReader object, which would provide basic access to MIME emails (i.e. headers and MIME parts in a tree), would be the way to go. However, there is one drawback:

The simple MimeReader would only allow reading of simple MIME (i.e. there would be no support for MSG, TNEF (winmail.dat), signed EML with a non-detached signature, enveloped MIME, or messages with uuencoded attachments).

Please let me know whether this would still be acceptable to you. Thank you for your response!

0 votes
answered Mar 18, 2014 by obartelt (820 points)
edited Mar 18, 2014

No, that wouldn't help us, since it could very well be that the large mails are signed, etc.

Without knowing your code, I would expect it to be fairly easy to decode the attachments to disk if a path/setting is specified, and then store the temporary filename inside the attachment object, so that Attachment.Save etc. just copies the file, while methods/properties that deal directly with the attachment data throw a NotSupportedException or something like that. That should be easy to implement and work for everyone, shouldn't it?
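To make the proposal concrete, here is a rough Python sketch of such an attachment object. It is only an illustration under stated assumptions: `DiskBackedAttachment` and its methods are hypothetical names and not part of Rebex Secure Mail, and Python's `NotImplementedError` stands in for .NET's NotSupportedException.

```python
import os
import shutil
import tempfile


class DiskBackedAttachment:
    """Hypothetical attachment whose decoded content lives in a temp file."""

    def __init__(self, file_name, chunks, temp_dir=None):
        self.file_name = file_name
        # Decode straight to disk; only one chunk is held in memory at a time.
        fd, self.temp_path = tempfile.mkstemp(dir=temp_dir)
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                out.write(chunk)

    def save(self, target_path):
        """Saving is just a file copy from the temp location."""
        shutil.copyfile(self.temp_path, target_path)

    def get_content(self):
        """In-memory access is refused, standing in for NotSupportedException."""
        raise NotImplementedError("content is stored on disk; use save()")

    def dispose(self):
        """Remove the temp file once the attachment is no longer needed."""
        if os.path.exists(self.temp_path):
            os.remove(self.temp_path)


# Usage: decode to disk, copy to the final location, then clean up.
work_dir = tempfile.mkdtemp()
attachment = DiskBackedAttachment("report.bin", [b"abc", b"def"], temp_dir=work_dir)
target = os.path.join(work_dir, "saved-report.bin")
attachment.save(target)
with open(target, "rb") as saved:
    saved_bytes = saved.read()
attachment.dispose()
```

The sketch covers plain attachments only; it says nothing about how signed or encrypted content would be handled.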

0 votes
answered Mar 19, 2014 by Tomas Knopp (58,580 points)
edited Mar 19, 2014

The actual problem would be with S/MIME and enveloped MIME emails: the attachment data has to be processed first, before the result can be saved.

The processing is currently done in RAM. So without actually adding support for the 'on-disk storage' to the public SignedData and EnvelopedData classes, there would be no benefit, as the RAM would still get consumed by the processing of S/MIME emails.

On the other hand, implementing the 'on-disk storage' in SignedData and EnvelopedData is, simply put, too complicated a task for us. So we have finally decided not to implement it, as the demand for this feature is actually pretty low and we think it would not pay off. Sorry!

0 votes
answered Mar 19, 2014 by obartelt (820 points)
edited Mar 21, 2014

No problem. We can still run larger "simple" mails through the old engine for pre-processing, strip the attachments, and then run them through your parser. That should be fairly easy to implement. Thanks anyway!
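The data flow of that workaround might look roughly like the Python sketch below. It uses the standard `email` package as a stand-in for the "old engine", and `strip_attachments` is a hypothetical helper name. Note the assumptions: only top-level, non-multipart attachments are handled, and Python's parser itself holds the whole message in memory, so this shows the strip-then-reparse steps rather than any memory savings.

```python
import os
import tempfile
from email import message_from_bytes, policy
from email.message import EmailMessage


def strip_attachments(raw, out_dir):
    """Save top-level attachments to out_dir and return the message without them."""
    msg = message_from_bytes(raw, policy=policy.default)
    saved = []
    if msg.is_multipart():
        kept = []
        for part in msg.get_payload():
            name = part.get_filename()
            is_file = part.get_content_disposition() == "attachment" and name
            if is_file and not part.is_multipart():
                # basename() guards against directory parts in the file name.
                path = os.path.join(out_dir, os.path.basename(name))
                with open(path, "wb") as out:
                    out.write(part.get_payload(decode=True))
                saved.append(path)
            else:
                kept.append(part)
        msg.set_payload(kept)  # keep only the non-attachment parts
    return msg.as_bytes(), saved


# Build a small test message with one binary attachment.
original = EmailMessage()
original["Subject"] = "demo"
original.set_content("body text")
original.add_attachment(b"\x00" * 1000, maintype="application",
                        subtype="octet-stream", filename="big.bin")

out_dir = tempfile.mkdtemp()
stripped, saved_paths = strip_attachments(original.as_bytes(), out_dir)
```

The `stripped` bytes are what would then be handed to the second parser, while the files listed in `saved_paths` are imported separately.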

commented Mar 20, 2014 by Tomas Knopp (58,580 points)
edited Mar 20, 2014

Thank you for your response. Do I understand correctly that no action is currently needed on the Rebex side in this case?

commented Mar 21, 2014 by obartelt (820 points)
edited Mar 21, 2014

Yes, that's correct.