Extracting HTML & embedded images

0 votes
asked Jan 18, 2017 by ys (120 points)

Hi,

With this code:

var mail = new Rebex.Mail.MailMessage();
mail.Load(@"c:\temp\some-email-with-html-content.eml");
File.WriteAllText(@"c:\temp\some-email-with-html-content.htm", mail.BodyHtml);

I can extract the HTML, but the embedded base64-encoded images in the .eml file are not saved as embedded images in the .htm file (they remain as cid:... references).

Is there a way to do this?

Thanks!

1 Answer

+1 vote
answered Jan 18, 2017 by Lukas Matyska (44,570 points)

The embedded images are stored in the MailMessage.Resources collection.

To convert an HTML mail to an ordinary HTML page, you need to modify the HTML mail body by replacing each cid:ID reference with an appropriate string. You can either extract the embedded images to files and use the file names instead of cid:ID, or embed the image data directly into the HTML page like this:

foreach (var res in mail.Resources)
{
    if (res.ContentId == null || 
        res.MediaType == null || 
        !res.MediaType.StartsWith("image/", StringComparison.OrdinalIgnoreCase))
        continue;

    MemoryStream ms = new MemoryStream();
    using (var content = res.GetContentStream())
    {
        content.CopyTo(ms);
    }
    byte[] data = ms.ToArray();
    string cidString = string.Format("cid:{0}", res.ContentId.Id);
    string dataString = string.Format("data:{0};base64,{1}", res.MediaType, Convert.ToBase64String(data));

    // replace image link (cid:) with image data (data:)
    mail.BodyHtml = mail.BodyHtml.Replace(cidString, dataString);
}

Please note that this code only shows the general approach. To make it robust, you should handle letter case and spaces in the "cid:" string.
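
One possible way to handle letter case and spaces is to do the replacement with a regular expression instead of String.Replace. The following is only a minimal sketch under stated assumptions: the CidRewriter class and its Rewrite method are hypothetical helpers (not part of Rebex), and the pattern assumes that a content ID ends at the next quote, whitespace, '>' or ')'. It uses only standard .NET types.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Rebex API.
public static class CidRewriter
{
    // "cid:" in any letter case, preceded by a word boundary, followed by
    // optional whitespace and then the content ID (assumed to end at the
    // next quote, whitespace, '>' or ')').
    private static readonly Regex CidPattern = new Regex(
        @"\bcid:\s*([^""'\s>)]+)",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Replaces every cid: reference whose ID is a key in 'replacements'
    // with the mapped string (a data: URI or a file name).
    // References with unknown IDs are left unchanged.
    public static string Rewrite(string html, IDictionary<string, string> replacements)
    {
        return CidPattern.Replace(html, match =>
        {
            string id = match.Groups[1].Value;
            string target;
            return replacements.TryGetValue(id, out target) ? target : match.Value;
        });
    }
}
```

With the loop above, you could fill a Dictionary<string, string> with res.ContentId.Id as the key and dataString (or the name of a file the image was saved to) as the value, then assign mail.BodyHtml = CidRewriter.Rewrite(mail.BodyHtml, map) once after the loop. Whether content IDs should be compared case-insensitively is up to the caller: pass a dictionary created with StringComparer.OrdinalIgnoreCase if that is needed.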

commented Jun 27 by dsouzac (360 points)
Hi Lukas,

I have a question on this topic, so I am posting it as a comment to keep the context.

Is it that Outlook cannot interpret image data embedded in the email body this way? Outlook shows [screen clipping] instead of the image, whereas the images display correctly when the same email is opened in Thunderbird.

Could you please share what you know? I would like to know whether the embedded image needs extra handling to display properly in Outlook.

To reproduce, apply the example code above to an inbound email, then reply to it after moving the images from the resources into the mail body.
commented Jun 27 by Lukas Matyska (44,570 points)
Outlook has not supported base64-embedded images (data: URIs) since Outlook 2007.
For a list of mail clients supporting this format, see https://www.campaignmonitor.com/blog/email-marketing/2013/02/embedded-images-in-html-email/
commented Jun 28 by dsouzac (360 points)
Thanks, I got the answer.
...