How to upload large email attachments to a CRM system

We recently came across a situation where an Outlook plugin we had built for a SaaS customer started reporting ‘System.OutOfMemoryException’ errors during its synchronisation routine. The plugin was uploading emails and attachments to the SaaS CRM system via an API, which isn’t an unusual scenario for the systems we build, so we were slightly surprised. On closer examination, the errors were only occurring for large files (which in our case meant over 100 MB).

Typically, when we need to upload files, we base64 encode them and then upload them using HTTP requests. There are other methods, but many servers expect base64 encoded data when uploading files. Figure 1 shows simple code for encoding a file to a base64 string.

Figure 1: Encode file to base64 string
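
A minimal sketch of this kind of one-shot encoding might look like the following. The class and method names are made up for this example. Note that both the raw bytes and the resulting string live in memory at the same time, and the string is larger than the file: base64 text is about a third longer than its input, and .NET strings use two bytes per character.

```csharp
using System;
using System.IO;

public static class Base64Simple
{
    // Hypothetical helper: loads the whole file into a byte[] and then
    // builds the complete base64 string in memory as well.
    public static string EncodeFileToBase64(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return Convert.ToBase64String(bytes);
    }
}
```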

However, with large files this approach gave us the ‘System.OutOfMemoryException’ error, so we needed another solution.

Chunking the Data with a MemoryStream

Initially we tried the method shown below (Figure 2) that reads data chunk by chunk from the large file and writes that data to a MemoryStream.

Figure 2: Encode file to base64 string and save it to MemoryStream
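
As a rough sketch of the chunked approach, assuming a chunk size of our own choosing and made-up class and method names, the code could look like this. The chunk size is a multiple of 3 so that only the final chunk can produce ‘=’ padding; otherwise the concatenated output would not be valid base64.

```csharp
using System;
using System.IO;
using System.Text;

public static class Base64ToMemory
{
    // Assumed chunk size; a multiple of 3 so padding only appears at the very end.
    private const int ChunkSize = 3 * 16 * 1024;

    // Hypothetical helper: encodes the file chunk by chunk into a MemoryStream.
    public static MemoryStream EncodeFileToMemoryStream(string path)
    {
        var output = new MemoryStream();
        var buffer = new byte[ChunkSize];
        using (var input = File.OpenRead(path))
        {
            int read;
            while ((read = FillBuffer(input, buffer)) > 0)
            {
                string base64 = Convert.ToBase64String(buffer, 0, read);
                byte[] encoded = Encoding.ASCII.GetBytes(base64);
                output.Write(encoded, 0, encoded.Length);
            }
        }
        output.Position = 0; // rewind so the caller can read from the start
        return output;
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends.
    private static int FillBuffer(Stream input, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = input.Read(buffer, total, buffer.Length - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

The source file is now read in small pieces, but the encoded output still accumulates in memory.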

Up to a certain file size this method worked well, but when the file was larger than 200 MB we got the same error again. It turns out that MemoryStream stores its data in an internal byte[] buffer. The buffer starts at an initial capacity, and when that capacity is exhausted it has to be resized. Instead of just grabbing a bit more memory, MemoryStream allocates a new buffer twice the size of the previous one and then copies the data from the old buffer to the new one. For example, if the buffer is 200 MB long and more data needs to be written, .NET has to find another contiguous 400 MB block of memory. Until the copy is finished both buffers are alive, so the total memory required is at least 600 MB.
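
The growth behaviour can be observed through the MemoryStream.Capacity property. The sketch below uses a made-up helper that writes fixed-size chunks and records the capacity each time it changes. The exact growth policy is an implementation detail and may differ between runtime versions, so treat the numbers you see as indicative only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MemoryStreamGrowth
{
    // Hypothetical helper: writes `chunks` blocks of `chunkSize` bytes and
    // returns every distinct capacity the stream's internal buffer reached.
    public static List<int> ObserveCapacities(int chunkSize, int chunks)
    {
        var capacities = new List<int>();
        var chunk = new byte[chunkSize];
        using (var stream = new MemoryStream())
        {
            int lastCapacity = stream.Capacity;
            for (int i = 0; i < chunks; i++)
            {
                stream.Write(chunk, 0, chunk.Length);
                if (stream.Capacity != lastCapacity)
                {
                    // The buffer was reallocated and the old contents copied across.
                    lastCapacity = stream.Capacity;
                    capacities.Add(lastCapacity);
                }
            }
        }
        return capacities;
    }
}
```

Calling `MemoryStreamGrowth.ObserveCapacities(64 * 1024, 64)` and printing the result shows how few, and how large, the reallocation steps are as the stream grows.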

Back to the drawing board!

Using a FileStream instead of MemoryStream

We then looked at the FileStream class in the System.IO namespace, which handles opening, reading from, writing to and closing files. While MemoryStream writes data to memory, FileStream writes data to a file. (I guess the clue is in the name.) Because the encoded data goes to disk rather than into a growing in-memory buffer, only the current chunk is held in memory at any time, so using FileStream avoids the out-of-memory exceptions.

Figure 3: Encode file to base64 string and save it to file using FileStream
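
A minimal sketch of the FileStream variant, again with made-up names and an assumed chunk size, might look like this. As before, the chunk size is a multiple of 3 so that padding only appears at the end of the output.

```csharp
using System;
using System.IO;
using System.Text;

public static class Base64ToFile
{
    // Assumed chunk size; a multiple of 3 so padding only appears at the very end.
    private const int ChunkSize = 3 * 16 * 1024;

    // Hypothetical helper: encodes sourcePath chunk by chunk and writes the
    // base64 text to targetPath, holding only one chunk in memory at a time.
    public static void EncodeFileToBase64File(string sourcePath, string targetPath)
    {
        var buffer = new byte[ChunkSize];
        using (var input = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (var output = new FileStream(targetPath, FileMode.Create, FileAccess.Write))
        {
            while (true)
            {
                // Read may return fewer bytes than requested, so fill the
                // buffer completely unless the end of the file is reached.
                int filled = 0;
                int n;
                while (filled < buffer.Length &&
                       (n = input.Read(buffer, filled, buffer.Length - filled)) > 0)
                {
                    filled += n;
                }
                if (filled == 0) break;

                string base64 = Convert.ToBase64String(buffer, 0, filled);
                byte[] encoded = Encoding.ASCII.GetBytes(base64);
                output.Write(encoded, 0, encoded.Length);
            }
        }
    }
}
```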

Uploading base64 encoded data to an HTTP server

The base64 encoded data from the original file is now saved in a separate file. It can then be read from that file and uploaded to the server using an HTTP POST request (Figure 4).

This code reads the data in small chunks and writes each chunk to the request stream, so the whole encoded file is never held in memory at once.

Figure 4: Read base64 encoded data from a file and upload to HTTP server
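
A sketch of such an upload using HttpWebRequest could look like the following. The class and method names, the URL passed in and the content type are assumptions for illustration; the real values depend on what the target API expects. Write buffering is switched off and the content length is set up front, since otherwise the request body may itself be buffered in memory before it is sent.

```csharp
using System;
using System.IO;
using System.Net;

public static class Base64Uploader
{
    // Copies source to target in fixed-size chunks so that only one chunk
    // is held in memory at a time.
    public static void CopyInChunks(Stream source, Stream target, int chunkSize = 64 * 1024)
    {
        var buffer = new byte[chunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            target.Write(buffer, 0, read);
        }
    }

    // Hypothetical helper: POSTs the already-encoded file to the given URL.
    public static HttpStatusCode Upload(string base64FilePath, string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "text/plain"; // assumption: depends on the server
        request.ContentLength = new FileInfo(base64FilePath).Length;
        request.AllowWriteStreamBuffering = false; // stream the body, do not buffer it

        using (var file = File.OpenRead(base64FilePath))
        using (var body = request.GetRequestStream())
        {
            CopyInChunks(file, body);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return response.StatusCode;
        }
    }
}
```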

Summary

In this article, a large file is encoded to a base64 string chunk by chunk and saved to a file using FileStream; the encoded file is then read chunk by chunk and written to the HTTP request stream. Because only one chunk is in memory at any time, this is a good method for uploading very large files as base64 encoded data to a server.