Yet Another Email Client (LINQ to IMAP)

Alexander Wieser

Feb 17, 2011

Ms-PL

Equinox is an SMTP/IMAP client running on the .NET Framework and Mono.

Download latest source from CodePlex

Preamble

It's been four months since I initially wrote this article in order to spread the word and excite the masses. Well, it worked, at least partially. Four months ago, this was a prototype: it worked with limitations, but it was far from reliable. I decided to update this article to reflect recent changes and reignite the fire (cough). For those who are reading this for the first time, this is not a 'how do I create a funky email library' kinda article, but more of a 'show and tell' thing. There is just way too much stuff, and I wouldn't even know where to start explaining things, but ... the code is open, you can always have looksies, and if you have questions, I will happily try to answer them, because I'm a nice guy ;) Instead of writing a manual, I will focus on the unique parts, mainly the IMAP client and its embedded LINQ engine, and show you how it can be used to do all that stuff other libraries can't.

What is it?

The Equinox project is a messaging library targeting .NET 4.0 and Mono 2.8. The library contains full implementations of the IMAP, SMTP, and, yes, finally the POP3 protocols. In addition, the library offers a unique and fully integrated LINQ provider for the IMAP protocol. The project is written in C#, fully managed, and licensed under Ms-PL.

What's new?

This is an update, so obviously something has changed. Very true...

First of all, there were bugs, and although I'm pretty confident we haven't caught them all yet, a great many, over 100, have been fixed.
In addition, the library received POP3 support. Like its counterparts IMAP and SMTP, POP3 supports TLS/SSL, several SASL authentication mechanisms, download progress events ... all the bells and whistles.
We added support for not just receiving, but also sending embedded content.
We implemented the IDLE command for the IMAP client, because pushing mails is better than not ;)
We increased the robustness of the body structure parser, yeah there is one, and it now copes with even the strangest and weirdest server responses.
We optimized the LINQ query generation to avoid unnecessary round trips with the server, making it even faster.
And perhaps the most important part: we almost completed the documentation for the project on the CodePlex site.

The list goes on, but instead of boring you with a long laundry list, we've set up a page displaying all capabilities and, to be fair, even listed items not supported by the library. You can find the list inside the Content & Capabilities document.

Why is it?

The reason I created this messaging library is the sad fact that no Open Source lib out there lived up to my expectations. Reality struck hard. I tried several libs and most of them performed fine for about 70% to 80% of all mails, but in the end, all of them failed in certain aspects.

Some had problems with encodings or didn't even encode at all, some performed poorly in parsing more complex MIME constructs, many only supported a small subset of the protocol's capabilities, some just seemed wrong, and not a single one had a working parser for the body structure. The feature that was neglected most of the time was an adequate implementation of the fetch command. The fetch command is the most important and complex part of the IMAP protocol because it is the only command which produces dynamic responses depending on the query and the server implementation, which is probably the reason why most implementations only offered a simple fetch-headers or fetch-all solution.

At first glance, this seems to be enough, but if you take a closer look, you will see that there's more going on. Truth is, all email clients can display and handle mails, be it Thunderbird, Outlook, Lotus Notes, Apple Mail, K9 Mail, or whatever other client you prefer, although I'm not so sure about Lotus Notes. It does make a difference, however, whether someone can process 20 mails in 2 minutes, because a smart client only fetches what needs to be fetched, displays what is important to you and omits the rest, or in 2 hours, because the client has to fetch the 20 MB through a tethered UMTS device. This seems trivial, but believe me, it adds up, and it only gets worse the less bandwidth you have at your disposal.

I bring this up since you may have noticed that the desktop market is shrinking while the mobile computer market is growing rapidly. On a desktop machine, we are usually connected through a broadband internet connection where it hardly matters if we download 10 KB, 100 KB, or even 1 MB of data, but there are situations where it does matter. Of course, this library is not intended to run on mobile phones; it could with some sacrifices, but that is a different story. To be clear, it may run on tablets or pads using Linux, OS X, or Windows. On these mobile computers, we don't always have the luxury of high-speed broadband connections or flat-rate data plans where we can say ... what 5 MB attachment?

Of course, every library I inspected did give the user control to send custom, so-called "raw commands" to the server. Essentially, you can create this surgically efficient query by hand and send it to the server. Unfortunately, the response you'll get will also be very "raw"; it will mostly be a MIME-encoded part of the message, and this is where the trouble begins.

The following command is the LIST command, which lists all mailboxes contained in a given mailbox, obviously.

Send("LIST #news.comp.mail.misc \"\"");

It's a one-liner. I don't need a library to do this for me. I need a lib for the complex stuff: parsing MIME structures, identifying body parts, encoding and decoding for transfer. What's the point of having a library when it comes with a "do it yourself" policy on most or all of those parts? Although some of the libraries I tested were really well designed and easy to use, I always ended up inside a forest full of Regular Expressions and string comparisons, doing it myself when it came to fetching something other than everything. With all this in mind, I'm going to share this project and I will try to explain how it differs from most libraries out there.
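To illustrate what "doing it yourself" looks like, here is a minimal sketch of hand-parsing a single untagged LIST response line as defined in RFC 3501. The names MailboxEntry and RawListParser are made up for this illustration and are not part of Equinox. The sketch only handles quoted or atom mailbox names and does not unescape quoted strings or handle literals.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed LIST response line.
public sealed class MailboxEntry
{
    public string[] Flags;     // name attributes, e.g. \Noselect
    public string Delimiter;   // hierarchy delimiter, null when the server sends NIL
    public string Name;        // mailbox name
}

public static class RawListParser
{
    // Matches lines such as: * LIST (\Noselect) "/" ~/Mail/foo
    private static readonly Regex ListLine = new Regex(
        @"^\* LIST \((?<flags>[^)]*)\) (?<delim>""[^""]*""|NIL) (?<name>.+)$");

    // Returns null when the line is not a LIST response this sketch understands.
    public static MailboxEntry Parse(string line)
    {
        var match = ListLine.Match(line);
        if (!match.Success)
        {
            return null;
        }

        var delimiter = match.Groups["delim"].Value;
        return new MailboxEntry
        {
            Flags = match.Groups["flags"].Value
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
            Delimiter = delimiter == "NIL" ? null : delimiter.Trim('"'),
            Name = match.Groups["name"].Value.Trim('"')
        };
    }
}
```

Even this simplest of responses already needs a regular expression and a handful of special cases, and a FETCH response is far less regular than this.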

Structure of the library

The library is split into five main assemblies:

Crystalbyte.Equinox.Core
Crystalbyte.Equinox.Mime
Crystalbyte.Equinox.Imap
Crystalbyte.Equinox.Smtp
Crystalbyte.Equinox.Pop3

This way, users can choose the assemblies they need and omit the rest. This isn't about size, since all assemblies together take up only about 220 KB; dropping the IMAP and POP3 assemblies if you only want to send mails brings it down to about 90 KB. Separation forces you to properly design your application. The MIME assembly has no dependencies on the others and can therefore be used as a standalone MIME parser. The Core assembly holds classes shared among the three client assemblies to remove redundancy.

Using the SMTP client

Compared to the IMAP client, the SMTP client is a trivial class. There is really only a single method of interest, which is Send(...), obviously. The Core assembly contains all model implementations including the Message class. Once created, we pass an instance into the Send(...) method and we're done. It's really not that exciting.

using Crystalbyte.Equinox.Security;
using Crystalbyte.Equinox.Smtp;

var message = new Message();
// fill the message object

using(var client = new SmtpClient()) {
    // Connect
    // Login
    client.Send(message);
}

Using the POP3 client

Like the SMTP protocol, the POP3 protocol is rather trivial. We've successfully implemented and tested the following commands: LIST, RSET, TOP, QUIT, DELE, RETR, STAT, USER, PASS, NOOP, UIDL. I hope you won't be mad, but I'm going to skip a detailed POP3 presentation because the entire protocol is completely static and all methods are explained in great detail all over the web. The following code will however show that the usage is very similar to all other clients. The POP3 client utilizes the same classes as the SMTP and IMAP clients do, which makes input and output 100% compatible with the other clients. In fact, since all clients share these many types and the protocol is rather simple, it took a little under five hours to implement the Pop3Client from scratch.

using Crystalbyte.Equinox.Security;
using Crystalbyte.Equinox.Pop3;

using(var client = new Pop3Client()) {
    // Connect
    // Login
    {
        // get headers and the first 20 lines for the first message.
        var response = client.Top(1, 20);
    }
    {
        // fetch the entire fourth message
        var response = client.Retr(4);
    }
}
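To show what "completely static" means in practice, here is a minimal sketch that parses the reply to the STAT command, which per RFC 1939 always has the form "+OK nn mm" (message count, then maildrop size in octets). Pop3StatParser is a hypothetical name for this illustration and not the library's code.

```csharp
using System;
using System.Globalization;

public static class Pop3StatParser
{
    // Parses a STAT reply such as "+OK 2 320".
    // Returns false for "-ERR ..." replies or anything malformed.
    public static bool TryParse(string line, out int messageCount, out long sizeInOctets)
    {
        messageCount = 0;
        sizeInOctets = 0;
        if (line == null)
        {
            return false;
        }

        // Trim removes a trailing CRLF if the caller did not strip it.
        var parts = line.Trim().Split(' ');
        return parts.Length >= 3
            && parts[0] == "+OK"
            && int.TryParse(parts[1], NumberStyles.None,
                   CultureInfo.InvariantCulture, out messageCount)
            && long.TryParse(parts[2], NumberStyles.None,
                   CultureInfo.InvariantCulture, out sizeInOctets);
    }
}
```

The format never varies with the request, which is why a handful of lines suffices here, quite unlike the IMAP fetch responses discussed in the next section.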

Using the IMAP client

Unlike the SMTP and POP3 clients, the IMAP client is complex and huge. Since the IMAP protocol has been around since 1986 and little has changed, I'm not going to bore you with trivial minutiae. Needless to say, all basic commands are implemented, but I won't go into details here. Again, there are plenty of pages, even articles on CodeProject, covering the basics, which are similar to this library, if not identical. I know ... I know I'm skipping a lot, but do you really want me to tell you that calling Delete("Foo") actually deletes a folder with the name "Foo"? C'mon. Instead, I'm going to focus on things unique to this library, which cover the reasons I implemented this library in the first place.

Dynamic Query Generation using LINQ to IMAP

Although the client has regular Search() and Fetch() methods, I would not recommend those for anything but the simplest requests. I previously promised to address some of the issues I criticized earlier; one of these was the inflexible or incomplete implementation of the fetch command.

To address this, I implemented a LINQ provider that enables us to fetch messages or parts of messages directly from the server. This comes with two perks ...

First, no matter how complex the query is or how many items we request, it will all be done in a single stroke using one fetch command. This saves net traffic and multiple round trips; especially on slower connections, this saves time. Most clients will require 4-6 requests for each message to get all the relevant data prior to fetching the message, we only need one ... always!

The more important part however is the fact that we don't have to parse or map any response manually anymore since this will be taken care of by the LINQ provider.

Let's take a quick look at a simple example. The following code will fetch items from all unread messages of the last week. The items we are going to fetch are:

Envelope
Uid
Flags
Size

var query = client.Messages.Where(x => x.Date > DateTime.Today.AddDays(-7) 
    && !x.Flags.HasFlag(MessageFlags.Read)).Select(x => new MyContainer
{
    Envelope = x.Envelope,
    Uid = x.Uid,
    Flags = x.Flags,
    Size = x.Size
});

If we need to change the scenario, we just change the query. We can fetch less or more without having to change or write a parser.

var query = client.Messages.Where( ... ).Select(x => x.Envelope);

var query = client.Messages.Where( ... ).Select(x => new SomeClass
    {    
        Subject = x.Subject,
        Uid = x.Uid,    
        Flags = x.Flags,
        Size = x.Size,
        Internal = x.InternalDate,
        BodyStructure = x.BodyStructure
    });

We can then resolve the query by iterating through the results.

foreach(var container in query) {
    Debug.WriteLine(container.Envelope.Subject);
}

As with LINQ to SQL, where we don't have to worry about parsing the data that comes out of SQL Server, we just map the responses into our object. Without LINQ involved, we would need to either create a different parser for each of those scenarios, or create a single parser able to handle different but still only a finite number of responses, and once we changed the query, we would also be forced to change the parser.
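For orientation, this is roughly what the first query above amounts to at the protocol level: a search for unseen messages of the last week, followed by one FETCH that names all requested items. The sketch below only assembles RFC 3501 command text. The class ImapCommandSketch is hypothetical, and whether the provider uses SENTSINCE or SINCE, or issues exactly these commands, is an assumption on my part and not taken from the library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ImapCommandSketch
{
    // IMAP dates look like 10-Feb-2011 and always use English month names,
    // so the invariant culture is used regardless of the machine's locale.
    public static string FormatDate(DateTime date)
    {
        return date.ToString("d-MMM-yyyy", CultureInfo.InvariantCulture);
    }

    // Search keys placed next to each other are combined with AND.
    public static string BuildSearch(DateTime since)
    {
        return "UID SEARCH UNSEEN SENTSINCE " + FormatDate(since);
    }

    // One FETCH can name any number of data items inside the parentheses.
    public static string BuildFetch(IEnumerable<long> uids, params string[] items)
    {
        return "UID FETCH " + string.Join(",", uids)
            + " (" + string.Join(" ", items) + ")";
    }
}
```

Calling BuildFetch(new long[] { 187, 188 }, "ENVELOPE", "UID", "FLAGS", "RFC822.SIZE") yields a single command covering all four items for both messages. The hard part is not building this string but mapping the server's answer back into objects.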

As with many LINQ providers, there are limitations and restrictions since we have to work within the boundaries of the IMAP protocol. Multiple or nested Where/Select statements are not permitted; to be more precise, we need exactly one Where and one Select clause. With a few exceptions, none of the other extension methods like Any(), Single(), or SelectMany() are supported.
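A provider can enforce such a rule by walking the expression tree before translating it. The following is a minimal sketch using the standard ExpressionVisitor class from System.Linq.Expressions. ClauseCounter is a hypothetical name, this is not how Equinox does it internally, and for simplicity the sketch rejects every other operator, whereas the library allows a few exceptions.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public sealed class ClauseCounter : ExpressionVisitor
{
    public int WhereCount { get; private set; }
    public int SelectCount { get; private set; }
    public int OtherCount { get; private set; }

    // Called for every method call in the tree; only Queryable operators count.
    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType == typeof(Queryable))
        {
            if (node.Method.Name == "Where") WhereCount++;
            else if (node.Method.Name == "Select") SelectCount++;
            else OtherCount++;
        }
        // Keep walking so that chained and nested calls are seen as well.
        return base.VisitMethodCall(node);
    }

    // True when the query has exactly one Where, one Select, and nothing else.
    public static bool IsSupported(IQueryable query)
    {
        var counter = new ClauseCounter();
        counter.Visit(query.Expression);
        return counter.WhereCount == 1
            && counter.SelectCount == 1
            && counter.OtherCount == 0;
    }
}
```

With any IQueryable source, for instance new[] { 1, 2, 3 }.AsQueryable(), a query of the form source.Where(...).Select(...) passes the check, while a second Where or a missing Select fails it.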

Fetching Old School

Although I mentioned several times that simply fetching everything may not always work in the best interest of the user, it is still possible.

var messageByUid = client.FetchMessageByUid(187);
var messageBySequenceNumber = client.FetchMessageBySequenceNumber(10);

Fetching New School

So, if all is bad, what is good? Well, I endorse the 'fetch what needs to be fetched and omit the rest' principle. If the user does not want to see HTML content, just download the plain text; why have both? If Peter does not want to see grandma's holiday pictures, which come as 30 professionally uncompressed files of 1 MB each, don't download them. It's simply a matter of user experience. IMAP exposes methods that enable you to do just that; they weren't included in the original drafts, but today with IMAP4rev1, they are supported by almost all common servers. The item I'm talking about is the BODYSTRUCTURE data item of the FETCH command. By fetching the body structure of a message, we receive a structural object model of all entities contained within the message.

var structure = client.Messages.Where(...).Select(x => x.BodyStructure).ToList().First();

The body structure contains info objects for all types of interest, these are attachments, views, and nested messages. Using these info objects, we can display the structure of the message to the user, since all important data about the content is available, i.e., file names, types, sizes, and so on... Once the user has selected an item to open, save, or whatever, we can then fetch this item individually by calling the appropriate method on the client and passing the info object as argument. We can even filter the info collections and perform queries only on very specific items, for instance, load only images and leave PDFs on the server, just as an example ...

var bodyStructure = ...

// fetching only the html view
var htmlViewInfo = bodyStructure.Views.First(x => x.MediaType == "text/html");
var htmlView = client.FetchView(htmlViewInfo);

// fetching the third attachment
var thirdAttachmentInfo = bodyStructure.Attachments.ElementAt(2);
var thirdAttachment = client.FetchAttachment(thirdAttachmentInfo);

// fetching only images
var imageInfos = bodyStructure.Attachments.Where(x => x.MediaType.StartsWith("image"));
var images = imageInfos.Select(client.FetchAttachment).ToList();

// fetching only images with a size smaller than 100k (encoded)
var smallImageInfos = bodyStructure.Attachments.Where(x => 
    x.MediaType.StartsWith("image") && x.SizeEncoded.MegaBytes < 0.1);
var smallImages = smallImageInfos.Select(client.FetchAttachment).ToList();

Similar methods are available for nested messages and views. You see, it's not that hard if the library supports it ;)

If we take a look on the inside of one of these fetch methods, we will see that they utilize the integrated LINQ provider. In fact, every fetch operation implemented in the IMAP client uses the LINQ provider. Apart from the LINQ parser itself, I didn't have to write a single parser anywhere else in this library.

public Message FetchMessageBySequenceNumber(int sn)
{
    var query = Messages
        .Where(x => x.SequenceNumber == sn)
        .Select(x => new MessageContainer
                         {
                             Uid = x.Uid,
                             SequenceNumber = x.SequenceNumber,
                             Text = (string) x.Parts(string.Empty)
                         });

    var container = query.ToList().FirstOrDefault();
    if (container == null) {
        return null;
    }
    
    var entity = new Entity();
    entity.Deserialize(container.Text);

    var message = entity.ToMessage();
    message.Uid = container.Uid;
    message.SequenceNumber = container.SequenceNumber;
    return message;
}

As you can see, there is no black magic involved. The LINQ provider can be used to fetch literally any part of the message, and the key is the x.Parts() method. The method takes the structural MIME identifier as argument. The first nested entity has the identifier "1", the second "2". If the first entity has two children of its own, we can access them with the IDs "1.1" and "1.2", and so on; it's trivial. Above, we pass in string.Empty, which essentially means gimme' all and that's it.
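The numbering scheme is easy to reproduce. Here is a minimal sketch that walks a tree of entities and assigns each one its identifier. PartNode and PartNumbering are hypothetical types made up for this illustration, not Equinox classes, and the sketch covers plain multipart nesting only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a MIME entity with nested children.
public sealed class PartNode
{
    public readonly string MediaType;
    public readonly List<PartNode> Children = new List<PartNode>();

    public PartNode(string mediaType, params PartNode[] children)
    {
        MediaType = mediaType;
        Children.AddRange(children);
    }
}

public static class PartNumbering
{
    // Lists every entity below the root as "id: media type",
    // e.g. "1.2: text/html". The root itself has the empty identifier.
    public static List<string> Enumerate(PartNode root)
    {
        var result = new List<string>();
        Walk(root, string.Empty, result);
        return result;
    }

    private static void Walk(PartNode parent, string prefix, List<string> result)
    {
        for (var i = 0; i < parent.Children.Count; i++)
        {
            // Children are numbered from 1 and appended to the parent's id.
            var id = prefix.Length == 0
                ? (i + 1).ToString()
                : prefix + "." + (i + 1);
            result.Add(id + ": " + parent.Children[i].MediaType);
            Walk(parent.Children[i], id, result);
        }
    }
}
```

For a multipart/mixed message containing a multipart/alternative entity (plain text plus HTML) and one JPEG attachment, this produces "1: multipart/alternative", "1.1: text/plain", "1.2: text/html", and "2: image/jpeg".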

Granular Searching

Very special, but rarely used for filtering, are the search keys TEXT, KEYWORD, HEADER, and FROM/TO/BCC/CC. You probably won't use them, but since they are defined within the IMAP protocol, they too have been implemented. These keys can be used to create even more granular queries.

Keywords

Keywords are similar to flags, they can be applied to messages using the STORE command. Using the Keywords property, we can search for messages with special keywords attached. The difference between regular flags and keywords is the fact that the user has the opportunity to tag messages with arbitrary values if the server allows it. The following code shows how to access messages tagged with the keyword “MyTag”.

var query = client.Messages.Where(x => x.Keywords.Contains("MyTag")).Select(x => x.Envelope);

Headers

The Headers property can be used to query for specific headers and their values. The following query will return all messages that have a header with the name “Priority” and a corresponding value of “high”.

var query = client.Messages.Where(x => x.Headers.Any(y => 
  y.Name.Contains("Priority") && y.Value.Contains("high"))).Select(x => x.Envelope);

From, To, Bcc, Cc

All four collection properties can be used to perform a full text search on the message's contact lists. The following query will return all messages that have been sent from “Peter” to “Mary”.

var query = client.Messages.Where(x => 
   x.From.Contains("Peter") && x.To.Contains("Mary")).Select(x => x.Envelope);

Text

Using the text property, we can perform a full text search on the message’s body. The following query will return all messages that contain the string “blue whale” somewhere in their content.

var query = client.Messages.Where(x => x.Text.Contains("blue whale")).Select(x => x.Envelope);
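For reference, the four kinds of queries above correspond to standard SEARCH keys from RFC 3501. The sketch below builds the search criteria text for each of them. SearchKeySketch is a hypothetical helper, the exact text the LINQ provider generates is an assumption on my part, and the sketch only handles ASCII values that fit into a quoted string.

```csharp
using System;

public static class SearchKeySketch
{
    // Wraps a value in an IMAP quoted string, escaping backslash and quote.
    public static string Quote(string value)
    {
        return "\"" + value.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
    }

    // Keywords are atoms, so they are sent without quotes.
    public static string Keyword(string flag) { return "KEYWORD " + flag; }

    public static string Header(string name, string value)
    {
        return "HEADER " + Quote(name) + " " + Quote(value);
    }

    public static string From(string text) { return "FROM " + Quote(text); }
    public static string To(string text) { return "TO " + Quote(text); }
    public static string Text(string text) { return "TEXT " + Quote(text); }

    // Search keys written next to each other are combined with AND.
    public static string And(params string[] keys)
    {
        return string.Join(" ", keys);
    }
}
```

For example, And(From("Peter"), To("Mary")) gives FROM "Peter" TO "Mary", and Text("blue whale") gives TEXT "blue whale". All of these keys match substrings on the server, which fits the Contains() calls in the queries.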

Conclusion

Well, this is it for now. I hope you can spare some time and leave a comment. Finally, I wanted to thank all who helped us by pointing out bugs and suggesting improvements.