True Asynchronous I/O

28 Dec 2015, Public Domain, 5 min read
A lightweight, high performance, easy to use asynchronous stream copy method using IOCP with progress, throughput tracking, and no explicit thread creation.

This code was created in Visual Studio 2015 because it's what I have, but it should work with all versions of .NET going back to 2.0, and even 1.1 with a minor change or two. You will, however, have to create a C# console application project yourself and add the source files.

[Figure: wget sample output]

Introduction

How much I/O do you do in your applications?
How much thought have you put into optimizing it?

I operate on a different wavelength when I'm writing in C++ than when I'm writing in C#. In C++, the code already requires quite a bit of forethought, so I spend more time considering optimization. Managed code, on the other hand, is primarily appealing because it's easy and maintainable, and optimization tends to fall by the wayside. Even so, file and network I/O is one area where performance is critical whatever tools you use. I/O operations are slow, and there's not a whole lot that tight code can do to squeeze more throughput out of an Internet connection, or even a hard drive. That doesn't mean there's nothing we can do to make our applications more responsive.

As a rule, if you can't make a long running task faster, make it so the application can do something else while it's performing that task. That is, make it asynchronous. This is what we'll be doing here, and we'll be doing it in a way that is both highly efficient and in the end, relatively easy to use.

Background

When most developers think of performing long-running background tasks, they think of threading. Threads are wonderful, but they're not free. In fact, once you have more threads than cores, all you're doing is switching rapidly between the background and foreground code. You're still blocking at the driver level, in the hard disk driver or TCP/IP stack for example; you're just blocking on a separate thread. Worse, the more threads you have, the more stress you put on the operating system's task scheduler. Creating threads when you don't need them hurts scalability.

Now what if I told you we can do true asynchronous operations rather than simply running synchronous operations on a background thread? By truly asynchronous, I mean down to the driver, and often the hardware, level: the device itself performs the read in the background and raises a hardware interrupt (an IRQ) when the read completes, instead of our code waiting on a separate thread for stream.Read(...) to return and then signaling a synchronization object. Does that sound more efficient to you? It is.

Enter I/O completion ports, or IOCP: a powerful way of doing asynchronous I/O operations on a Windows machine. IOCP provides true asynchronous I/O with no thread tied up per operation, for maximum efficiency and scalability. The only time an extra thread is used is when the driver, via the OS, dispatches the completion signal back to the application, and that thread is pre-allocated from a small pool and automatically recycled. Furthermore, each of those threads handles not just one but potentially a massive number of operations, so there's effectively no measurable thread overhead, whereas a traditional multithreaded application might allocate as much as one thread per operation.
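If you want to see that pool for yourself, here is a small sketch. It only uses the standard System.Threading.ThreadPool API, which reports ordinary worker threads and I/O completion threads separately. The class name is just an illustration, and the numbers you get depend on your machine and runtime.

```csharp
using System;
using System.Threading;

public static class CompletionPortThreadsDemo
{
    public static void Main()
    {
        int workerMax, iocpMax;
        int workerFree, iocpFree;

        // The runtime's thread pool tracks ordinary worker threads and
        // I/O completion threads as two separate counts.
        ThreadPool.GetMaxThreads(out workerMax, out iocpMax);
        ThreadPool.GetAvailableThreads(out workerFree, out iocpFree);

        Console.WriteLine("Worker threads: {0} of {1} available", workerFree, workerMax);
        Console.WriteLine("I/O completion threads: {0} of {1} available", iocpFree, iocpMax);
    }
}
```

Nothing here creates a thread. The completion threads belong to the runtime and are shared by every pending asynchronous operation in the process.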

I'm not sure how many of you are aware that .NET leverages IOCP behind the BeginRead and BeginWrite operations of System.IO.Stream implementations that support overlapped I/O, such as NetworkStream and a FileStream opened for asynchronous access, but now you are. Indeed, I've seen projects on here that use P/Invoke to call the Windows IOCP API directly, which is unfortunate, since it requires the application to be trusted, and it adds more marshalling overhead than you're likely to find in the .NET runtime itself. It also makes the code difficult to maintain. In this application we'll be using BeginRead and BeginWrite to use IOCP.
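To make the BeginRead pattern concrete, here is a minimal sketch of a single asynchronous read. It assumes a scratch file created just for the demo, and the class name is made up. The one detail worth noticing is FileOptions.Asynchronous: a FileStream has to be opened that way to get overlapped I/O, otherwise BeginRead is only emulated.

```csharp
using System;
using System.IO;
using System.Threading;

public static class BeginReadDemo
{
    public static void Main()
    {
        // Hypothetical scratch file, created only so the demo has something to read.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[64 * 1024]);

        byte[] buffer = new byte[4096];
        ManualResetEvent done = new ManualResetEvent(false);

        // FileOptions.Asynchronous requests overlapped I/O for this handle.
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, buffer.Length, FileOptions.Asynchronous))
        {
            fs.BeginRead(buffer, 0, buffer.Length, delegate(IAsyncResult ar)
            {
                // Runs when the read completes; EndRead returns the byte count.
                int read = fs.EndRead(ar);
                Console.WriteLine("Read {0} bytes, IsAsync = {1}", read, fs.IsAsync);
                done.Set();
            }, null);

            // The calling thread is free to do other work here.
            done.WaitOne();
        }

        File.Delete(path);
    }
}
```

The wait at the end is only there so the demo doesn't dispose the stream before the callback runs. A real application would carry on with other work instead.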

Using the code

The code is pretty straightforward to use. The entire thing centers around one Stream extension method: CopyToWithProgress, which returns a StreamCopyProgress object you can use to monitor and control the copy while it completes.

Perhaps the simplest way to use it is like this:

C#
...
using System;
using System.IO;
using System.Threading;
using BadKitty.IO;
...
Stream source;
Stream destination;
...
// assume source and destination have both been created somewhere above and point to their respective I/O.

using (StreamCopyProgress progress = source.CopyToWithProgress(destination, true))
{
    // do some work here - the operation is already taking place.

    // if you want to cancel it, just call progress.Cancel();

    // wait for finish - we don't have to wait; depending on the scenario we can
    // use a callback or even poll the IsFinished property in a loop. We're just
    // waiting here because we can.
    progress.Wait();

    // example of polling
    // while(!progress.IsFinished)
    // {
    //     Thread.Sleep(1000); // sleep for 1 second
    //     Console.Write('.');
    // }
    // Console.WriteLine();

    // see the included source code for an example of using the callback feature.

    Console.WriteLine("Transferred " + (progress.Transferred / 1024.0).ToString() + "kb in " +
        progress.ElapsedSeconds + " seconds @ " + (progress.TransferRate / 1024.0).ToString() + "kbps");
}
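
If you're curious how a copy like this can run without a dedicated thread, the following is a rough sketch of the general idea: each read callback starts a write, and each write callback starts the next read. To be clear, ApmCopy and everything in it are hypothetical names for illustration, not the CopyToWithProgress implementation from the download. The demo uses MemoryStream only so it runs anywhere; in-memory streams don't involve IOCP.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper, not the article's CopyToWithProgress implementation.
public sealed class ApmCopy
{
    private readonly Stream _source;
    private readonly Stream _destination;
    private readonly byte[] _buffer = new byte[81920];
    private readonly ManualResetEvent _done = new ManualResetEvent(false);
    private long _transferred;
    private Exception _error;

    public ApmCopy(Stream source, Stream destination)
    {
        _source = source;
        _destination = destination;
    }

    public long Transferred { get { return Interlocked.Read(ref _transferred); } }

    // Kicks off the first read and returns immediately.
    public void Start()
    {
        try { _source.BeginRead(_buffer, 0, _buffer.Length, OnRead, null); }
        catch (Exception ex) { Finish(ex); }
    }

    // Blocks until the copy ends, rethrowing any failure from a callback.
    public void Wait()
    {
        _done.WaitOne();
        if (_error != null) throw new IOException("Copy failed.", _error);
    }

    private void OnRead(IAsyncResult ar)
    {
        try
        {
            int read = _source.EndRead(ar);
            if (read == 0) { Finish(null); return; } // end of stream
            Interlocked.Add(ref _transferred, read);
            _destination.BeginWrite(_buffer, 0, read, OnWrite, null);
        }
        catch (Exception ex) { Finish(ex); }
    }

    private void OnWrite(IAsyncResult ar)
    {
        try
        {
            _destination.EndWrite(ar);
            _source.BeginRead(_buffer, 0, _buffer.Length, OnRead, null);
        }
        catch (Exception ex) { Finish(ex); }
    }

    private void Finish(Exception error)
    {
        _error = error;
        _done.Set();
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] data = new byte[200000];
        new Random(1).NextBytes(data);
        using (MemoryStream source = new MemoryStream(data))
        using (MemoryStream destination = new MemoryStream())
        {
            ApmCopy copy = new ApmCopy(source, destination);
            copy.Start();
            copy.Wait();
            Console.WriteLine("Copied {0} bytes", copy.Transferred);
        }
    }
}
```

One caveat with this sketch: if a stream completes its operations synchronously, the callbacks can nest on the same stack, so a production version would want to check IAsyncResult.CompletedSynchronously and loop rather than chain. It also has no cancellation and no throughput tracking.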

Points of Interest

Astute readers will notice that much of the StreamCopyProgress class consists of private fields that are initialized only once and then accessed through property get accessors. They are handled that way because this object gets passed around through more than one synchronization context and may be called from several different threads. To avoid performance-sapping locks, writes to this object are kept to the bare minimum, and those writes are performed with the Interlocked functions. Interlocked writes are atomic, so they are thread-safe and a concurrent reader never sees a half-written value. This is also why the progress state is stored as an integer rather than an enumeration: the Interlocked functions work directly on intrinsic numeric types, which makes them easier and safer to use than enumerations.
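
Here is a small sketch of that lock-free approach, assuming a made-up CopyState class with made-up state constants (it is not the actual StreamCopyProgress source). The state lives in a plain int so that Interlocked.CompareExchange can flip it exactly once, and the byte counter is updated with Interlocked.Add. The demo spins up a few threads purely to create contention.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a progress object; all names are illustrative only.
public sealed class CopyState
{
    // Plain int constants instead of an enum, so Interlocked can work on the field directly.
    public const int Running = 0;
    public const int Finished = 1;
    public const int Canceled = 2;

    private int _state = Running;
    private long _transferred;

    // CompareExchange with identical values is an atomic read of the current state.
    public int State { get { return Interlocked.CompareExchange(ref _state, 0, 0); } }

    // Interlocked.Read keeps 64-bit reads atomic even on 32-bit platforms.
    public long Transferred { get { return Interlocked.Read(ref _transferred); } }

    public void AddTransferred(int count) { Interlocked.Add(ref _transferred, count); }

    // Moves Running -> newState exactly once; returns false if another thread got there first.
    public bool TryComplete(int newState)
    {
        return Interlocked.CompareExchange(ref _state, newState, Running) == Running;
    }
}

public static class Program
{
    public static void Main()
    {
        CopyState state = new CopyState();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.Length; i++)
        {
            writers[i] = new Thread(delegate()
            {
                for (int n = 0; n < 100000; n++) state.AddTransferred(1);
                state.TryComplete(CopyState.Finished);
            });
            writers[i].Start();
        }
        foreach (Thread t in writers) t.Join();

        Console.WriteLine("Transferred = {0}, State = {1}", state.Transferred, state.State);
    }
}
```

Four writers each add 100000, so the counter ends at 400000 with no lock anywhere, and only the first TryComplete call wins.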

 

And that's it! Join me next time when we combine the concepts here with asynchronous WebRequest/WebResponse operations. This is part of a series that will culminate in the creation of an ultrafast segmented downloader with some fancypants GUI controls as well.

Why Public Domain?

There is no license for this or any of my submissions. They are public domain. I'm one of those ridiculous commies who isn't into providing intellectual creations on "license". Do what you will with this. Enjoy it. Make your life better. Make someone else's life better. Do nothing with it. Or overthrow a small country with it and install your own puppet government. Whatever floats your boat. In all seriousness, I simply want to see people learn, grow and become better at their craft. If this code will help one person do that, then it was worth the time I took to write it. I don't care about credit, or abstract ideas like intellectual property, although if you ever felt the need to donate money to me (or better yet, to the Lions of Rojava initiative) I wouldn't complain. =)

License

This article, along with any associated source code and files, is licensed under a Public Domain dedication.


Written By
Architect, Bad Kitty Software
United States
I'm a long-time systems developer and architect. I've worked for companies from Microsoft to Plum Creek to Alcoa, in consulting, development and architecture roles.

I was an early participant in .NET development at Microsoft, and was on the Visual Studio Everett and Windows Whistler development teams.

I am primarily a C++ developer, but I do a lot of .NET C# development as well.

I have an interest in compiler tool construction that began a couple of decades ago, and have worked on and off in domain specific aspect oriented language construction with mutable grammars.

I gave it up for years, on account of being an anarchist commie and not really liking being one of the petit bourgeoisie, among other reasons. I'm still trying to find a balance between my political beliefs and my minimalist lifestyle, and the software industry. I've been writing code since about 1986 when I was little.

Comments and Discussions

 
Question: BadKitty.IO
knoami, 28-Dec-15 22:11
Answer: Re: BadKitty.IO
danah gaz, 29-Dec-15 12:27
The BadKitty.IO namespace is exported by the classes in the StreamCopyToWithProgress.cs file that I included with the source.



modified 29-Dec-15 19:11pm.

