
Buffer.BlockCopy Not As Fast As You Think

7 Aug 2017, CPOL, 3 min read

I recently wrote a quick test to prove a hypothesis, and the hypothesis turned out to be wrong.

Of course, this is the way of science. You form a hypothesis, then you test. The results should prove or disprove your theory.

The Hypothesis

OK, my theory was that Buffer.BlockCopy() would outperform our current Array.Copy routines at the page level.

When we load records off a page, we have to extract parts of the page buffer to build up a row buffer. A typical 8 KB page may have 100 or more records packed into it, and pulling them out can take quite a bit of time. Each row's data is a compressed byte array that may also include information pointing to extended row data.

This is one of the differences between writing this app in C++ and in C#. In C++, you would just take a pointer into the original buffer and pass it to the row routines; no copy of the data is needed. Safe C# code has no raw pointers, so we make a copy to extract just the part of the buffer that the row needs.
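The copy described above can be sketched as follows. This is only an illustration: the class name `RowExtractor`, the method name `ExtractRow`, and its parameters are hypothetical and are not taken from the actual engine.

```csharp
using System;

static class RowExtractor
{
    // Hypothetical helper: copies one row's bytes out of a page buffer
    // into a new array that the row routines can own.
    public static byte[] ExtractRow(byte[] page, int rowOffset, int rowLength)
    {
        byte[] row = new byte[rowLength];

        // Array.Copy validates the range and throws if
        // rowOffset + rowLength runs past the end of the page.
        Array.Copy(page, rowOffset, row, 0, rowLength);
        return row;
    }
}
```

Every row read from the page pays for one allocation and one copy like this, which is why the speed of the copy routine matters.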

Because we want the fastest possible way to read this block of data into the row, we looked at alternative ways to do it.

The Experiment 

I wrote a program that copies randomly sized records from one 8 KB page buffer into another, with record sizes based on a page that holds 150 records. The record length and the read and write offsets are random, but the seed is reset to the same value for both tests so that both methods see the same sequence of random numbers. The results were surprising.

C#
using System;
using System.Diagnostics;

class Program
{
    public static int pagesize = 8192;
    public static int MaxLoops = 100000;

    static void Main( string[] args )
    {
        Stopwatch watch = new Stopwatch();

        // Force a GC Collect so we start clean
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        System.Threading.Thread.Sleep(2000);
        watch.Start();
        DoTest(true);
        watch.Stop();
        Console.WriteLine("Buffer.BlockCopy took {0} ticks", watch.ElapsedTicks);

        watch.Reset();
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        System.Threading.Thread.Sleep(2000);

        watch.Start();
        DoTest(false);
        watch.Stop();
        Console.WriteLine("Array.Copy took {0} ticks", watch.ElapsedTicks);

        Console.WriteLine("Press enter to exit");
        Console.ReadLine();
    }

    static void DoTest( bool BlockCopy )
    {
        byte[] src = new byte[pagesize];
        byte[] des = new byte[pagesize];

        // Always start with same random number, we want repeatable tests...
        Random rand = new Random(4);

        int totalmoved = 0;
        int maxRecord = pagesize / 150;

        for (int i = 0; i < MaxLoops; i++)
        {
            // Get the size we are going to copy (pretend each record is
            // some fraction of the page size)
            int recordsize = rand.Next(maxRecord);

            int readlocation = rand.Next(pagesize - recordsize);

            // Pick a write offset anywhere up to the point where the
            // copy would step off the end of the page
            int writelocation = rand.Next(pagesize - recordsize);

            if (BlockCopy)
            {
                PartialBufferBlockCopy(ref src, readlocation, 
                    ref des, writelocation, recordsize);
            }
            else
            {
                PartialBufferArrayCopy(ref src, readlocation, 
                    ref des, writelocation, recordsize);
            }

            totalmoved += recordsize;
        }

        Console.WriteLine("Average block moved: {0} ", totalmoved / MaxLoops);
    }

    static void PartialBufferBlockCopy( ref byte[] Source, int SourceStart, 
        ref byte[] Destination, int DestStart, int NumberBytes )
    {
        Buffer.BlockCopy(Source, SourceStart, 
            Destination, DestStart, 
            NumberBytes);
    }

    static void PartialBufferArrayCopy( ref byte[] Source, int SourceStart, 
        ref byte[] Destination, int DestStart, int NumberBytes )
    {
        Array.Copy(Source, SourceStart, 
            Destination, DestStart, 
            NumberBytes);
    }

}

The Results

On my Core 2 laptop, the results are as follows. The test ran as a .NET 4 command-line app built with Visual Studio 2010 RC. I ran the same test under .NET 3.5 SP1 and got similar results.

  • Buffer.BlockCopy took 12706 ticks
  • Array.Copy took 6897 ticks
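The counts above come from Stopwatch.ElapsedTicks, whose unit is 1/Stopwatch.Frequency seconds and therefore varies by machine. The sketch below shows one way to convert such counts to milliseconds and to compute the ratio between the two runs. The class and method names (`TickMath`, `TicksToMilliseconds`, `Speedup`) are made up for illustration.

```csharp
using System;
using System.Diagnostics;

static class TickMath
{
    // Converts a Stopwatch.ElapsedTicks value to milliseconds.
    // Stopwatch.Frequency is the number of stopwatch ticks per second.
    public static double TicksToMilliseconds(long stopwatchTicks)
    {
        return stopwatchTicks * 1000.0 / Stopwatch.Frequency;
    }

    // Ratio of the slower run to the faster run.
    public static double Speedup(long slowerTicks, long fasterTicks)
    {
        return (double)slowerTicks / fasterTicks;
    }
}
```

With the numbers reported here, `TickMath.Speedup(12706, 6897)` comes out at roughly 1.84.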

Array.Copy Was 1.8 Times Faster

That load represents a real-world scenario for us in the database engine. I did find that Array.Copy can perform really poorly when you are not copying from one byte[] to another byte[]. This is probably because the elements are being boxed and unboxed. Array.Copy can convert between element types, but doing so is expensive.
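To make the difference concrete, here is a small sketch of how the two calls treat arrays of different element types. Array.Copy converts element by element (here a widening int to long conversion), while Buffer.BlockCopy moves raw bytes and takes its offsets and count in bytes. The class and method names are illustrative only, and this does not measure the cost of either path.

```csharp
using System;

static class CopySemantics
{
    // Element-wise copy: each int is widened to a long.
    public static long[] WidenWithArrayCopy(int[] source)
    {
        long[] result = new long[source.Length];
        Array.Copy(source, result, source.Length);
        return result;
    }

    // Raw copy: the ints' in-memory bytes are moved unchanged.
    // Offsets and count are byte counts, not element counts.
    public static byte[] RawBytesWithBlockCopy(int[] source)
    {
        int byteCount = Buffer.ByteLength(source);
        byte[] result = new byte[byteCount];
        Buffer.BlockCopy(source, 0, result, 0, byteCount);
        return result;
    }
}
```

For three ints, the first method returns three longs with the same values, while the second returns twelve bytes whose order depends on the machine's endianness.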

Array.Copy is Very Good

Your scenario may be different. Code it up and test it. In this case, Array.Copy() on byte[] arrays worked really well.
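If you do test your own scenario, a harness along these lines may help. It is a sketch, not the program used for the numbers in this article: the `CopyBenchmark` class, the untimed warm-up pass (so that JIT compilation is not counted), and the delegate-based design are all assumptions added here. The delegate call adds the same small overhead to both methods.

```csharp
using System;
using System.Diagnostics;

static class CopyBenchmark
{
    const int PageSize = 8192;

    // Times `iterations` calls of `copy` and returns Stopwatch ticks.
    public static long Measure(Action<byte[], int, byte[], int, int> copy, int iterations)
    {
        byte[] src = new byte[PageSize];
        byte[] dst = new byte[PageSize];
        int maxRecord = PageSize / 150;

        // Untimed warm-up so JIT compilation happens before timing starts.
        Run(copy, src, dst, maxRecord, 1000);

        Stopwatch watch = Stopwatch.StartNew();
        Run(copy, src, dst, maxRecord, iterations);
        watch.Stop();
        return watch.ElapsedTicks;
    }

    static void Run(Action<byte[], int, byte[], int, int> copy,
        byte[] src, byte[] dst, int maxRecord, int iterations)
    {
        // Fixed seed so every run copies the same sequence of blocks.
        Random rand = new Random(4);
        for (int i = 0; i < iterations; i++)
        {
            int size = rand.Next(maxRecord);
            int readOffset = rand.Next(PageSize - size);
            int writeOffset = rand.Next(PageSize - size);
            copy(src, readOffset, dst, writeOffset, size);
        }
    }
}
```

A call such as `CopyBenchmark.Measure(Buffer.BlockCopy, 100000)` followed by `CopyBenchmark.Measure(Array.Copy, 100000)` gives two tick counts to compare. Running each several times, in both orders, and in a release build outside the debugger should make the comparison more trustworthy.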

What about Unsafe Code?

I got an email asking what would happen if we went to unsafe code and just used a pointer to get at the row data. I have tested some parts of the code with unsafe blocks, but to really test this scenario, I would have to rip apart a lot of code and change the way the row data is loaded. I have not done that because the test would probably take a day or two, and we have no serious intention of building an unsafe version. It would require forking our own engine and maintaining two implementations, which is not something I want to do with the current engine.
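For readers curious what the pointer approach looks like, here is a minimal sketch. It is not code from the engine, and I have not benchmarked it. The `UnsafeRowReader` class and `SumRowBytes` method are hypothetical, and the code must be compiled with unsafe code enabled (the /unsafe compiler option).

```csharp
using System;

static class UnsafeRowReader
{
    // Hypothetical example: reads one row's bytes in place through a
    // pointer into the page buffer, without copying them out first.
    public static unsafe int SumRowBytes(byte[] page, int rowOffset, int rowLength)
    {
        if (page == null) throw new ArgumentNullException("page");
        if (rowOffset < 0 || rowLength < 0 || rowOffset > page.Length - rowLength)
            throw new ArgumentOutOfRangeException("rowOffset");

        int sum = 0;

        // fixed pins the array so the GC cannot move it while the
        // pointer is in use. Pointer access is not bounds-checked,
        // hence the explicit range check above.
        fixed (byte* pageStart = page)
        {
            byte* row = pageStart + rowOffset;
            for (int i = 0; i < rowLength; i++)
            {
                sum += row[i];
            }
        }
        return sum;
    }
}
```

The pointer is only valid inside the fixed block, so the row routines would have to do all their work there. That is part of why retrofitting this into existing code is not a small change.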

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States
I hold a PhD in computer science, and have been a practicing developer since the early 90's.

I used to be the owner for VistaDB, but sold the product to another company in August 2010.

I have recently moved to Redmond and now work for Microsoft. Any posts or articles are purely my own opinions, and not the opinions of my employer.

Comments and Discussions

 
Re: I get different results
arcticbrew, 16-Mar-10 15:11
It was not my intention to offend you. I was only pointing out that my tests of the copy functions showed they performed the same.

I have been writing navigation applications for Windows CE devices since 1998 (CE Prerelease Version 0.9). I used the Platform Builder to make Windows CE images for our devices. Performance testing is something we do almost every day.

Besides my own devices, I have devices from seven other manufacturers in my office. The real differences between the devices are the chips and the software drivers. In most cases, the .NET runtime is executing in RAM. I know this for sure because I installed it. But whether it runs from RAM chips or uses XIP from flash, the performance tests are meaningful when assessing relative performance. None of the systems I used contained a customized CF runtime.

I did not make a generalization based on the results from one device. I ran the test on an Intel dual-core Windows Vista notebook, an Intel dual-core Windows XP notebook, an Intel quad-core Windows XP desktop, an AMD dual-core Windows XP desktop, and several Windows CE 4 and CE 5 devices. The performance of the three copy methods was so similar that I would not be surprised to find that they all eventually call the same method.

In the test I ran, I made the byte arrays persistent so that their creation time was not part of the test. I wrote the test so that the GC would not do a collection that would affect the results. I made sure the code was JIT-compiled before I did my timings. I compiled in release mode, executed the test application outside of the IDE debug environment, and wrote the results to a listbox.
