Quick Compression Utility for C# Byte Arrays

11 Jan 2007 | CPOL | 2 min read
A quick but useful utility for compression and decompression of byte arrays

Introduction

To improve network bandwidth utilization in a high-performance application, I found that compressing a large-ish object (6 MB) prior to transmission sped up the network call by roughly a factor of 10. However, the MSDN sample code for the GZipStream class left a lot to be desired. It's arcane and poorly written, and it took me far too long to understand it.

When I had finally figured out what was going on, I was left with, IMHO, a fairly useful little utility for generalized compression and decompression of byte arrays. It uses the GZipStream class that comes standard as part of the System.IO.Compression namespace.

My utility consists of a single class, Compressor, with two static methods, Compress() and Decompress(). Both methods take a byte array as a parameter and return a byte array. For Compress(), the parameter is the uncompressed byte array and the return value is the compressed byte array; for Decompress(), it is the reverse.

During compression, the compressed bytes are prepended with an Int32 header containing the number of bytes in the uncompressed byte array. This header is used during decompression to allocate the byte array to be returned by Decompress().
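The article's download contains the actual Compressor source, which is not reproduced here. As a rough illustration of the scheme just described, the following is a minimal sketch that assumes the header is written with BitConverter (so it uses the machine's byte order) and that the GZip data follows immediately after it. The real class may differ in its details.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical sketch of the Compressor class described above;
// the downloadable source may be organized differently.
public static class Compressor
{
    // Returns [4-byte uncompressed length][GZip data].
    public static byte[] Compress(byte[] buffer)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Int32 header: number of bytes in the uncompressed array.
            ms.Write(BitConverter.GetBytes(buffer.Length), 0, 4);

            // leaveOpen = true keeps ms usable after the GZipStream is disposed;
            // disposing the GZipStream flushes the final compressed block.
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
            {
                zip.Write(buffer, 0, buffer.Length);
            }
            return ms.ToArray();
        }
    }

    // Reverses Compress(): reads the header, allocates the result, then inflates into it.
    public static byte[] Decompress(byte[] gzBuffer)
    {
        int length = BitConverter.ToInt32(gzBuffer, 0);
        byte[] buffer = new byte[length];

        using (MemoryStream ms = new MemoryStream(gzBuffer, 4, gzBuffer.Length - 4))
        using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Read() may return fewer bytes than requested, so loop until full.
            int offset = 0;
            while (offset < length)
            {
                int read = zip.Read(buffer, offset, length - offset);
                if (read == 0)
                    throw new InvalidDataException("Compressed data ended early.");
                offset += read;
            }
        }
        return buffer;
    }
}
```

Knowing the uncompressed length up front lets Decompress() allocate the result array once instead of growing a buffer while reading.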

Using the Code

Simply convert the object (or collection of objects) you wish to compress into a byte array. I find that a bit of custom serialization using the BitConverter and/or the Buffer classes can work well for this. For classes with a fixed record size (i.e. those that contain value types only, and no strings), you can also dip down into the Marshal class (see example below) to convert an object into a pointer and then copy the memory it points to into your buffer.
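For the BitConverter/Buffer route, something along these lines can work. This is only a sketch: the PriceSerializer class and its layout (an Int32 count followed by the raw doubles) are made up for illustration and are not part of the download.

```csharp
using System;

// Hypothetical helper for illustration; not part of the article's download.
public static class PriceSerializer
{
    // Layout: [Int32 count][count * 8 bytes of doubles]
    public static byte[] ToBytes(double[] prices)
    {
        byte[] buffer = new byte[sizeof(int) + prices.Length * sizeof(double)];

        // BitConverter turns the count into 4 bytes (machine byte order).
        Buffer.BlockCopy(BitConverter.GetBytes(prices.Length), 0, buffer, 0, sizeof(int));

        // Buffer.BlockCopy copies raw bytes between primitive arrays; offsets are in bytes.
        Buffer.BlockCopy(prices, 0, buffer, sizeof(int), prices.Length * sizeof(double));
        return buffer;
    }

    public static double[] FromBytes(byte[] buffer)
    {
        int count = BitConverter.ToInt32(buffer, 0);
        double[] prices = new double[count];
        Buffer.BlockCopy(buffer, sizeof(int), prices, 0, count * sizeof(double));
        return prices;
    }
}
```

The resulting array can be handed straight to Compressor.Compress(). Note that Buffer.BlockCopy only works on arrays of primitive types, which is why structs need the Marshal approach instead.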

Once you have your byte array, simply pass it to Compressor.Compress() to get a compressed array for transmission. On the far end, simply pass the compressed byte array to Decompress() and recover the original byte array. Voila!

C#
//
// Sample Compression - how to send 100,000 stock prices across town in 1 second.
//
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

  public struct StockPrice
  {
    public int ID;
    public double bidPrice;
    public double askPrice;
    public double lastPrice;

    // Marshalled size of one record, in bytes
    public static readonly int sz = Marshal.SizeOf(typeof(StockPrice));

    // Copies this record into buffer, starting at startIndex
    public void CopyToBuffer(byte[] buffer, int startIndex)
    {
      IntPtr ptr = Marshal.AllocHGlobal(sz);
      try
      {
        Marshal.StructureToPtr(this, ptr, false);
        Marshal.Copy(ptr, buffer, startIndex, sz);
      }
      finally
      {
        Marshal.FreeHGlobal(ptr); // always release the unmanaged block
      }
    }

    // Rebuilds one record from buffer, starting at startIndex
    public static StockPrice CopyFromBuffer(byte[] buffer, int startIndex)
    {
      IntPtr ptr = Marshal.AllocHGlobal(sz);
      try
      {
        Marshal.Copy(buffer, startIndex, ptr, sz);
        return (StockPrice)Marshal.PtrToStructure(ptr, typeof(StockPrice));
      }
      finally
      {
        Marshal.FreeHGlobal(ptr);
      }
    }
  }

  public static class Program
  {
    // Assume that you are starting with a populated dictionary of StockPrice objects,
    // which is an instance of Dictionary<int, StockPrice> and is keyed by the ID field
    // (declared empty here only so that the sample compiles)
    static Dictionary<int, StockPrice> StockPriceDict = new Dictionary<int, StockPrice>();

    static void Main()
    {
      // pack every record back to back into one flat buffer
      byte[] buffer = new byte[StockPriceDict.Count * StockPrice.sz];
      int startIndex = 0;
      foreach (StockPrice price in StockPriceDict.Values)
      {
        price.CopyToBuffer(buffer, startIndex);
        startIndex += StockPrice.sz;
      }

      byte[] gzBuffer = Compressor.Compress(buffer);

      // now uncompress the bytes and recover the original dictionary.
      // This is *much* faster than
      // using .NET Remoting or similar techniques

      Dictionary<int, StockPrice> newStockPriceDict = new Dictionary<int, StockPrice>();
      byte[] buffer1 = Compressor.Decompress(gzBuffer);
      startIndex = 0;
      while (startIndex < buffer1.Length)
      {
        StockPrice stockPrice = StockPrice.CopyFromBuffer(buffer1, startIndex);
        newStockPriceDict[stockPrice.ID] = stockPrice;
        startIndex += StockPrice.sz; // advance to the next record
      }
    }
  }

Points of Interest

If there were one thing I would improve about C#, it is its ability to manipulate objects as byte arrays. This aspect is absolutely critical for high performance computing and doesn't get enough respect from the C# product team. It seems like the functionality was only included for backwards compatibility with COM. However, it's probably the bit of code I rely on most when working in high-performance areas usually reserved for C++.

History

  • v1.0 - 10th January, 2007

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Web Developer
United States

Comments and Discussions

 
Question: Funny
emoulin, 17-Jan-14 11:54
Hello,

It's funny. :-D It appears that it is useless to prepend the compressed bytes with an Int32 header containing the number of bytes in the uncompressed byte array, because that information is already available in the last 4 bytes of the compressed stream.

:O

I am using Framework 4.0.

As a side note, I am working on some uncommented code and found this article by chance: it corresponds exactly to the code I am working on. It allowed me to understand the code and to notice that the end of the compressed stream contains the same information.

Anyway, thanks a lot, because it was a precious help.
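
The trailing bytes mentioned in this comment correspond to the ISIZE field of the gzip format (RFC 1952), which stores the uncompressed length modulo 2^32 in little-endian order. A minimal sketch of reading it follows. The GZipTrailer class is hypothetical, and the sketch assumes the data is a single, complete gzip member as produced by one GZipStream.

```csharp
using System;

// Hypothetical helper for illustration; assumes gzipData ends with a complete gzip member.
public static class GZipTrailer
{
    // Reads ISIZE: the uncompressed length modulo 2^32, stored
    // little-endian in the last four bytes of the gzip data.
    public static uint ReadUncompressedSize(byte[] gzipData)
    {
        if (gzipData.Length < 8)
            throw new ArgumentException("Too short to contain a gzip trailer.");

        int n = gzipData.Length;
        return (uint)gzipData[n - 4]
             | ((uint)gzipData[n - 3] << 8)
             | ((uint)gzipData[n - 2] << 16)
             | ((uint)gzipData[n - 1] << 24);
    }
}
```

Because the value sits at the end of the data, it is only readable once the whole buffer has arrived, and it wraps around for inputs of 4 GB or more. A leading header has neither limitation.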
