
String Compression - SevenZip vs GZip

18 Sep 2016
An alternative solution for compressing and decompressing strings: how SevenZip compares with GZip in speed and compression strength

Introduction

GZipStream is a class that provides methods to compress and decompress streams. It has been included in the .NET Framework since version 2.0.

SevenZip is an assembly for compressing and decompressing files. The code below uses its LZMA encoder, which generally produces smaller output at the cost of more CPU time.

Using the Code

using System;
using System.IO;
using System.Text;
using Encoder = SevenZip.Sdk.Compression.Lzma.Encoder;

public static string Compress(string text) {

            var watcher = new Watcher();
            watcher.Start();

            string result = null;

            using (var t = new MemoryStream(Encoding.Default.GetBytes(text)))
            using (var m = new MemoryStream()) {

                var encoder = new Encoder();

                // Header: 5 bytes of LZMA coder properties, then the
                // uncompressed length as an 8-byte Int64.
                encoder.WriteCoderProperties(m);
                m.Write(BitConverter.GetBytes(t.Length), 0, 8);
                encoder.Code(t, m, t.Length, -1, null);

                // Caution: turning compressed bytes into a string with Encoding.Default
                // is not guaranteed to round-trip (it breaks when the default encoding
                // is UTF-8). Convert.ToBase64String is the safer choice.
                result = Encoding.Default.GetString(m.ToArray());
            }

            watcher.Stop();
            Console.WriteLine($"{watcher.TotalMilliseconds()}ms in compress " +
                $"{text.Length} chars with SevenZip! = {result.Length} new length");

            return result;
        }

Now GZipStream:

using System.IO.Compression;

public static string CompressGZipStream(string text) {

            var watcher = new Watcher();
            watcher.Start();

            var buffer = Encoding.Default.GetBytes(text);
            byte[] compressed;
            using (var t = new MemoryStream()) {
                // Dispose the GZipStream before reading so all gzip data is flushed.
                using (var m = new GZipStream(t, CompressionMode.Compress, true))
                    m.Write(buffer, 0, buffer.Length);

                compressed = t.ToArray();
            }

            // Prefix the compressed data with the original byte count (4-byte Int32).
            var gzBuffer = new byte[compressed.Length + 4];
            Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
            Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);

            watcher.Stop();

            // Note: result.Length counts Base64 characters, about 4/3 of the byte count.
            var result = Convert.ToBase64String(gzBuffer);
            Console.WriteLine($"{watcher.TotalMilliseconds()}ms in compress " +
                $"{text.Length} chars with GZipStream! = {result.Length} new length");

            return result;
        }
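The article shows decompression only for SevenZip. A matching GZipStream decompressor might look like the sketch below. It assumes the layout produced by the GZipStream method above: a Base64 string holding a 4-byte original length followed by the gzip data. The class name `GZipText` and both method bodies are illustrative assumptions, not code from the original article, and the timing code is left out to keep the example short.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GZipText
{
    // Same layout as the article's GZipStream method:
    // Base64( [4-byte original byte count][gzip data] ).
    public static string Compress(string text)
    {
        byte[] raw = Encoding.Default.GetBytes(text);
        using (var packed = new MemoryStream())
        {
            // Length prefix first, then the gzip stream.
            packed.Write(BitConverter.GetBytes(raw.Length), 0, 4);
            using (var gz = new GZipStream(packed, CompressionMode.Compress, true))
                gz.Write(raw, 0, raw.Length);
            return Convert.ToBase64String(packed.ToArray());
        }
    }

    // Hypothetical counterpart; not part of the original article.
    public static string Decompress(string base64)
    {
        byte[] packed = Convert.FromBase64String(base64);
        int length = BitConverter.ToInt32(packed, 0);
        var raw = new byte[length];

        using (var source = new MemoryStream(packed, 4, packed.Length - 4))
        using (var gz = new GZipStream(source, CompressionMode.Decompress))
        {
            // Read may return fewer bytes than requested, so loop until done.
            int total = 0;
            while (total < length)
            {
                int read = gz.Read(raw, total, length - total);
                if (read == 0) break;
                total += read;
            }
        }
        return Encoding.Default.GetString(raw);
    }

    public static void Main()
    {
        string original = "AAAABBBBCCCC AAAABBBBCCCC AAAABBBBCCCC";
        string packed = Compress(original);
        Console.WriteLine(Decompress(packed) == original);
    }
}
```

Storing the original length up front lets the decompressor allocate the output buffer once instead of growing it while reading.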

Decompress

using Decoder = SevenZip.Sdk.Compression.Lzma.Decoder;

public static string Decompress(string text) {

            var watcher = new Watcher();
            watcher.Start();

            string result = null;

            using (var t = new MemoryStream(Encoding.Default.GetBytes(text)))
            using (var m = new MemoryStream())
            {
                var decoder = new Decoder();

                // Read the header written by Compress: 5 bytes of coder
                // properties followed by the 8-byte uncompressed length.
                var properties = new byte[5];
                t.Read(properties, 0, 5);
                var lengthBytes = new byte[8];
                t.Read(lengthBytes, 0, 8);

                var outSize = BitConverter.ToInt64(lengthBytes, 0);

                decoder.SetDecoderProperties(properties);
                decoder.Code(t, m, t.Length - t.Position, outSize, null);

                // Decode with the same encoding that Compress used for the input text.
                result = Encoding.Default.GetString(m.ToArray());
            }

            watcher.Stop();
            Console.WriteLine($"{watcher.TotalMilliseconds()} ms in decompress " +
                $"{text.Length} chars with SevenZip! = {result.Length} new length");

            return result;
        }
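The SevenZip methods carry the compressed bytes inside a string built with Encoding.Default, while the GZipStream method uses Base64. The sketch below illustrates why that difference matters. It is not the author's code: it uses Encoding.UTF8 as a stand-in (Encoding.Default is UTF-8 on .NET Core and later), and the class name `BinaryAsText` and the sample bytes are made up for the demonstration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryAsText
{
    // Lossy for arbitrary binary data: byte sequences that are not valid
    // UTF-8 are replaced with U+FFFD when decoded to a string.
    public static byte[] ViaEncoding(byte[] data)
    {
        string s = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(s);
    }

    // Lossless: Base64 represents every byte value using plain ASCII characters.
    public static byte[] ViaBase64(byte[] data)
    {
        string s = Convert.ToBase64String(data);
        return Convert.FromBase64String(s);
    }

    public static void Main()
    {
        // Sample bytes standing in for compressed output (illustrative only).
        byte[] data = { 0x5D, 0x00, 0x00, 0x80, 0xFF, 0xC3 };

        Console.WriteLine(data.SequenceEqual(ViaEncoding(data))); // False
        Console.WriteLine(data.SequenceEqual(ViaBase64(data)));   // True
    }
}
```

Base64 costs about a third more characters than raw bytes, which is worth keeping in mind when comparing the "new length" figures printed by the two compress methods.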

Results!

We are going to perform tests with different runs :)

Tested On

  • OS: Windows 10 Version 10.0.14393
  • RAM: 3 GB DDR2
  • CPU: Pentium Dual Core 2.70 GHz (2 cores)

Image 1

Tested On

  • OS: Windows Server 2012 Version 6.3.9600
  • RAM: 3.75 GB
  • CPU: Intel Xeon 2.30 GHz (2 cores)

Image 2

Finally, I tested with a real document file of 150,000 characters. The document is available for download in the resources of the article.

Code

static void Main()
        {
            //======================= 1 runs sync!

            Console.WriteLine($"{Environment.NewLine}1 runs sync!");
            var text = Generate(1000);

            var compressed = Compress(text);
            var decompressed = Decompress(compressed);
            CompressGZipStream(text);

            //====================================

            //=================== 2 runs Parallel!

            Console.WriteLine($"{Environment.NewLine}2 runs Parallel!");

            var list = new List<string>();
            for (var i = 2; i <= 6; i++)
                list.Add(Generate(15000 * i));

            Parallel.ForEach(list, new ParallelOptions 
            { MaxDegreeOfParallelism = 2 }, (x) => CompressParallel(x));

            //===================================

            //========= 1 run sync with real-text!

            string path = Path.Combine(Environment.CurrentDirectory, "150,000.txt");
            if (!File.Exists(path))
                return;

            Console.WriteLine($"{Environment.NewLine}1 run sync with real-text!");
            text = File.ReadAllText(path);

            var compressedReal = Compress(text);
            var decompressedReal = Decompress(compressedReal);
            CompressGZipStream(text);

            //====================================

            Console.ReadLine();
        }

Parallel Method

static void CompressParallel(string text) {
            Compress(text);
            CompressGZipStream(text);
            Console.WriteLine();
        }

Watcher

public class Watcher
    {
        DateTime _start = DateTime.MinValue;
        DateTime _stop = DateTime.MinValue;

        TimeSpan Timer {
            get
            {
                return _stop.Subtract(_start);
            }
        }

        public void Start() { _start = DateTime.Now; }

        public void Stop() { _stop = DateTime.Now; }

        public double TotalMilliseconds() {
            if (_stop == DateTime.MinValue)
                return 0;
            return Timer.TotalMilliseconds;
        }
    }
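The Watcher above measures elapsed time by subtracting two DateTime.Now readings. DateTime.Now is a wall-clock value whose resolution can be several milliseconds on some systems, which matters when a run takes only a few milliseconds. A possible variant built on System.Diagnostics.Stopwatch is sketched below. The class name `StopwatchWatcher` is hypothetical; it keeps the same three methods so it could stand in for Watcher.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical variant of the article's Watcher, backed by Stopwatch.
public class StopwatchWatcher
{
    private readonly Stopwatch _stopwatch = new Stopwatch();

    // Restart resets the elapsed time to zero and starts measuring.
    public void Start() { _stopwatch.Restart(); }

    public void Stop() { _stopwatch.Stop(); }

    // Returns 0 while still running or before any measurement,
    // similar to the original's behavior before Stop() is called.
    public double TotalMilliseconds()
    {
        return _stopwatch.IsRunning ? 0 : _stopwatch.Elapsed.TotalMilliseconds;
    }
}

public static class WatcherDemo
{
    public static void Main()
    {
        var watcher = new StopwatchWatcher();
        watcher.Start();
        Thread.Sleep(20); // stand-in for the work being timed
        watcher.Stop();
        Console.WriteLine($"{watcher.TotalMilliseconds():F2} ms");
    }
}
```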

Generate Random Text

static string Generate(int length) {
            var rnd = new Random();
            string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
            return new string(Enumerable.Repeat
            (chars, length).Select(x => x[rnd.Next(x.Length)]).ToArray());
        }
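Text produced by Generate draws each character from 36 symbols with roughly equal probability, so it has very little structure for a compressor to exploit. As a rough guide, and assuming the characters are uniform and independent, no lossless compressor can average fewer than log2(36) bits per character on such input. The sketch below computes that bound; the class name `EntropyBound` is an illustrative assumption.

```csharp
using System;

public static class EntropyBound
{
    // Minimum average bits per character for text drawn uniformly
    // and independently from an alphabet of the given size.
    public static double BitsPerChar(int alphabetSize)
    {
        return Math.Log(alphabetSize, 2);
    }

    // Best achievable size as a fraction of a one-byte-per-character encoding.
    public static double BestRatio(int alphabetSize)
    {
        return BitsPerChar(alphabetSize) / 8.0;
    }

    public static void Main()
    {
        Console.WriteLine($"{BitsPerChar(36):F2} bits per character");
        Console.WriteLine($"{BestRatio(36):F3} of the original size at best");
    }
}
```

For 36 symbols this gives about 5.17 bits per character, or roughly 65% of the original size at one byte per character. Real documents contain repeated words and patterns, which is why the real-text run is the more representative test.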

Conclusion

GZipStream compresses very quickly, almost 60% faster than SevenZip in these tests, but the size of its output is less convincing.

SevenZip produces much smaller output, but it is not the fastest option.

If you need to compress or decompress thousands of items asynchronously, GZipStream is the more useful choice, since its algorithm does not require many resources regardless of the hardware.

SevenZip is excellent for sending large data to a server, such as the bytes of an image, file, or document, or even for an instant messenger.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Comments and Discussions

 
Re: Good one
John C. Espinal, 19-Sep-16 14:13

GZip and 7-Zip both work on streams, so the logical approach is to convert the text or object to binary first and then compress it.

There may be better methods, such as RAR, but in the end it all depends on what you are compressing: a web page, a video, random content, or the Windows 10 installer.
