
Zip Utils - Clean, Elegant, Simple, C++/Win32

19 Sep 2012, Public Domain, 8 min read
Adding zip/unzip easily, no LIBS or DLLs, with an elegant and powerful API

zipping and unzipping in action!

Introduction

WARNING: This code has known bugs. It doesn't deal with non-ASCII filenames correctly. It doesn't deal with passwords correctly. I don't have time to fix it, unfortunately. But I've marked it so that other gold members can edit it, in case anyone wants to make the fix.

 

This source code shows how to add zip/unzip functionality to your programs. Lots of people have written their own wrappers around Zip, and indeed there are several articles on CodeProject that are based on earlier versions of my own code. How is this version different?

  • Clean packaging. There's one pair of files zip.cpp, zip.h to add to your project if you want zip. Another pair unzip.cpp, unzip.h if you want unzip (or both if you want both!). There are no additional libraries or DLLs to worry about.
  • Clean API. Most other APIs around zip/unzip are terrible. This one is the best. The API is short, clean, and in a familiar Win32 style. Most other APIs wrap things up in classes, which is ugly overkill for such a small problem and always turns out to be too inflexible. Mine doesn't. See the code snippets below.
  • Flexibility. With this code, you can unzip from a zip that's in a disk file, memory-buffer, pipe. You can unzip into a disk file, memory-buffer or pipe. The same for creating Zip files. This means that at last you don't need to write out your files to a temporary directory before using them! One noteworthy feature is that you can unzip directly from an embedded resource into a memory buffer or onto a disk file, which is great for installers. Another is the ability to create your Zip in dynamically growable memory backed by the system page file. Despite all this power, the API remains clean and simple. The power didn't come from just writing wrappers around other people's code. It came from restructuring the internals of zlib and info-zip source code. My code is unique in what it does here.
  • Encryption. This version supports password-based Zip encryption. Passwords are absent from many other Zip libraries, including gzip.
  • Unicode. This version supports Unicode filenames.
  • Windows CE. This version works as it is under Windows CE. No need to alter makefiles or #defines, or worry about compatibility of any LIB/DLL.
  • Bug fixes. This code is based on zlib 1.1.4, which fixes a security vulnerability in 1.1.3. (An earlier version of my code used 1.1.3, and has crept into other CodeProject articles...).

At its core, my code uses zlib and info-zip. See the end of the article for acknowledgements & license.

Using the Code

To add zip functionality to your code, add the file zip.cpp to your project, and #include "zip.h" to your source code.

Similarly for unzipping, add the file unzip.cpp to the project and #include "unzip.h" to your source code. Zip and unzip can co-exist happily in a single application. Or you can omit one or the other if you're trying to save space.

The following code snippets show how to use zip/unzip. They are taken from one of the demo applications included in the download, which also has project files for Visual Studio .NET, Borland C++ Builder 6, and Embedded Visual C++ 3. The snippets here use ASCII string literals, but the functions all take arguments of type TCHAR* rather than char*, so they work equally well in Unicode builds.
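To make the TCHAR point concrete, here is a minimal sketch of how one source line can serve both ANSI and Unicode builds. On Windows, TCHAR and _T come from the system headers. The stand-in definitions in the #else branch and the helper NameLength are assumptions added only so the sketch compiles elsewhere; they are not part of zip_utils.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
  #include <windows.h>
  #include <tchar.h>      // the real TCHAR and _T live here on Windows
#else
  // Stand-ins (assumption) that mimic the Windows definitions.
  #ifdef UNICODE
    typedef wchar_t TCHAR;
    #define _T(x) L##x
  #else
    typedef char TCHAR;
    #define _T(x) x
  #endif
#endif

// Hypothetical helper: counts the characters in a TCHAR string,
// whichever width TCHAR happens to have in this build.
std::size_t NameLength(const TCHAR* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// With the literal wrapped in _T(), the same call compiles in both builds:
//   HZIP hz = CreateZip(_T("simple1.zip"), 0);
```

In a Unicode build, a bare "simple1.zip" literal would be a char string and would not match a TCHAR* parameter, which is why the _T() wrapper matters there.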

Example 1 - Create a Zip File from Existing Files

C++
// We place the file "simple.bmp" inside, but inside
// the zipfile it will actually be called "znsimple.bmp".
// Similarly the textfile.

HZIP hz = CreateZip("simple1.zip",0);
ZipAdd(hz,"znsimple.bmp",  "simple.bmp");
ZipAdd(hz,"znsimple.txt",  "simple.txt");
CloseZip(hz);

Example 2 - Unzip a Zip File Using the Names It Has Inside It

C++
HZIP hz = OpenZip("\\simple1.zip",0);
ZIPENTRY ze; GetZipItem(hz,-1,&ze); int numitems=ze.index;
// index -1 gives overall information about the zipfile
for (int zi=0; zi<numitems; zi++)
{ GetZipItem(hz,zi,&ze);       // fetch this item's details
  UnzipItem(hz, zi, ze.name);  // unzip it under the name stored in the zip
}
CloseZip(hz);

Example 3 - Unzip from Resource Directly into Memory

This technique is useful for small games, where you want to keep all resources bundled inside the executable while keeping its size down.

Suppose we used a .rc with 1 RCDATA "file.zip" to embed the zip file as a resource.

C++
HRSRC hrsrc = FindResource(hInstance,MAKEINTRESOURCE(1),RT_RCDATA);
HANDLE hglob = LoadResource(hInstance,hrsrc);
void *zipbuf = LockResource(hglob);
unsigned int ziplen = SizeofResource(hInstance,hrsrc);
HZIP hz = OpenZip(zipbuf, ziplen, 0);
ZIPENTRY ze; int i; FindZipItem(hz,"sample.jpg",true,&i,&ze);
// that lets us search for an item by filename.
// Now we unzip it to a membuffer.
char *ibuf = new char[ze.unc_size];
UnzipItem(hz,i, ibuf, ze.unc_size);
// ...
delete[] ibuf;
CloseZip(hz);
// note: no need to free resources obtained through Find/Load/LockResource

Example 4 - Unzip Chunk by Chunk to a membuffer

Normally when you call UnzipItem(...), it gives the return-code ZR_OK. But if you gave it too small a buffer so that it couldn't fit it all in, then it returns ZR_MORE.

C++
char buf[1024]; ZRESULT zr=ZR_MORE; unsigned long totsize=0;
while (zr==ZR_MORE)
{ zr = UnzipItem(hz,i, buf,1024);
  unsigned long bufsize=1024; if (zr==ZR_OK) bufsize=ze.unc_size-totsize;
  // ... maybe write the buffer to a disk file here
  totsize+=bufsize;
}
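The bookkeeping in that loop is easy to get wrong at the last chunk, so here is a portable sketch of the same loop shape that can be run without the library. MockItem, MockResult and DrainInChunks are hypothetical stand-ins for a zip item, ZR_OK/ZR_MORE and the caller's loop; the mock only imitates the "fill the buffer, say whether more remains" behaviour described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

enum MockResult { MOCK_OK, MOCK_MORE };   // stand-ins for ZR_OK / ZR_MORE

// Hypothetical stand-in for one zip item being unzipped piece by piece.
struct MockItem {
    std::string data;      // the "uncompressed" bytes
    std::size_t pos = 0;   // how much has been handed out so far

    // Copies up to len bytes into buf; reports MOCK_MORE while bytes remain.
    MockResult Read(char* buf, std::size_t len) {
        std::size_t n = std::min(len, data.size() - pos);
        std::memcpy(buf, data.data() + pos, n);
        pos += n;
        return pos < data.size() ? MOCK_MORE : MOCK_OK;
    }
};

// Drains the item through a small fixed buffer, as in Example 4.
std::string DrainInChunks(MockItem& item, std::size_t chunk) {
    assert(chunk > 0 && chunk <= 1024);
    char buf[1024];
    std::string out;
    std::size_t totsize = 0;
    MockResult zr = MOCK_MORE;
    while (zr == MOCK_MORE) {
        zr = item.Read(buf, chunk);
        std::size_t bufsize = chunk;
        // The final call may fill only part of the buffer.
        if (zr == MOCK_OK) bufsize = item.data.size() - totsize;
        out.append(buf, bufsize);   // a real caller might write to disk here
        totsize += bufsize;
    }
    return out;
}
```

Because totsize is known after every iteration, dividing it by the item's uncompressed size gives a progress fraction at no extra cost.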

Common Questions

STRICT? I think you should always compile with STRICT (in project-settings/preprocessor/defines), and full warnings turned on. Without STRICT, the HZIP handle becomes interchangeable with all other handles.
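Why STRICT makes a difference can be shown without windows.h. The sketch below imitates the two ways Windows declares handle types, assuming HZIP is declared with the DECLARE_HANDLE macro as Win32 handles usually are. All names here (LOOSE_*, STRICT_*, SKETCH_DECLARE_HANDLE, Interchangeable, RequireOpen) are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Without STRICT: every handle is an alias of the same untyped pointer.
typedef void* LOOSE_HANDLE;
typedef LOOSE_HANDLE LOOSE_HZIP;
typedef LOOSE_HANDLE LOOSE_HWND;

// With STRICT: each handle is a pointer to its own dummy struct,
// so the compiler treats the handle types as unrelated.
#define SKETCH_DECLARE_HANDLE(name) \
    struct name##_tag { int unused; }; typedef name##_tag* name
SKETCH_DECLARE_HANDLE(STRICT_HZIP);
SKETCH_DECLARE_HANDLE(STRICT_HWND);

// True if a From value can be passed where a To is expected.
template <class From, class To>
constexpr bool Interchangeable() {
    return std::is_convertible<From, To>::value;
}

static_assert(std::is_same<LOOSE_HZIP, LOOSE_HWND>::value,
              "loose handles are all one type");
static_assert(!std::is_same<STRICT_HZIP, STRICT_HWND>::value,
              "strict handles are distinct types");

// Code written against the strict type accepts zip handles only;
// passing a STRICT_HWND here would be a compile error.
void RequireOpen(STRICT_HZIP hz) { assert(hz != nullptr); }
```

With the loose declarations, passing a window handle to a zip function compiles silently; with the strict ones the mistake is caught at compile time.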

How to show a progress dialog? One of the included examples, "progress", shows how to do this.

How to add/remove files from an existing Zip file? The zip_utils currently only allow you to OpenZip() for unzipping, or CreateZip() for adding, but don't allow you to mix the two. To modify an existing zip (e.g., adding or removing a file), you need to create a new zip and copy all the existing items from the old into the new. One of the included examples, "modify", shows how to do this. It defines two functions:

C++
ZRESULT RemoveFileFromZip(const TCHAR *zip, const TCHAR *name);
ZRESULT AddFileToZip(const TCHAR *zip, const TCHAR *name, const TCHAR *fn);
// eg. AddFileToZip("c:\\archive.zip","znsimple.txt","c:\\docs\\file.txt");
// If the zipfile already contained that thing (case-insensitive), it is removed.
// These two functions are defined in "modify.cpp"
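The copy-and-rebuild idea behind those two functions can be sketched with a toy in-memory model. Everything here (Entry, Archive, SameName, RemoveEntry, AddEntry) is hypothetical and merely stands in for reading the old zip and writing a new one; it is not the code from modify.cpp.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model (assumption): an archive is an ordered list of (name, contents).
typedef std::pair<std::string, std::string> Entry;
typedef std::vector<Entry> Archive;

// Case-insensitive comparison of ASCII names.
bool SameName(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) return false;
    }
    return true;
}

// Rebuild: copy every entry except the named one into a fresh archive.
Archive RemoveEntry(const Archive& old, const std::string& name) {
    Archive fresh;
    for (const Entry& e : old) {
        if (!SameName(e.first, name)) fresh.push_back(e);
    }
    return fresh;
}

// Rebuild without any old entry of that name, then append the new one.
Archive AddEntry(const Archive& old, const std::string& name,
                 const std::string& contents) {
    assert(!name.empty());
    Archive fresh = RemoveEntry(old, name);
    fresh.push_back(Entry(name, contents));
    return fresh;
}
```

The old archive is never edited in place: each operation produces a new one, which mirrors the constraint that a handle is opened either for unzipping or for creating, never both.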

"fatal error C1010: unexpected end of file while looking for precompiled header directive". To fix this, select zip.cpp and unzip.cpp and change Project > Settings > C++ > PrecompiledHeaders to NotUsingPrecompiledHeaders.

Discussion

Efrat says: "I think the design is very bad", and so objects when I say that my API is clean and others are not. (Actually, he says my documentation is the most conceited he's seen and my design is the worst that he's seen!) I've reproduced his comments here, with my responses, so you can make a more informed decision whether to use my library.

  • [Efrat] Better instead to use the boost IOStream library.

    [Response] I love the boost library. If people can figure out how to add it to their projects and zip/unzip with it, they should definitely use boost rather than my code. (I'm still trying to figure it out, though, and couldn't get it to compile under CE.)

  • [Efrat] A compressed archive has internal state; it's a classic object; the author's criticisms of OOP are unjustified. "OOP doesn't mean placing your code in a CPP file."

    [Response] I'm trying not to be OOP.

    1. You'll never inherit from an archive, nor invoke virtual methods from it: we only use encapsulation, not any of the other pillars of OOP. By using an opaque handle HZIP rather than a class, I indicate this clearly to the programmer.
    2. Also, C++ classes don't work cleanly across DLLs. Handles like HZIPs do.
  • [Efrat] For instance, progress-notifications should be done by virtual functions in a derived class, not by callbacks.

    [Response] To get progress, you invoke UnzipItem in a while loop, and each iteration unzips a little bit more of the file. This is clean, re-entrant, and has a simple API. I think this is an easier API than inheriting from a class. I think inheritance from library classes is bad, in general.

  • [Efrat] Compression should go in a DLL.

    [Response] I disagree. DLLs are always a pain, for developers as well as users. Unzip only adds 40K in any case.

  • [Efrat] The API doesn't use the type system to differentiate between an HZIP for zipping and an HZIP for unzipping.

    [Response] This was intentional. The difference between zipping and unzipping is a current implementation drawback. I think an API should be clean, "inspirational", and you shouldn't encode current implementation limitations into the type system.

  • [Efrat] The API uses error-codes, rather than exceptions, but anyone who has graduated Programming 101 knows exceptions are better.

    [Response] I think exceptions are not welcomed anywhere nearly as widely as Efrat suggests. Also, they don't work cleanly across DLL boundaries, and they don't work on Pocket PC.

  • [Efrat] The API is inflexible; it should be coded for change, not just coded for all the options that were conceived while designing (handles, files, memory). Most users will think of sources and targets which this design can't support.

    [Response] The original Zip uses FILE*s, which are effectively the same as Windows pipes. I also provided memory-buffers which add an enormous amount of flexibility that's easy to use and requires no additional programming. For any user who needs sources and targets which can't be reached via a memory buffer, they shouldn't use these zip_utils.

  • [Efrat] The API is unnecessarily Windows-specific. The original zlib works great and is portable; zip_utils offers no advantages. Compression is memory-manipulation and IO and so should not be platform-specific.

    [Response] In the olden days before STL, "cross-platform" code inevitably meant:

    1. peppered with so many #ifdefs that you couldn't read it,
    2. didn't work straight away under Windows.

    I started from an old code-base, and so Efrat's proposed bottom-up rewrite was not possible. The advantage this code offers over zlib is that it's just a single file to add to your project, it works first time under Windows, you can add it easily as a CPP module to your project (not just dll/lib), and the API is simpler.

In general, Efrat wants code to be a clean extensible framework. I don't; I want small compact code that works fine as it is. Furthermore, I think that "framework-isation" is the biggest source of bugs and code overruns in the industry.
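On the error-code point, the two styles need not exclude each other: a caller who prefers exceptions can layer them on top of a code-returning API in a few lines. The sketch below uses hypothetical stand-ins (SketchResult and SKETCH_OK for ZRESULT and ZR_OK, and a helper named Check); it is an illustration of the idea, not part of zip_utils.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

typedef unsigned long SketchResult;      // stand-in for ZRESULT
const SketchResult SKETCH_OK = 0;        // stand-in for ZR_OK

// Hypothetical adapter: passes an OK code through and turns
// anything else into an exception carrying the step and the code.
SketchResult Check(SketchResult zr, const char* what) {
    assert(what != nullptr);
    if (zr != SKETCH_OK) {
        throw std::runtime_error(std::string(what) + " failed, code " +
                                 std::to_string(zr));
    }
    return zr;
}
```

A wrapper like this should not be applied blindly to chunked unzipping, where ZR_MORE is a normal return value rather than a failure.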

Acknowledgements

This version of the article was updated on 28th July, 2005. Many thanks to the readers at CodeProject who found bugs and contributed fixes to an earlier version. There was one terrible bug where, after a large file had been unzipped, the next one might not work. Alvin77 spotted this bug.

My work is a repackaged form of extracts from the zlib code available at www.gzip.org by Jean-Loup Gailly and Mark Adler and others. Also from the info-zip source code at www.info-zip.org. Plus a bunch of my own changes. The original source code can be found at the two mentioned websites. Also the original copyright notices and licenses can be found there, and also inside the files zip.cpp and unzip.cpp of my code. As for licensing of my own contributions, I place them into the public domain.

License

This article, along with any associated source code and files, is licensed under A Public Domain dedication


Written By
Technical Lead
United States
Lucian studied theoretical computer science in Cambridge and Bologna, and then moved into the computer industry. Since 2004 he's been paid to do what he loves -- designing and implementing programming languages! The articles he writes on CodeProject are entirely his own personal hobby work, and do not represent the position or guidance of the company he works for. (He's on the VB/C# language team at Microsoft).

Comments and Discussions

 
Fatally Flawed (in My Opinion)
efrat regev, 2-Jan-05 0:54
Unfortunately, I think that this is not a good project:
1. In terms of functionality, the code indeed compresses and extracts back to the original data. However, not a single application which uses this compression format (e.g., WinZip, PowerArchiver, etc.) managed to correctly extract the archived files. (Unless I'm missing something; did anyone try this perhaps?) Thus, to the best of my understanding, this is an implementation of a proprietary compression format, which is not that interesting.
2. More fundamentally, I think the design is very bad. The author writes about his API "Most other APIs around zip/unzip are terrible. This one is best ... Most other APIs wrap things up in classes, which is ugly overkill for such a small problem and always turns out too inflexible. Mine don't." :confused: But then:
a. A compressed archive has an internal state (e.g., since it depends on the files which had already been compressed to it); it has methods which reflect and depend on its internal state. It's a classic object. In fact, there's no way to code it without making it into an object of some sort.
The author's solution indeed defines a new "class", a (platform dependent) HZIP (which is really a Windows HANDLE), and defines "methods" which operate on objects of this class, except that they are defined extraneously to the "class", with all the usual drawbacks. By this logic, OOP is redundant in every case - an argument made by C programmers and practiced by inexperienced C++ programmers.
Now in one of the posts below, the author writes that an HZIP for compressing or extracting are completely different (conceptually); of course, the compiler has no way of knowing that. Since an HZIP is just a Windows HANDLE, he *cannot* define separate "classes" (does the author know that his code will be compiled in strict mode? is this flexible?). There goes static type checking. IMHO, another blow for the so-called ugly overkill OOP techniques.
b. One of the benefits of C++ is in differentiating between errors and exceptions. The API returns error values, which complicates the interface and usage. It does *not* throw exceptions. Thus every call needs to be checked for errors - the drawbacks of which are known to anyone who has graduated programming 101.
c. As for claims of flexibility, the API uses hard-coded memory management (how about using the STL's allocator abstraction?); it hard-wires many aspects into Windows-specific functionality unnecessarily; it solves a strict subset of possible problems (i.e., file or memory targets) un-generically and inelegantly (how about policy-based design?). This last point, in particular, is a common mistake of novice designers: flexibility means coding for change, not coding all the options you happened to think about while designing. Clearly, most users will be easily able to think of sources and targets which this design can't support.
Again, one of the posts below asks about hooking in for progress notification. Although the request is for a callback, I think that the correct solution is for a virtual function in a derived class (e.g., virtual void on_progress(...)). (GUI frameworks have evolved in this direction: e.g., MFC -> QT.) Despite the "extraordinary flexibility" of the API, neither option is supported, and the second cannot be supported, since HZIP is not a class (try to explain to a Windows HANDLE that it now has a virtual function). As the author says, OOP for such a simple problem is an ugly overkill. Until you consider the alternatives.

In summary, this project either works or doesn't (personally, I think the latter), but it seems to be a classic example of bad design. To put it in smileys: although when you read the author's hype you will be :) , once you try to use it you will become :| and :confused: ; you will go :sigh: and :eek: , and if you learn from this design your manager will be :mad: and you will be X| .

Efrat
