|
I have updated the article+code with a new sample which adds files to an existing zipfile.
|
|
|
|
|
Has anyone used this library to create very large zip archives?
I experienced a crash in function send_bits() in zip.cpp.
A rough call stack:
ZipAdd
TZip::Add
TZip::ideflate
deflate
flush_block
compress_block
send_bits
I believe the crash is at the macro PUTSHORT.
Thanks very much for any advice or info.
I can provide more detail upon request.
Sincerely,
Mike
|
|
|
|
|
Please do provide more info!
(1) I've updated the article+code. So first please try this new version to see if it works.
(2) Otherwise, it'd be great if you could send me a test-case which crashes. To lu.cp@wischik.com. Thanks!
|
|
|
|
|
Thanks for your generosity and the powerful Zip utility codes. I tried to use your codes and it works well in most cases.
However, when I called the ZipAdd() to zip a large file (24MB) into a USB drive, it crashed in send_bits() if the USB drive does not have enough free space. The stack sequence is exact the same as that posted by Mike above. The crash happens inside PUTSHORT when the buffer 'state.bs.out_buf' overflows.
Can you please tell me how can I fix the problem?
Stanley
|
|
|
|
|
Unfortunately, I think that this is not a good project:
1. In terms of functionality, the code indeed compresses and extracts back to the original data. However, not a single application which uses this compression format (e.g., WinZip, PowerArchiver, etc.) managed to correctly extract the archived files. (Unless I'm missing something; did anyone try this perhaps?) Thus, to the best of my understanding, this is an implementation of a proprietary compression format, which is not that interesting.
2. More fundamentally, I think the design is very bad. The author writes about his API "Most other APIs around zip/unzip are terrible. This one is best ... Most other APIs wrap things up in classes, which is ugly overkill for such a small problem and always turns out too inflexible. Mine don't." But then:
a. A compressed archive has an internal state (e.g., since it depends on the files which had already been compressed to it); it has methods which reflect and depend on its internal state. It's a classic object. In fact, there's no way to code it without making it into an object of some sort.
The author's solution indeed defines a new "class", a (platform dependant) HZIP (which is really a Windows HANDLE), and defines "methods" which operate on objects of this class, except that they are defined extraneously to the "class", with all the usual drawbacks. By this logic, OOP is redundant in every case - an argument made by C programmers and practiced by inexperienced C++ programmers.
Now in one of the posts below, the author writes that an HZIP for compressing or extracting are completely different (conceptually); of course, the compiler has no way of knowing that. Since an HZIP is just a Windows HANDLE, he *cannot* define separate "classes" (does the author know that his code will be compiled in strict mode? is this flexible?). There goes static type checking. IMHO, another blow for the so-called ugly overkill OOP techniques.
b. One of the benefits of C++ is in differentiating between errors and exceptions. The API returns error values, which complicates the interface and usage. It does *not* throw exceptions. Thus every call needs to be checked for errors - the drawbacks of which are known to anyone who has graduated programming 101.
c. As for claims of flexibility, the API uses hard-coded memory management (how about using the STL's allocator abstraction?); it hard-wires many aspects into Windows-specific functionality unnecessarily; it solves a strict subset of possible problems (i.e., file or memory targets) un-generically and inelegantly (how about policy-based design?). This last point, in particular, is a common mistake of novice designers: flexibility means coding for change, not coding all the options you happened to think about while designing. Clearly, most users will be easily able to think of sources and targets which this design can't support.
Again, one of the posts below asks about hooking in for progress notification. Although the request is for a callback, I think that the correct solution is for a virtual function in a derived class (e.g., virtual void on_progress(...)). (GUI frameworks have evolved in this direction: e.g., MFC -> QT.) Despite the "extraordinary flexibility" of the API, neither option is supported, and the second cannot be supported, since HZIP is not a class (try to explain to a Windows HANDLE that it now has a virtual function). As the author says, OOP for such a simple problem is an ugly overkill. Until you consider the alternatives.
In summary, this project either works or doesn't (personally, I think the latter), but it seems to be a classic example of bad design. To put it in smileys: although when you read the author's hype you will be , once you try to use it you will become and ; you will go and , and if you learn from this design your manager willl be and you will be .
Efrat
|
|
|
|
|
I have used this code in several programs I have written and I think it works great, easy to use for a beginner and have a small fingerprint... No problems zipping or unzipping files created with winzip or unzipping files with winzip created with my programs so I can't understand why you have problem with this? From a programmers view I can agree with you it is not the best piece of code I have seen but it does it's job well enough and I have not found any other piece of code doing it better, perhaps you could be so kind and rewrite it showing us mortals how it should be done
|
|
|
|
|
>> I have used this code in several programs I have written and I think it works >> great
Great. So what? the original zlib works great as well (and is portable).
>> No problems zipping or unzipping files created with winzip ... so I can't
>> understand why you have problem with this?
As I posted originally, I'm not sure if it works or not, but lost interest due to the design
>> I have not found any other piece of code doing it better
Well, if you think C++ is more powerful than C (and I do), then here are two:
a. ZipIOS- http://sourceforge.net/projects/zipios/
b. even better, boost IOStream library - http://home.comcast.net/~jturkanis/iostreams/libs/iostreams/doc/home.html
But even without them, compared to this project, zlib provided a reasonable C interface. This project wraps it up in non-portable, non typesafe HANDLEs.
If you think the interface of this project is better than zlib's, good for you. (In this case, here's an interesting project for you: wrap up all of the STL in a Win 32 API - make each class a HANDLE and return error codes. You could have code like this
// Previously, std::map
HANDLE h_map = ::CreateMap(...)
// Previously, insert()
HRESULT h_ret = ::InsertIntoUsedToBeStdMap(h_map, HANDLE(3),
HANDLE('b'));
You might post it in code project: Call it something like "Containers, Algorithms, and Iterator Utils -- Clean, Elegant, Win32"; start off with
something like "Most other APIs wrap things up in classes, which is ugly overkill for such a small problem and always turns out too inflexible. Mine don't. See the code snippets below.")
>> perhaps you could be so kind and rewrite it showing us mortals how it
>> should be done
Why should I? others have done it better than I would (see links above)
>> showing us mortals ... how it should be done
I understand that your opinion is that it's condencending of me to criticize a project combining the most conceited documentation and worst design I've seen. Well, that's your right. Good luck with your "Containers, Algorithms, and Iterator Utils -- Clean, Elegant, Win32" project (what, you think it's a dumb idea? do you consider yourself above other mortals?)
|
|
|
|
|
Not at all I think all criticism is good, it often point out flaws that you have missed or regarded as not important so your comments is most welcome.
From a programmers view I agree with most of your objections the code itself is probably not the most "clean, elegant, simple or portable" I have seen, but to use it in your own program is "clean, elegant and simple" Since I do not consider myself a programmer I just added ten lines of code and had all the functionality I wanted and it worked right away without me thinking of the actual code behind it... I have tried some of the other options you mention but either I am to stupid or lazy or I had to distribute dll files with my exe file (If I write one program I want to distribute one file) I newer got any of them to work as I wanted except for Lucian's piece of code that worked out of the box...
I think it is a waste of time thinking of writing portable code 99% of the things you write is GUI stuff anyway and it is not portable. My only concern is, does the person using my program have DOS, Win95 or WinXP or does he or she even know what a PC is... I think you always must se it from the users perspective, he or she does not give a dam if the code itself is "clean, elegant, simple" as long as the program is...
But since this is a programmers forum I guess you are right and I am wrong
|
|
|
|
|
I agree with some of your points, however:
A. When you see a platform dependent interface for something that should not be, it raises questions on the design, and is a good indication that the design is flawed in other directions. In this case, compression involves memory manipulation (e.g., multiplying, adding, which is part of the language), IO manipulation (which should be handled by a stream interface), memory management (which should be handled by an allocator), and error notification (which should be handled by exceptions). In short, compression does nothing platform independent, and if you see an interface where this is otherwise, than that's a "red flag". In this case, the "red flag" turned out to be something much more serious: an "encapsulation" of archives by Window HANDLEs (bad design in any case).
B. It's great that your ten lines of code work. If it would have been 10k lines of codes, do you think there is no chance you'd eventually try to inflate/deflate an inflated/deflated archive?
C. Compression is a classic example where you want a shared library. It would be great if you could distribute each application in a single executable. Your users, however, might not be happy when their disk chokes on the huge executables containing multiple instances of the same code. If you have a problem distributing dlls, there are many fine free installer applicatoins available (just google).
D. Cross platform development is unnecessary ... until it is, when it becomes a major headache. If there's a platform specific library that doesn't offer any advantages over a platform independent library (this project is a good example), then why increase your headache?
E. You write "99% of the things you write is GUI stuff". This might be true for you, but is not true in general (although I have no statistics for this). In any case, there are excellent cross platform GUI libraries that are much better designed than older platform-dependent ones (e.g., the MFC), which is not a coincidence. If you're interested, you might check out John Torjo's library, QT (not quite free for windows), or wxWindows.
F. All being said, I was mainly put off by the author's deragatory claims regarding other interfaces coupled with his awful design.
|
|
|
|
|
A. If an article is posted under "MFC/C++" with the word "Win32" in it's title I think the author has made it clear that he or she does not thinking platform independent.
B. My 10 lines of code handles both zipping and unzipping of data, so I am not sure what you mean with this comment?
C. That is true, but if you work for a big organization as I do you come across the problem that most (all) users is not allowed to install anything on their pc due to security settings. Distributing the program as a single file is the only way in my case.
D. Since the Windows platform is used by most users and in my case all, why bother with users using Atari, Amiga and other strange computer platforms?
E. Again with the user in mind you should always use the standard Windows GUI and using MFC is a simple way since it libs is already included in windows (see also point C)
F. I disagree with you about this. The code perhaps is not clean and simple but using it is and that is what most of us novices care about is. If you have a look around on other articles you will probably be shocked to se that most code is platform dependent and written using MFC and Win32...
I think platform independent code is something of the past (UNIX) it is too expensive to put the time and effort to write something that works on all operating systems when you are on a budget. Today it is “Quick and Dirty on Time within the Budget” that rules…
|
|
|
|
|
It looks like we're going off on tangents (e.g., the author's opinion on platform indpendence (about which, personally, I don't care), the best way to distribute applications, whether Unix = platform independent (incidentally, no), and GUI libraries).
To rephrase the original post in light of your comments: In my opinion, the project's design sucks:
A. It basically took a C library (the excellent zlib) with a reasonable C interface and didn't encapsulate it into classes. This project can be taught in C++ 101 lesson 2: "OOP doesn't mean placing your (crappy) code in a cpp file"
B. It made it worse by losing typesafety. The compiler won't stop you from writing
HZIP hz = CreateZip("ex1.zip",0);
UnzipItem(hz, ...
(when this doesn't make sense). As I wrote in the previous post, the fact that your 10 lines of code don't do this, doesn't mean that when your project grows, your 10,000 lines of code won't do this.
C. The more projects you have like this, the worse off your code will be. E.g., using the MFC, you can write
HANDLE h_wnd = CreateWindow(...
UnzipItem(h_wnd, ...
(please please please don't answer that you can always compile in strict mode - you can post to some newsgroup about that).
D. It transformed a very portable library into a platform dependent library. Even if you don't care about portability, when you see something as unreasonable as using HANDLEs for compression, it's not surprising to find A-C (that was my main point; otherwise, forget for the moment about portability).
That being said, if you
A. Think that OOD doesn't matter much (and therefore the author's C-like interface is OK)
B. Always write tiny applcations (to the extent that typesafety and extendibility don't matter, since it's only 10 LOC anyway)
C. Think that crappy design always saves on budget
D. Are convinced you'll never write more than Windows GUI wrappers
or, put differently, if you want to answer any criticism with the claim "but it works for my 10 LOC", then, yes, this project is good (excellent, actually)
|
|
|
|
|
I agree with you in most cases, but if you criticize a project posted under "MFC/C++" with the title including "Win32" for being “non platform independent” I think you a bit unfair
I am not a programmer I just sometiemes write small applications for a large corporation using the windows platform. I liked the code for two reasons:
1. I was easy to implement in my program and worked directly out of the box...
2. I do not have to distribute any dll files; my application is one file which does not need a installation program to work, since my users are not allowed to run any installation programs or put unsafe dll files on their pc's due to security reasons this is a must.
If you do not like the code you are entitled to criticise it of cause but it suited me like a hand in a glove and I am not to bothered with it not being portable…
|
|
|
|
|
Efrat:
The code has a few bugs (find them and fix them if you're so smart), but otherwise it works pretty well. You know, contrary to what they taught you in college (or wherever), not every piece of code that you write has to be cross-platform. This is what it is -- a small (two-file) mini-library that allows you to quickly add compression and decompression capabilities to your *Windows* application. Messing with ZLib can quickly turn into a nightmare if you are like most of us mortals: on a tight deadline. I am not saying that ZLib is worse -- it is in many respects better, probably, and like you said, it is cross-platform. But that's beside the point here.
As for the compatibility problems you reported - that can only prove that you haven't invested the 2 minutes necessary to cut and paste the author's example code into your own program and try it out. WinZip and similar apps can read the resulting archives without any problems.
Regards.
Radu
|
|
|
|
|
Your comment (which you've posted twice, incidentally) misses nearly every point I've made. I've covered all of this in a post started by egbro. If you're interested, please read it; if not, fine.
|
|
|
|
|
Efrat:
The code has a few bugs (find them and fix them if you're so smart), but otherwise it works pretty well. You know, contrary to what they taught you in college (or wherever), not every piece of code that you write has to be cross-platform. This is what it is -- a small (two-file) mini-library that allows you to quickly add compression and decompression capabilities to your *Windows* application. Messing with ZLib can quickly turn into a nightmare if you are like most of us mortals: on a tight deadline. I am not saying that ZLib is worse -- it is in many respects better, probably, and like you said, it is cross-platform. But that's beside the point here.
As for the compatibility problems you reported - that only proves that you haven't invested the 2 minutes necessary to cut and paste the author's example code into your own program and try it out. WinZip and similar apps can read the resulting archives without any problems.
Regards.
Radu
|
|
|
|
|
Hi... this looks like a late answer to
an "animated" thread...
Personally I work in C++, I love object
oriented stuff, and when it's possible
(don't know how this model is commonly named)
I like making virtual interfaces and derive
in the implementation file
struct IZip()
{
virtual Add(...)=0
virtual Remove(...)=0
virtual ...
}
extern "C" CreateZip...
Do I think Windows developers are all a bunch of
idiots? No.
You simply must not expose anything of the
underlying structure in an header file,
just functionalities.
Author's way is C-style Abstract data hiding,
my example is COM interface-like....
(well, something is missing )
Every library inside is like the more famous
gzip library or whatever, but every
library should have a wrapper more or less like
the author's one.
What? No exception? No type safety?
Exactly!
And... *not any* dependency in header file!
(The DECLARE_HANDLE stuff looks awful)
I can't image any big OS which could expose
internal structure, internal exceptions and so on... (sure there is)
what about compiling times?
And to conclude...
Are you so secure about exception handling
across DLLs?
(Hint gcc is starting to support them now)
Are you so secure that dynamic loading of
a mangled C++ function name is so easy on
every platform?
Just my opinion, though.
Cheers, Luca.
|
|
|
|
|
Luca Piccarreta wrote:
Hi... this looks like a late answer to
an "animated" thread...
Hi, thanks for your civilized reply
Personally I work in C++, I love object
oriented stuff, and when it's possible
(don't know how this model is commonly named)
Object-handle model?
Do I think Windows developers are all a bunch of
idiots? No.
Neither do I think so
You simply must not expose anything of the
underlying structure in an header file,
just functionalities.
That's not alway possible - C++ requires a class's "layout" for stack-based allocation - hence the "private" and "protected" keywords. In this case, however, I'd be inclined to agree (there is not much of a problem if archiving is done indirectly through a handle/body frame).
Author's way is C-style Abstract data hiding,
my example is COM interface-like....
(well, something is missing )
Every library inside is like the more famous
gzip library or whatever, but every
library should have a wrapper more or less like
the author's one.
What? No exception? No type safety?
Exactly!
I think exceptions and type saftey are orthogonal considerations to the handle/body idiom (you can have any combination, e.g., a typesafe handle/body frame). It seems that many people think that type safety and exceptions are great ideas. Most of the people I've come across who disagree usually write rather small projects.
And... *not any* dependency in header file!
I can't see any problems with dependencies in header files (the "guard" mechanism, i.e., #ifndef FILE_NAME_HPP #define FILE_NAME_HPP etc works fine) ...
(The DECLARE_HANDLE stuff looks awful)
but I think the DECLARE_HANDLE is atrocious in any case.
I can't image any big OS which could expose
internal structure, internal exceptions and so on... (sure there is)
what about compiling times?
And to conclude...
Are you so secure about exception handling
across DLLs?
(Hint gcc is starting to support them now)
Exceptions are a basic part of the language and a fundamental way in which to program. If, as you say DLLs don't support them (I don't have DLL writing experience), then it's far from the first language feature compilers have taken some time to catch up with. This was (and is) true for some template features as well - would you discard templates because some compilers take longer to implement them?
Are you so secure that dynamic loading of
a mangled C++ function name is so easy on
every platform?
I'm not an expert on this, but I'd say there is no problem with name mangling (do you have reason to think that there is?)
Just my opinion, though.
Cheers, Luca.
Thanks for you reply!
|
|
|
|
|
Just to wrap up...
Sotware engineering is great, exceptions are
great, type safety is great but... but?
The risk of software reusage is that your code
is so reusable that everyone tends to build
a framework, not a neat library.
So when you mix up projects, you find loads
of a vectors, and queues, and points, and rects...
And may be root exceptions...
That's way when I think of something neat,
I think of a library with a single header file
and no dependencies.
Believe me, when you work with other people, and
the whole project's code is in the range of
more than 100000 lines of code, namespace
pollution is an issue.
Best of luck.
Luca.
|
|
|
|
|
Hi again,
Luca Piccarreta wrote:
The risk of software reusage is that your code
is so reusable that everyone tends to build
a framework, not a neat library.
... if done improperly. The risk of software in general is flawed code. What you're pointing out is a subtle special case of this.
Luca Piccarreta wrote:
So when you mix up projects, you find loads
of a vectors, and queues, and points, and rects...
And may be root exceptions...
If someone wants to write a new vector class instead of the STL's, then there's a problem. Similarly, if one writes buggy code and you use it, then that's a problem as well. Again, I think what you're pointing out is just a special case.
Luca Piccarreta wrote:
That's way when I think of something neat,
I think of a library with a single header file
and no dependencies.
I think that boost::iostream is a great library (which I bring up because it does everything zip utils does). In particular, it's very neat:
a. It uses data structures from the STL
b. It uses the STL's stream mechanism (instead of zip-util's hard-wired solution). This means that if you want to zip from a socket to quantum encoder stream (which I just made up), everything will work without modifications.
c. It uses the STL's allocator philosophy (instead of zip-utils hardwired version).
d. It uses the STL's strings' char_traits (instead of hardwiring UNICODE like zip-utils).
But, it does not contain a single header, and it contains dependencies. The headers use guards; the template instantiation mechanism doesn't compile anything you don't use. That's very neat in my opinion.
Luca Piccarreta wrote:
Believe me, when you work with other people, and
the whole project's code is in the range of
more than 100000 lines of code,
I'd be glad to believe you, but I've worked in the same settings as well.
Luca Piccarreta wrote:
amespace
pollution is an issue.
Hence the namespace keyword. Returning to boost::iostream, there are three namespaces involved: std, boost, and iostream (nested inside boost). As a result, there is *no* namespace polution (except if you consider the two "std" and "boost" pollution).
Luca Piccarreta wrote:
Best of luck.
To you as well!
+ I've enjoyed reading your post.
|
|
|
|
|
This program was almost exactly what I was looking for, I looked at porting info-zip or zlib to Windows CE 2.11 and it became apparent I'd have to do essentially what the author has already done. Yes, the design is lacking, but we have source, so who cares? The author has done the hard work, and this looks a lot easier to adapt then the base code it's derived from.
Personally, I prefer error codes over exceptions. Generally speaking they are less overhead, and less trouble to catch and support, or ignore if you wish. The last thing I want is my program terminating due to an exception. But regardless, Windows CE 2.11 does not support C++ exception handling, so thank goodness the author didn't use it.
I wish the ability to process data without performing seeks was better developed, but hopefully with some tweaks I can wean those dependencies out of the code. I just want to compress and decompress from a serial port, not a stream per say.
I was able to reduce the memory foot print quite a bit by reducing the size of some of the structures.
So sure, this could be better. A lot better, but compared to what it's derived from, this is a great contribution.
Thanks!
|
|
|
|
|
Efrat, thanks for your comments. (thanks also to the other people who replied in this thread).
I have updated the article to reflect your comments, and to answer them.
|
|
|
|
|
Something wrong with the code. Here is my test.
I've created MSVC6.0 console project. Then I've
added zip.cpp and main.cpp with the following content:
<br />
#include <windows.h><br />
#include <tchar.h><br />
#include <stdio.h><br />
#include "../zip.h"<br />
<br />
int main()<br />
{ <br />
HZIP hz;<br />
ZRESULT zr;<br />
<br />
hz = CreateZip(_T("ex1.zip"),0);<br />
zr = ZipAdd(hz,_T("znsample.txt"), _T("sample.txt"));<br />
<br />
<br />
<br />
zr = ZipAdd(hz,_T("znsample.bmp"), _T("sample.bmp"));<br />
<br />
<br />
<br />
CloseZip(hz);<br />
<br />
return 0;<br />
}<br />
I've compiled the prog and copied vc60.pdb to the sample.txt and to the sample.bmp. When I try to run the prog it add only first file to the ex1.zip file. And ZipAdd gives 0x00000400 error. I've tried some other files and it looks like the ZipAdd fails when I'm trying to add several files having size >~100kb.
Is it possible to fix the problem, please?
Thanks a lot.
|
|
|
|
|
Has anyone figured out this problem yet? I have tried tracing through the zip code, but it is rather hit and miss at the moment...
Lucian are you still out there???
|
|
|
|
|
I had the same problem, and found that the window_size was not reset before 'deflating' the second file,
so I added a line to the ideflate routine.
It seems to work for me
here is the routine:
ZRESULT TZip::ideflate(TZipFileInfo *zfi)<br />
{ if (state==0) state=new TState();<br />
state->err=0;<br />
<br />
state->ds.window_size = 0;<br />
<br />
state->readfunc=sread; state->flush_outbuf=sflush;<br />
state->param=this; state->level=8; state->seekable=iseekable; state->err=NULL;<br />
state->ts.static_dtree[0].dl.len = 0;<br />
bi_init(*state,buf, sizeof(buf), TRUE);
ct_init(*state,&zfi->att);<br />
lm_init(*state,state->level, &zfi->flg);<br />
ulg sz = deflate(*state);<br />
csize=sz;<br />
ZRESULT r=ZR_OK; if (state->err!=NULL) r=ZR_FLATE;<br />
return r;<br />
}
|
|
|
|
|
Looks like it works!
Many thanks, Alvin.
|
|
|
|
|