Introduction
A Zip archive consists of local file headers, local files, and, at the end of the Zip file, the central directory. When a Zip application like WinZip or FilZip opens an archive, it first reads the directory. Only when you actually extract a file does it read the offset from that file's directory entry, and then read and decompress the local file. Anything that is not listed in the central directory will not be listed in the Zip application.
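To make the role of the central directory concrete, here is a minimal sketch that finds the end-of-central-directory record at the tail of an archive and reads how many entries the directory claims to contain. It is not part of SharpZipLib or the demo application; the class and method names are made up for illustration, and the sketch ignores Zip64 and multi-disk archives.

```csharp
using System;
using System.IO;

public static class ZipDirectoryInfo
{
    // Signature "PK\5\6" of the end-of-central-directory record.
    private const uint EndOfCentralDirSignature = 0x06054b50;
    // Size of that record without the optional archive comment.
    private const int EndOfCentralDirMinSize = 22;

    // Returns the entry count stored in the end-of-central-directory
    // record, or -1 if no such record is found.
    public static int ReadDirectoryEntryCount(Stream zip)
    {
        if (zip.Length < EndOfCentralDirMinSize)
        {
            return -1;
        }
        // The record may be followed by a comment of up to 65535 bytes,
        // so search backwards from the last possible position.
        long lowest = Math.Max(0, zip.Length - EndOfCentralDirMinSize - 65535);
        BinaryReader reader = new BinaryReader(zip);
        for (long pos = zip.Length - EndOfCentralDirMinSize; pos >= lowest; pos--)
        {
            zip.Seek(pos, SeekOrigin.Begin);
            if (reader.ReadUInt32() == EndOfCentralDirSignature)
            {
                // The total number of entries sits 10 bytes into the record.
                zip.Seek(pos + 10, SeekOrigin.Begin);
                return reader.ReadUInt16();
            }
        }
        return -1;
    }
}
```

Called with a `FileStream` opened on an archive, this returns the number of files a common Zip application would list, which is not necessarily the number of files actually stored.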
Zip archives can contain lots of single files, each of them having two sizes: compressed and uncompressed. Have you ever calculated the expected archive size from the compressed file sizes and compared it to the size of the Zip file? Probably not, and that is why a few additional bytes, such as some extra compressed text files, are unlikely to be noticed by chance.
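For readers who do want to run that comparison, the arithmetic can be sketched as follows. This is a simplified estimate under stated assumptions: no extra fields, no entry or archive comments, no data descriptors, no Zip64, and file names counted as UTF-8 bytes. The helper name is hypothetical.

```csharp
using System;
using System.Text;

public static class ZipSizeEstimate
{
    // Fixed sizes from the Zip format: local file header, central
    // directory record, and end-of-central-directory record.
    private const int LocalHeaderSize = 30;
    private const int DirectoryRecordSize = 46;
    private const int EndRecordSize = 22;

    // Estimates the archive size implied by the listed entries.
    public static long ExpectedSize(string[] names, long[] compressedSizes)
    {
        if (names.Length != compressedSizes.Length)
        {
            throw new ArgumentException("names and sizes must match");
        }
        long total = EndRecordSize;
        for (int i = 0; i < names.Length; i++)
        {
            int nameBytes = Encoding.UTF8.GetByteCount(names[i]);
            // Local header, file name, and compressed data.
            total += LocalHeaderSize + nameBytes + compressedSizes[i];
            // Matching record in the central directory.
            total += DirectoryRecordSize + nameBytes;
        }
        return total;
    }
}
```

If the real file is noticeably larger than this estimate, the archive holds bytes that the directory does not account for. A small difference proves nothing, since extra fields and comments are legitimate.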
This article uses code from ICSharpCode's SharpZipLib.
Zip files
This Zip file is clean, and every zipped file has an entry in the central directory:
Take a close look at this Zip file. Which Zip application will display the third file? The text document is hidden from the table of contents:
What we need
There are only three steps on the way to partly invisible Zip archives. We need to:
- read and write zip files in general,
- write Zip entries without adding them to the central directory,
- find those Zip entries in the archive.
Step 1: Using SharpZipLib
Step one has already been solved by ICSharpCode: I've added the SharpZipLib project to my solution. It ran perfectly fine. For each Zip entry, it generated a directory entry automatically. Zipping a list of files with SharpZipLib works like this:
private void ZipFiles(string destinationFileName, string password)
{
    FileStream outputFileStream =
        new FileStream(destinationFileName, FileMode.Create);
    ZipOutputStream zipStream =
        new ZipOutputStream(outputFileStream);

    // Encrypt all entries if a password was given.
    bool isCrypted = false;
    if (password != null && password.Length > 0)
    {
        zipStream.Password = password;
        isCrypted = true;
    }

    foreach (ListViewItem viewItem in lvAll.Items)
    {
        FileStream inputStream =
            new FileStream(viewItem.Text, FileMode.Open);
        ICSharpCode.SharpZipLib.Zip.ZipEntry zipEntry =
            new ICSharpCode.SharpZipLib.Zip.ZipEntry(
                Path.GetFileName(viewItem.Text));

        // IsVisible is the property added in step 2:
        // unchecked items stay out of the central directory.
        zipEntry.IsVisible = viewItem.Checked;
        zipEntry.IsCrypted = isCrypted;
        zipEntry.CompressionMethod = CompressionMethod.Deflated;

        zipStream.PutNextEntry(zipEntry);
        CopyStream(inputStream, zipStream);
        inputStream.Close();
        zipStream.CloseEntry();
    }

    // Finish() writes the central directory.
    zipStream.Finish();
    zipStream.Close();
}
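The listing calls a helper named CopyStream that is not shown in the article. A minimal sketch of what such a helper might look like, assuming it does nothing more than copy all remaining bytes from one stream to another (the wrapper class name is made up):

```csharp
using System.IO;

public static class StreamHelper
{
    // Copies everything that is left in 'input' to 'output'
    // using a small fixed-size buffer.
    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, bytesRead);
        }
    }
}
```

The helper leaves both streams open, which matches the way the calling code closes them itself.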
Step 2: Extending ZipOutputStream
That leads us directly to step two: how can we avoid the directory entry? SharpZipLib creates the directory in ZipOutputStream.Finish(). That's where we have to catch the files that should stay hidden. I've added a property to the ZipEntry class, which tells ZipOutputStream to skip it in Finish():
namespace ICSharpCode.SharpZipLib.Zip
{
    [...]
    public class ZipEntry : ICloneable
    {
        [...]
        public bool IsVisible
        {
            get { return isVisible; }
            set { isVisible = value; }
        }
        [...]
    }
}
This property has to be checked in ZipOutputStream.Finish(). Luckily (thanks a lot, ICSharpCode!), the library is GPL software, so we can change everything.
namespace ICSharpCode.SharpZipLib.Zip
{
    [...]
    public class ZipOutputStream : DeflaterOutputStream
    {
        [...]
        public override void Finish()
        {
            if (entries == null) {
                return;
            }
            if (curEntry != null) {
                CloseEntry();
            }
            int numEntries = 0;
            int sizeEntries = 0;
            foreach (ZipEntry entry in entries)
            {
                // Only visible entries get a central directory record.
                if (entry.IsVisible)
                {
                    [...]
                    [...]
                }
            }
        }
    }
    [...]
}
Step 3: Extending ZipFile
With the little changes above, we are able to add files to a Zip archive and hide them from the directory. Now the real challenge begins: we have to find our files again!
SharpZipLib contains the class ZipFile for reading archives and decompressing single files. It relies completely on directory entries: GetInputStream() takes a ZipEntry or its index and reads the local file's content from the given offset. But our invisible files don't have those directory entries. To solve this, I had to add two methods and make a small change to GetInputStream().
Before we can start extracting the invisible files, we have to build a complete list, which contains all zipped files, no matter whether they are in the central directory or not. I decided to use the first file from the directory as the anchor point in the archive, because every archive will contain at least one visible file (otherwise it would be too obvious that something is wrong). We will jump into the Zip archive at the beginning of the first "official" file and walk through the following local files, finding everything that's really in there. The new method HasSuccessor(ZipEntry zipEntry) finds the end of a given Zip entry and checks the stream for whatever comes after it:
public bool HasSuccessor(ZipEntry zipEntry)
{
    if (entries == null)
    {
        throw new InvalidOperationException("ZipFile is closed");
    }

    // The next local header, if any, starts right after
    // the compressed data of the given entry.
    long startPredecessor = CheckLocalHeader(zipEntry);
    long endPredecessor = startPredecessor + zipEntry.CompressedSize;

    Stream stream = new PartialInputStream(baseStream,
        endPredecessor, ZipConstants.LOCHDR);
    int localHeaderStart = ReadLeInt(stream);
    return (localHeaderStart == ZipConstants.LOCSIG);
}
When the above method has recognized a local file header, we have to read it and look for the next header. Most of the headers will already be known from the central directory, but the really interesting ones will be new. Known entries have a property ZipFileIndex, which stores the index of the entry in the directory. If this index is -1, the file is invisible. That means we either have to read the local file header ourselves (invisible files) or can use the existing directory entry (visible files).
public ZipEntry GetAttachedEntry(ZipEntry predecessor)
{
    if (entries == null)
    {
        throw new InvalidOperationException("ZipFile is closed");
    }

    // The successor's local header starts where the
    // predecessor's compressed data ends.
    long startPredecessor = CheckLocalHeader(predecessor);
    long endPredecessor = startPredecessor +
        predecessor.CompressedSize;
    Stream stream = new PartialInputStream(baseStream,
        endPredecessor, ZipConstants.LOCHDR);

    int localHeaderStart = ReadLeInt(stream);
    if (localHeaderStart != ZipConstants.LOCSIG)
    {
        throw new InvalidOperationException("Invalid local file header");
    }

    // Fixed part of the local file header.
    int version = ReadLeShort(stream);
    int flags = ReadLeShort(stream);
    int method = ReadLeShort(stream);
    int dosTime = ReadLeInt(stream);
    int crc = ReadLeInt(stream);
    int compressedSize = ReadLeInt(stream);
    int uncompressedSize = ReadLeInt(stream);
    int nameLength = ReadLeShort(stream);
    int extraLength = ReadLeShort(stream);

    // The file name follows directly after the fixed part.
    long offset = endPredecessor + ZipConstants.LOCHDR;
    Stream fileInfoStream = new PartialInputStream(baseStream,
        offset, nameLength);
    byte[] buffer = new byte[nameLength];
    fileInfoStream.Read(buffer, 0, nameLength);
    string name = ZipConstants.ConvertToString(buffer);

    int indexFromDirectory = FindEntry(name, false);
    ZipEntry zipEntry;
    if (indexFromDirectory < 0)
    {
        // Not in the central directory: build the entry
        // from the local file header.
        zipEntry = new ZipEntry(name, version);
        zipEntry.CompressedSize = compressedSize;
        zipEntry.CompressionMethod = (CompressionMethod)method;
        zipEntry.Crc = crc;
        zipEntry.DosTime = dosTime;
        zipEntry.Flags = flags;
        zipEntry.Offset = (int)endPredecessor;
        zipEntry.Size = uncompressedSize;
        zipEntry.IsVisible = false;
        zipEntry.ZipFileIndex = -1;
    }
    else
    {
        // Known file: reuse the directory entry.
        zipEntry = entries[indexFromDirectory];
        zipEntry.IsVisible = true;
    }
    return zipEntry;
}
Now we have all the methods we need to build a true directory of the Zip file. This code snippet from the demo application's MainForm opens a Zip file, grabs the first item from the directory, and then crawls through the file, ignoring the directory and finding the following files.
private void Open()
{
    lvAll.Items.Clear();
    lvVisible.Items.Clear();

    if (txtZipFileName.Text.Length > 0)
    {
        ZipFile zipFile = new ZipFile(txtZipFileName.Text);

        // The first directory entry is the anchor point.
        ZipEntry zipEntry = zipFile[0];
        AddListViewItem(zipEntry, lvAll);

        // Walk from local header to local header,
        // no matter what the directory says.
        while (zipFile.HasSuccessor(zipEntry))
        {
            zipEntry = zipFile.GetAttachedEntry(zipEntry);
            AddListViewItem(zipEntry, lvAll);
        }
    }
}
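The same walk can be illustrated without SharpZipLib. The following sketch is not the article's code: it starts at offset 0 instead of at the first directory entry, and it assumes an archive without data descriptors, Zip64 records, or leading data such as a self-extractor stub. All type and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class LocalHeader
{
    public string Name;
    public int Method;          // 0 = stored, 8 = deflated
    public long CompressedSize;
    public long Size;
    public long DataOffset;     // where the compressed bytes start
}

public static class LocalHeaderWalker
{
    private const uint LocalSignature = 0x04034b50; // "PK\3\4"
    private const int FixedHeaderSize = 30;

    // Reads the local file header at 'offset'.
    // Returns null if there is no header signature at that position.
    public static LocalHeader ReadAt(Stream zip, long offset)
    {
        if (offset < 0 || offset + FixedHeaderSize > zip.Length)
        {
            return null;
        }
        zip.Seek(offset, SeekOrigin.Begin);
        BinaryReader reader = new BinaryReader(zip);
        if (reader.ReadUInt32() != LocalSignature)
        {
            return null;
        }
        reader.ReadUInt16();                 // version needed to extract
        reader.ReadUInt16();                 // general purpose flags
        LocalHeader header = new LocalHeader();
        header.Method = reader.ReadUInt16();
        reader.ReadUInt32();                 // DOS time and date
        reader.ReadUInt32();                 // CRC-32
        header.CompressedSize = reader.ReadUInt32();
        header.Size = reader.ReadUInt32();
        int nameLength = reader.ReadUInt16();
        int extraLength = reader.ReadUInt16();
        // ASCII is a simplification; real archives may use other encodings.
        header.Name = Encoding.ASCII.GetString(reader.ReadBytes(nameLength));
        header.DataOffset = offset + FixedHeaderSize + nameLength + extraLength;
        return header;
    }

    // Follows the chain of local headers from the start of the stream
    // until something other than a local header is found.
    public static List<LocalHeader> Walk(Stream zip)
    {
        List<LocalHeader> found = new List<LocalHeader>();
        long offset = 0;
        while (true)
        {
            LocalHeader header = ReadAt(zip, offset);
            if (header == null)
            {
                break;
            }
            found.Add(header);
            offset = header.DataOffset + header.CompressedSize;
        }
        return found;
    }
}
```

Any name returned by `Walk` that a normal Zip application does not list is a candidate for a hidden entry.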
Anyway, there is still something missing: we cannot extract the invisible files, even though we have the full ZipEntry. That's because ZipFile.GetInputStream() tries to get the directory index, which is of course not there. But what we actually need to get the file's content is its offset in the archive stream. We filled the Offset property of the ZipEntry while reading it; GetInputStream(ZipEntry entry) just doesn't know that yet. So, let us change that method:
public Stream GetInputStream(ZipEntry entry)
{
    if (entries == null)
    {
        throw new InvalidOperationException("ZipFile is closed");
    }

    // Use the entry's own offset instead of looking it up
    // in the central directory.
    long start = CheckLocalHeader(entry);
    CompressionMethod method = entry.CompressionMethod;
    Stream istr = new PartialInputStream(baseStream,
        start, entry.CompressedSize);

    if (entry.IsCrypted)
    {
        istr = CreateAndInitDecryptionStream(istr, entry);
        if (istr == null)
        {
            throw new ZipException("Unable to decrypt this entry");
        }
    }

    switch (method)
    {
        case CompressionMethod.Stored:
            return istr;
        case CompressionMethod.Deflated:
            return new InflaterInputStream(istr, new Inflater(true));
        default:
            throw new ZipException("Unsupported" +
                " compression method " + method);
    }
}
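To show that an offset, a compressed size, and a compression method really are all that is needed, here is a small stand-alone sketch that uses the .NET class DeflateStream instead of SharpZipLib's InflaterInputStream. It handles only unencrypted entries, and the class name and parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class RawEntryExtractor
{
    // Reads 'compressedSize' bytes starting at 'dataOffset' and returns
    // the uncompressed content. Method 0 is "stored", 8 is "deflated".
    public static byte[] Extract(Stream zip, long dataOffset,
        int compressedSize, int method)
    {
        zip.Seek(dataOffset, SeekOrigin.Begin);
        byte[] compressed = new byte[compressedSize];
        int total = 0;
        while (total < compressedSize)
        {
            int got = zip.Read(compressed, total, compressedSize - total);
            if (got == 0)
            {
                throw new EndOfStreamException("Archive is truncated");
            }
            total += got;
        }

        if (method == 0)
        {
            return compressed;
        }
        if (method != 8)
        {
            throw new NotSupportedException(
                "Unsupported compression method " + method);
        }

        // Zip entries contain raw deflate data, which DeflateStream reads.
        using (MemoryStream source = new MemoryStream(compressed))
        using (DeflateStream inflater =
            new DeflateStream(source, CompressionMode.Decompress))
        using (MemoryStream result = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int count;
            while ((count = inflater.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, count);
            }
            return result.ToArray();
        }
    }
}
```

Nothing in this routine consults the central directory, which is why a hidden entry can be extracted as soon as its local header has been located.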
We're done! Now we are able to unzip all files, including our hidden items. In the Open() example, we built a list of zipped files. Those files can be unzipped like this:
private void UnZipFiles(string destinationDirectoryName)
{
    ZipFile zipFile = new ZipFile(txtZipFileName.Text);
    if (chkDecrypt.Checked)
    {
        zipFile.Password = txtOpenPassword.Text;
    }

    foreach (ListViewItem viewItem in lvAll.SelectedItems)
    {
        // Each list item carries its ZipEntry, visible or hidden.
        ICSharpCode.SharpZipLib.Zip.ZipEntry zipEntry =
            viewItem.Tag as ICSharpCode.SharpZipLib.Zip.ZipEntry;
        if (zipEntry != null)
        {
            Stream inputStream = zipFile.GetInputStream(zipEntry);
            FileStream fileStream = new FileStream(
                Path.Combine(destinationDirectoryName, zipEntry.Name),
                FileMode.Create);
            CopyStream(inputStream, fileStream);
            fileStream.Close();
            inputStream.Close();
        }
    }
    zipFile.Close();
}
As the above example shows, there's no difference anymore between visible and hidden Zip entries. Our adapted SharpZipLib treats both variations just fine. If the ZipEntry.IsVisible property is set to false before zipping the file, it will be hidden from the central directory, but applications that use this adapted version of SharpZipLib and HasSuccessor/GetAttachedEntry instead of the directory indexer will still be able to find and unzip it.
The demo application
The demo application can create new Zip archives or edit existing files. You can add/remove visible and invisible files, and add encryption to an existing archive. On this screenshot, an invisible file is being inserted into an existing archive with two normal files. Visible files are also listed in the right box, which is just a preview of how a common Zip application is going to display the content.
The check boxes in the left list indicate whether or not a file is in the central directory. To hide a file from common Zip applications, simply un-check it. To remove a file from the archive, mark it in the left list and press [delete]. You can extract any files, hidden or not, by marking them and clicking "Extract selected files".
"Save changes" asks for a new file name. When the user has selected a destination, the files from the Zip file and the newly added files get deflated and stored in a new archive. The new archive is opened and can be edited or encrypted.
On this screenshot, the new archive with one hidden and two visible files has just been saved and is about to be encrypted with the password "hello".
Caution: If possible, you should avoid encryption, or at least edit and save the archive unencrypted and add encryption as the very last step. Sometimes it works; sometimes you lose all hidden files except the first one. :-(
Usually, the first encryption works well, but re-saving an already encrypted archive makes the local file headers untraceable. Especially when there is more than one invisible file in the archive, only try encryption once everything else has been saved properly.