Steganography 16 - Hiding additional files in a ZIP archive

Corinna John

Apr 16, 2006

CPOL

How to zip and unzip files, avoiding the central directory.

Download source files - 191 Kb

Introduction

A Zip archive consists of local file headers, each followed by the file's data, and, at the end of the archive, the central directory. When a Zip application like WinZip or FilZip opens an archive, it first reads the directory. Only when you actually extract a file does it read the offset from that file's directory entry and then read and decompress the local file. Anything that is not listed in the central directory will not be shown by the Zip application.
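To make the role of the central directory concrete, here is a minimal sketch of what a Zip application does when it opens an archive: it finds the end-of-central-directory record at the end of the file and reads only the entries listed there. The class and method names are hypothetical and not part of SharpZipLib. The sketch assumes a small, single-disk archive without Zip64 extensions, ASCII file names, and a little-endian machine (because it uses BitConverter).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of SharpZipLib.
public static class CentralDirectoryReader
{
    const uint EndOfCentralDirSignature = 0x06054b50;
    const uint CentralHeaderSignature = 0x02014b50;

    // Returns the names listed in the central directory, in directory order.
    public static List<string> ListNames(byte[] archive)
    {
        // The end-of-central-directory record is at least 22 bytes long and
        // sits at the end of the file, so search backwards for its signature.
        int eocd = -1;
        for (int i = archive.Length - 22; i >= 0; i--)
        {
            if (BitConverter.ToUInt32(archive, i) == EndOfCentralDirSignature)
            {
                eocd = i;
                break;
            }
        }
        if (eocd < 0)
        {
            throw new InvalidDataException("No end-of-central-directory record found");
        }

        int entryCount = BitConverter.ToUInt16(archive, eocd + 10);
        int position = (int)BitConverter.ToUInt32(archive, eocd + 16);

        var names = new List<string>();
        for (int n = 0; n < entryCount; n++)
        {
            if (BitConverter.ToUInt32(archive, position) != CentralHeaderSignature)
            {
                throw new InvalidDataException("Invalid central directory header");
            }
            int nameLength = BitConverter.ToUInt16(archive, position + 28);
            int extraLength = BitConverter.ToUInt16(archive, position + 30);
            int commentLength = BitConverter.ToUInt16(archive, position + 32);

            // Assumption: names are plain ASCII.
            names.Add(Encoding.ASCII.GetString(archive, position + 46, nameLength));

            // A directory header is 46 fixed bytes plus three variable parts.
            position += 46 + nameLength + extraLength + commentLength;
        }
        return names;
    }
}
```

Nothing in this loop ever looks at the local file headers, which is why a local file without a directory entry stays unnoticed.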

Zip archives can contain many files, each with two sizes: compressed and uncompressed. Have you ever calculated the expected archive size from the compressed file sizes and compared it to the size of the Zip file? No? Well, that's why a few additional bytes - additional compressed text files - won't be found by chance.
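For the curious, the comparison could be sketched roughly as follows. This uses the ZipArchive class from System.IO.Compression rather than SharpZipLib, and the class and method names are hypothetical. The result always includes the regular headers and the directory itself, so only an unexpectedly large value hints at unlisted content.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of SharpZipLib.
public static class ArchiveSizeCheck
{
    // Returns the number of bytes in the archive that are not compressed data
    // of a listed entry: headers, the directory, and anything that is hidden.
    public static long BytesNotListed(Stream archive)
    {
        long listed = 0;
        using (var zip = new ZipArchive(archive, ZipArchiveMode.Read, true))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                listed += entry.CompressedLength;
            }
        }
        // The stream is still open because leaveOpen was set to true.
        return archive.Length - listed;
    }
}
```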

This article uses code from ICSharpCode's SharpZipLib.

Zip files

This Zip file is clean, and every zipped file has an entry in the central directory:

clean zip file

Take a close look at this Zip file. Which Zip application will display the third file? The text document is hidden from the table of contents:

hacked zip file

What we need

There are only three steps on the way to partly invisible Zip archives. We need to:

read and write zip files in general,
write Zip entries without adding them to the central directory,
find those Zip entries in the archive.

Step 1: Using SharpZipLib

Step one has already been solved by ICSharpCode: I've added the SharpZipLib project to my solution. It ran perfectly fine. For each Zip entry, it generated a directory entry automatically. Zipping a list of files with SharpZipLib works like this:

private void ZipFiles(string destinationFileName, string password)
{
    FileStream outputFileStream = 
       new FileStream(destinationFileName, 
       FileMode.Create);
    ZipOutputStream zipStream = 
       new ZipOutputStream(outputFileStream);

    bool isCrypted = false;
    if (password != null && password.Length > 0)
    { //encrypt the zip file, if password is given
      zipStream.Password = password;
      isCrypted = true;
    }

    foreach(ListViewItem viewItem in lvAll.Items)
    {
      // each list item holds the path of one file to add
      FileStream inputStream = new FileStream(viewItem.Text, FileMode.Open);
      ICSharpCode.SharpZipLib.Zip.ZipEntry zipEntry =
                 new ICSharpCode.SharpZipLib.Zip.ZipEntry(
                 Path.GetFileName(viewItem.Text));

      // unchecked items are kept out of the central directory
      zipEntry.IsVisible = viewItem.Checked;
      zipEntry.IsCrypted = isCrypted;
      zipEntry.CompressionMethod = CompressionMethod.Deflated;
      zipStream.PutNextEntry(zipEntry);
      CopyStream(inputStream, zipStream);
      inputStream.Close();
      zipStream.CloseEntry();
    }

    zipStream.Finish();
    zipStream.Close();
}
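The CopyStream helper called above is not shown in this article. A minimal sketch of what such a helper might look like is given below; the class name and the buffer size are assumptions.

```csharp
using System.IO;

// Hypothetical helper; the demo application's own CopyStream may differ.
public static class StreamHelper
{
    // Copies everything from the current position of input to output.
    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[4096];
        int bytesRead;
        // Read returns 0 when the end of the stream has been reached.
        while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, bytesRead);
        }
    }
}
```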

Step 2: Extending ZipOutputStream

That leads us directly to step two: how can we avoid the directory entry? SharpZipLib creates the directory in ZipOutputStream.Finish(). That's where we have to catch the files that should stay hidden. I've added a property to the ZipEntry class, which tells ZipOutputStream to skip it in Finish():

namespace ICSharpCode.SharpZipLib.Zip
{
    [...]
    public class ZipEntry : ICloneable
    {
         [...]

         /// <summary>
         /// Gets or sets visibility in table of contents
         /// </summary>
         /// <remarks>
         /// Added by Corinna John
         /// </remarks>
         public bool IsVisible
         {
                get { return isVisible; }
                set { isVisible = value; }
         }

         [...]
    }
}

This property has to be checked in ZipOutputStream.Finish(). Luckily (thanks a lot, ICSharpCode!), the library is GPL software, so we can change everything.

namespace ICSharpCode.SharpZipLib.Zip
{
    [...]
    public class ZipOutputStream : DeflaterOutputStream
    {
        [...]
        public override void Finish()
        {
            if (entries == null)  {
                return;
            }

            if (curEntry != null) {
                CloseEntry();
            }

            int numEntries = 0;
            int sizeEntries = 0;

            foreach (ZipEntry entry in entries)
            {
                if (entry.IsVisible) //CJ: List only visible entries
                {
                    [...]
                    // write the directory item for the zip entry
                    [...]
                }
            }
        }
    }
    [...]
}
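The same idea can be shown without SharpZipLib. The following self-contained sketch writes every file as a stored (uncompressed) local entry, but adds a directory entry only for the visible ones. It is an illustration of the principle, not the modified library code. All names are hypothetical, and it assumes ASCII file names and small files.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical illustration, not the modified SharpZipLib code.
public static class HiddenEntryWriter
{
    // Standard CRC-32 (reflected polynomial 0xEDB88320) as used by Zip.
    static uint Crc32(byte[] data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int bit = 0; bit < 8; bit++)
            {
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
            }
        }
        return ~crc;
    }

    public static byte[] Write(IList<(string Name, byte[] Data, bool Visible)> files)
    {
        var output = new MemoryStream();
        var writer = new BinaryWriter(output);      // writes little-endian
        var directory = new MemoryStream();
        var dirWriter = new BinaryWriter(directory);
        ushort visibleCount = 0;

        foreach (var file in files)
        {
            byte[] name = Encoding.ASCII.GetBytes(file.Name);
            uint crc = Crc32(file.Data);
            uint size = (uint)file.Data.Length;
            uint offset = (uint)output.Position;

            // local file header: 30 fixed bytes, then name and data
            writer.Write(0x04034b50u);              // signature
            writer.Write((ushort)10);               // version needed
            writer.Write((ushort)0);                // flags
            writer.Write((ushort)0);                // method 0 = stored
            writer.Write((ushort)0);                // DOS time
            writer.Write((ushort)0x21);             // DOS date 1980-01-01
            writer.Write(crc);
            writer.Write(size);                     // compressed size
            writer.Write(size);                     // uncompressed size
            writer.Write((ushort)name.Length);
            writer.Write((ushort)0);                // extra field length
            writer.Write(name);
            writer.Write(file.Data);

            if (!file.Visible) continue;            // hidden: no directory entry

            visibleCount++;
            dirWriter.Write(0x02014b50u);           // signature
            dirWriter.Write((ushort)10);            // version made by
            dirWriter.Write((ushort)10);            // version needed
            dirWriter.Write((ushort)0);             // flags
            dirWriter.Write((ushort)0);             // method
            dirWriter.Write((ushort)0);             // DOS time
            dirWriter.Write((ushort)0x21);          // DOS date
            dirWriter.Write(crc);
            dirWriter.Write(size);
            dirWriter.Write(size);
            dirWriter.Write((ushort)name.Length);
            dirWriter.Write((ushort)0);             // extra field length
            dirWriter.Write((ushort)0);             // comment length
            dirWriter.Write((ushort)0);             // disk number
            dirWriter.Write((ushort)0);             // internal attributes
            dirWriter.Write(0u);                    // external attributes
            dirWriter.Write(offset);                // offset of local header
            dirWriter.Write(name);
        }

        uint dirOffset = (uint)output.Position;
        byte[] dirBytes = directory.ToArray();
        writer.Write(dirBytes);

        // end-of-central-directory record
        writer.Write(0x06054b50u);
        writer.Write((ushort)0);                    // this disk
        writer.Write((ushort)0);                    // disk with directory
        writer.Write(visibleCount);                 // entries on this disk
        writer.Write(visibleCount);                 // entries in total
        writer.Write((uint)dirBytes.Length);
        writer.Write(dirOffset);
        writer.Write((ushort)0);                    // archive comment length
        writer.Flush();
        return output.ToArray();
    }
}
```

The hidden file's header and data are physically present between the other local files; only the directory and its entry count pretend otherwise.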

Step 3: Extending ZipFile

With the little changes above, we are able to add files to a Zip archive and hide them from the directory. Now the real challenge begins: we have to find our files again!

SharpZipLib contains the class ZipFile for reading archives and decompressing single files. It completely relies on directory entries: GetInputStream() takes a ZipEntry or its index and reads the local file's content from the given offset. But our invisible files don't have those directory entries. To solve this, I had to add two methods and a small change to GetInputStream().

Before we can start extracting the invisible files, we have to build a complete list, which contains all zipped files, no matter whether they are in the central directory or not. I decided to use the first file from the directory as the anchor point in the archive, because every archive will contain at least one visible file (otherwise it would be too obvious that something is wrong). We will jump into the Zip archive at the beginning of the first "official" file and walk through the following local files, finding everything that's really in there. The new method HasSuccessor(ZipEntry zipEntry) finds the end of a given Zip entry and checks the stream for whatever comes after it:

 /// <summary>
 /// Checks the file stream after the given zip entry for another one.
 /// </summary>
 /// <param name="zipEntry">The zip entry to look behind.</param>
 /// <returns>true: there are more entries after
 /// this one. false: this is the last entry.</returns>
 public bool HasSuccessor(ZipEntry zipEntry)
 {
       if (entries == null)
       {
           throw new InvalidOperationException("ZipFile is closed");
       }

       //beginning of the preceding zip entry
       long startPredecessor = CheckLocalHeader(zipEntry);

       //end of the preceding zip entry
       long endPredecessor = startPredecessor + zipEntry.CompressedSize;

       //get a stream for whatever follows the zip entry
       Stream stream = new PartialInputStream(baseStream, 
                       endPredecessor, ZipConstants.LOCHDR);

       //read what may be a local file header
       int localHeaderStart = ReadLeInt(stream);

       //is it the beginning of another local file?
       return (localHeaderStart == ZipConstants.LOCSIG);
 }
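Stripped of the SharpZipLib internals, the walk from one local file to the next looks roughly like the sketch below. It works on a byte array, starts at a known local header, and continues as long as the next four bytes are a local file header signature. The names are hypothetical. The sketch assumes that the sizes are stored in the local headers (no data descriptors, no Zip64), that file names are ASCII, and that the machine is little-endian.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of SharpZipLib.
public static class LocalHeaderWalker
{
    const uint LocalHeaderSignature = 0x04034b50;

    // Walks consecutive local file headers starting at 'offset' and returns
    // the name of every entry found, whether the directory lists it or not.
    public static List<string> ListAllNames(byte[] archive, int offset)
    {
        var names = new List<string>();
        while (offset + 30 <= archive.Length
               && BitConverter.ToUInt32(archive, offset) == LocalHeaderSignature)
        {
            int compressedSize = (int)BitConverter.ToUInt32(archive, offset + 18);
            int nameLength = BitConverter.ToUInt16(archive, offset + 26);
            int extraLength = BitConverter.ToUInt16(archive, offset + 28);

            names.Add(Encoding.ASCII.GetString(archive, offset + 30, nameLength));

            // Skip the 30-byte header, the name, the extra field and the data.
            offset += 30 + nameLength + extraLength + compressedSize;
        }
        return names;
    }
}
```

The loop stops at the first signature that is not a local file header, which in a regular archive is the beginning of the central directory.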

When the above method has recognized a local file header, we have to read it and look for the next header. Most of the headers will already be known from the central directory, but the really interesting ones will be new. Known entries have a property ZipFileIndex, which stores the index of the entry in the directory. If this index is -1, the file is invisible. That means we have to read the local file header for files that are not yet known (invisible files) and reuse the existing directory entry for all others (visible files).

/// <summary>
/// Reads the local file header that follows the given zip entry and
/// returns a ZipEntry for it. If the file is listed in the central
/// directory, the directory's entry is returned instead.
/// </summary>
/// <param name="predecessor">The preceding zip entry.</param>
/// <returns>
/// The zip entry that follows the predecessor.
/// </returns>
/// <exception cref="InvalidOperationException">
/// The ZipFile has already been closed, or no valid
/// local file header follows the predecessor
/// </exception>
public ZipEntry GetAttachedEntry(ZipEntry predecessor)
{
     if (entries == null)
     {
          throw new InvalidOperationException("ZipFile is closed");
     }

     //beginning of the preceding zip entry
     long startPredecessor = CheckLocalHeader(predecessor);

     //end of the preceding zip entry
     long endPredecessor = startPredecessor + 
                           predecessor.CompressedSize;

     //get a stream for the undocumented local file
     Stream stream = new PartialInputStream(baseStream, 
                     endPredecessor, ZipConstants.LOCHDR);

     //read local file header

     int localHeaderStart = ReadLeInt(stream);
     if (localHeaderStart != ZipConstants.LOCSIG)
     {
          throw new InvalidOperationException("Invalid local file header");
     }

     int version = ReadLeShort(stream);
     int flags = ReadLeShort(stream);
     int method = ReadLeShort(stream);
     int dosTime = ReadLeInt(stream);
     int crc = ReadLeInt(stream);
     int compressedSize = ReadLeInt(stream);
     int uncompressedSize = ReadLeInt(stream);
     int nameLength = ReadLeShort(stream);
     int extraLength = ReadLeShort(stream);

     //get a stream only for file name
     long offset = endPredecessor + ZipConstants.LOCHDR;
     Stream fileInfoStream = new PartialInputStream(baseStream, 
                                 offset, nameLength);

     byte[] buffer = new byte[nameLength];
     fileInfoStream.Read(buffer, 0, nameLength);
     string name = ZipConstants.ConvertToString(buffer);

     //is the file already known from the central directory?
     int indexFromDirectory = FindEntry(name, false);
     ZipEntry zipEntry;
     if (indexFromDirectory < 0)
     {
         //invisible file: build the entry from the local file header
         zipEntry = new ZipEntry(name, version);
         zipEntry.CompressedSize = compressedSize;
         zipEntry.CompressionMethod = (CompressionMethod)method;
         zipEntry.Crc = crc;
         zipEntry.DosTime = dosTime;
         zipEntry.Flags = flags;
         zipEntry.IsVisible = false;
         zipEntry.Offset = (int)endPredecessor;
         zipEntry.Size = uncompressedSize;
         zipEntry.ZipFileIndex = -1;
     }
     else
     {
         //visible file: use the entry from the directory
         zipEntry = entries[indexFromDirectory];
         zipEntry.IsVisible = true;
     }

     return zipEntry;
}

Now we have all the methods we need to build a true directory of the Zip file. This code snippet from the demo application's MainForm opens a Zip file, grabs the first item from the directory, and then crawls through the file, ignoring the directory and finding all the files that follow.

private void Open()
{
   lvAll.Items.Clear();
   lvVisible.Items.Clear();

   if (txtZipFileName.Text.Length > 0)
   {
       // open zip archive
       ZipFile zipFile = new ZipFile(txtZipFileName.Text);

       // list all files, starting with the first entry of the directory
       ZipEntry zipEntry = zipFile[0];
       AddListViewItem(zipEntry, lvAll);
       while (zipFile.HasSuccessor(zipEntry))
       {
           // follow the chain of local files, listed or not
           zipEntry = zipFile.GetAttachedEntry(zipEntry);
           AddListViewItem(zipEntry, lvAll);
       }
   }
}

Still, something is missing: we cannot extract the invisible files, even though we have the full ZipEntry. That's because ZipFile.GetInputStream() tries to look up the directory index, which of course does not exist. What we actually need to get the file's content is its offset in the archive stream. We filled the Offset property of the ZipEntry while reading it; GetInputStream(ZipEntry entry) just doesn't use it yet. So, let us change that method:

public Stream GetInputStream(ZipEntry entry)
{
    if (entries == null) {
        throw new InvalidOperationException("ZipFile has closed");
    }

    /*
     * Original method
     * Replaced by Corinna John to support "invisible" entries
     *
    int index = entry.ZipFileIndex;
    if (index < 0 || index >= entries.Length || 
        entries[index].Name != entry.Name) {
        index = FindEntry(entry.Name, true);
        if (index < 0) {
                throw new IndexOutOfRangeException();
        }
    }
    return GetInputStream(index);
    */

    // Do not search for a ZipFileIndex. I don't know why it was originally
    // implemented that way, but we know the data offset and indices are not
    // necessary. There are no indices for the invisible files.
    long start = CheckLocalHeader(entry);

    // Copied from GetInputStream(int entryIndex)

    CompressionMethod method = entry.CompressionMethod;
    Stream istr = new PartialInputStream(baseStream, 
                  start, entry.CompressedSize);

    if (entry.IsCrypted == true)
    {
            istr = CreateAndInitDecryptionStream(istr, entry);
            if (istr == null)
            {
                throw new ZipException("Unable to decrypt this entry");
            }
    }

    switch (method)
    {
            case CompressionMethod.Stored:
                return istr;
            case CompressionMethod.Deflated:
                return new InflaterInputStream(istr, new Inflater(true));
            default:
                throw new ZipException("Unsupported" + 
                      " compression method " + method);
    }
}
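The point that an offset is all we need can also be demonstrated outside of SharpZipLib. The sketch below reads one local file from a byte array, given only the offset of its local file header, and uses DeflateStream from System.IO.Compression for inflating. The names are hypothetical. It assumes that the sizes are stored in the local header, that the entry is not encrypted, and that the machine is little-endian.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of SharpZipLib.
public static class LocalEntryReader
{
    // Returns the uncompressed content of the local file whose header
    // starts at 'offset'. No directory entry is consulted.
    public static byte[] ReadEntryAt(byte[] archive, int offset)
    {
        if (BitConverter.ToUInt32(archive, offset) != 0x04034b50)
        {
            throw new InvalidDataException("No local file header at this offset");
        }

        int method = BitConverter.ToUInt16(archive, offset + 8);
        int compressedSize = (int)BitConverter.ToUInt32(archive, offset + 18);
        int nameLength = BitConverter.ToUInt16(archive, offset + 26);
        int extraLength = BitConverter.ToUInt16(archive, offset + 28);

        // The data starts right after the header, the name and the extra field.
        int dataStart = offset + 30 + nameLength + extraLength;

        var compressed = new MemoryStream(archive, dataStart, compressedSize);
        var result = new MemoryStream();
        switch (method)
        {
            case 0: // stored
                compressed.CopyTo(result);
                break;
            case 8: // deflated (raw deflate data without a zlib header)
                using (var inflater = new DeflateStream(compressed, CompressionMode.Decompress))
                {
                    inflater.CopyTo(result);
                }
                break;
            default:
                throw new NotSupportedException("Unsupported compression method " + method);
        }
        return result.ToArray();
    }
}
```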

We're done! Now we are able to unzip all files, including our hidden items. In the Open() example, we built a list of zipped files. Those files can be unzipped like this:

private void UnZipFiles(string destinationDirectoryName)
{
   ZipFile zipFile = new ZipFile(txtZipFileName.Text);

   if (chkDecrypt.Checked)
   { //decrypt zip file
     zipFile.Password = txtOpenPassword.Text;
   }

   foreach (ListViewItem viewItem in lvAll.SelectedItems)
   {
     ICSharpCode.SharpZipLib.Zip.ZipEntry zipEntry = 
         viewItem.Tag as ICSharpCode.SharpZipLib.Zip.ZipEntry;
     if (zipEntry != null)
     {
         Stream inputStream = zipFile.GetInputStream(zipEntry);
         FileStream fileStream = new FileStream(
                 Path.Combine(destinationDirectoryName, zipEntry.Name),
                 FileMode.Create);
         CopyStream(inputStream, fileStream);
         fileStream.Close();
         inputStream.Close();
     }
   }
   zipFile.Close();
}

As the above example shows, there is no longer any difference between visible and hidden Zip entries; the adapted SharpZipLib handles both. If the ZipEntry.IsVisible property is set to false before a file is zipped, the file will be hidden from the central directory, but applications that use this adapted version of SharpZipLib, and HasSuccessor/GetAttachedEntry instead of the directory indexer, will still be able to find and unzip it.

The demo application

The demo application can create new Zip archives or edit existing files. You can add/remove visible and invisible files, and add encryption to an existing archive. On this screenshot, an invisible file is being inserted into an existing archive with two normal files. Visible files are also listed in the right box, which is just a preview of how a common Zip application is going to display the content.

The check boxes in the left list indicate whether or not a file is in the central directory. To hide a file from common Zip applications, simply un-check it. To remove a file from the archive, mark it in the left list and press [delete]. You can extract any files, hidden or not, by marking them and clicking "Extract selected files".

add a file

"Save changes" asks for a new file name. When the user has selected a destination, the files from the Zip file and the newly added files get deflated and stored in a new archive. The new archive is opened and can be edited or encrypted.

On this screenshot, the new archive with one hidden and two visible files has just been saved and is about to be encrypted with the password "hello".

encrypt an archive

Caution: If possible, avoid encryption, or at least edit and save an unencrypted archive and add encryption as the very last step. Sometimes it works, sometimes you lose all hidden files except the first one. :-(

Usually, the first encryption works well, but re-saving the already encrypted archive makes the local file headers untraceable. Especially when there is more than one invisible file in the archive, only try encryption once everything else has been saved properly.