HPI File Format

Document version 1.4

I figured some of this stuff out by disassembling WriteHPI.  All hail
Eric DeZert.

I'd also like to thank Jesse Michael for his clear and concise explanation
of the compression scheme (which I shamelessly incorporated into this document),
and Barry Pedersen for helpful comments and miscellaneous useful info.

ZLib compression and decompression by Jean-loup Gailly (compression) and 
Mark Adler (decompression). 
For more info, see the zlib Home Page at http://www.cdrom.com/pub/infozip/zlib/

The rest I figured out on my own by looking at the data and using
a bit of common sense.

Warning: This is intended for use by people that already know what
they're doing.

I'm a C programmer, so I'm doing things in C notation here, but
I'll try to explain it so that those of you that don't speak C
will be able to understand.  If you don't understand, write me
at joed@cws.org and I'll try to clear things up.

I'm also a big believer in examples, so I'll be walking you through
an HPI file as I explain.

The first part of the file is a header.  Except for the copyright
statement at the end, this is the only unencrypted portion of the 
file.  The header looks like this:

typedef struct _HPIHeader {
  long HPIMarker;		  /* 'HAPI' */
  long SaveMarker;        /* 'BANK' if saved gamed */
  long DirectorySize;     /* The size of the directory */
  long HeaderKey;         /* Decrypt key */
  long Start;             /* File offset of directory */
} HPIHeader;

Here's a hex dump of a sample header:
00000000  48 41 50 49 00 00 01 00 24 02 00 00 7D 00 00 00   HAPI....$...}...
00000010  14 00 00 00 

Taken individually:

HPIMarker

This is just a marker.  The value is always HAPI in ASCII.  In 
hex, it's 0x49504148.

SaveMarker

If it's a saved game, the value is BANK in ASCII, or 0x4B4E4142 in
hex.  Save game files are something of a special case, and I haven't
done much to try to decode those.  The value in normal HPI files is
0x00010000, but I have no idea if this means anything.  I just check
for BANK, and ignore it otherwise.

DirectorySize

This is the size of the directory contained in the HPI file.  Here,
the value is 0x224, or 548 bytes.  This includes the size of the
header.

HeaderKey

The decryption key.  Its value is 0x0000007D.  More on this later.

Start

The offset in the file where the directory starts.  I have yet to
see one that didn't start immediately after the header at offset
0x14, but you never know.

Now we know enough to read the directory.  But first, a small
implementation note.  Instead of allocating a buffer of DirectorySize
bytes and then reading the directory into it, allocate a buffer of
DirectorySize bytes, and read DirectorySize-Start bytes into the buffer
at position Start.  This is because the directory contains pointers,
but the pointers are relative to the start of the file, not the start
of the directory.  By moving the directory down Start bytes into the
buffer, we simplify the program.  If we didn't do this, we'd have to
subtract Start from every offset, and that would be a royal pain.

Now some of you are undoubtedly looking at an HPI file with a hex dump
program, and saying "That sure doesn't look like a directory to me!"
Well, you're right.  That's because it's encrypted.

To decrypt it, first calculate the decryption key from the HeaderKey
variable:

Key = NOT ((HeaderKey * 4) OR (HeaderKey >> 6))

Doing this on the 0x0000007D, you get FFFFFE0A (I think).

Here is the C code for the decryption routine.  Since everything in the
file is encrypted, I found it easier to combine the read and decryption
functions into one.

int ReadAndDecrypt(int fpos, char *buff, int buffsize)
/*
Read "buffsize" bytes from the HPI file at position "fpos"
into "buff", and then decrypt it.
*/
{
  int count;
  int tkey;
  int result;

  /* first, position the file */	
  fseek(HPIFile, fpos, SEEK_SET);

  /* read the data into buff */
  result = fread(buff, 1, buffsize, HPIFile);

  /* for each character in buff... */
  for (count = 0; count < buffsize; count++) {

    /* compute tkey = (fpos + count) XOR Key */
    tkey = (fpos + count) ^ Key;

	/* and then decode the character:  
	   buff[count] = tkey XOR (NOT buff[count]) */
    buff[count] = tkey ^ ~buff[count];
  }

  /* result is the number of bytes actually read in, 
     and should be equal to buffsize */
  return result;
}

Note that the position of the byte in the file (fpos+count) is used to
decrypt.

And here is a decoded directory to make it easy to follow.  Note that
I loaded the actual directory starting at offset 0x14, so that the
first 0x14 bytes are all zeros.  See the implementation
note above.

All numbers here are 32-bit integers, ie "longs".  

00000000  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
00000010  00 00 00 00 08 00 00 00 1C 00 00 00 64 00 00 00   ............d...
00000020  6A 00 00 00 01 97 00 00 00 A0 00 00 00 01 C6 00   j...............
00000030  00 00 CF 00 00 00 01 13 01 00 00 1D 01 00 00 01   ................
00000040  66 01 00 00 6E 01 00 00 01 94 01 00 00 9D 01 00   f...n...........
00000050  00 01 C3 01 00 00 C9 01 00 00 01 EF 01 00 00 F7   ................
00000060  01 00 00 01 61 6E 69 6D 73 00 01 00 00 00 72 00   ....anims.....r.
00000070  00 00 7B 00 00 00 8E 00 00 00 00 61 72 6D 66 6C   ..{........armfl
00000080  61 6B 5F 67 61 64 67 65 74 2E 67 61 66 00 24 02   ak_gadget.gaf.$.
00000090  00 00 D8 2D 00 00 01 64 6F 77 6E 6C 6F 61 64 00   ...-...download.
000000A0  01 00 00 00 A8 00 00 00 B1 00 00 00 BD 00 00 00   ................
000000B0  00 41 52 4D 46 4C 41 4B 2E 54 44 46 00 61 28 00   .ARMFLAK.TDF.a(.
000000C0  00 01 01 00 00 01 66 65 61 74 75 72 65 73 00 01   ......features..
000000D0  00 00 00 D7 00 00 00 E0 00 00 00 E8 00 00 00 01   ................
000000E0  63 6F 72 70 73 65 73 00 01 00 00 00 F0 00 00 00   corpses.........
000000F0  F9 00 00 00 0A 01 00 00 00 61 72 6D 66 6C 61 6B   .........armflak
00000100  5F 64 65 61 64 2E 74 64 66 00 E3 28 00 00 68 02   _dead.tdf..(..h.
00000110  00 00 01 6F 62 6A 65 63 74 73 33 64 00 02 00 00   ...objects3d....
00000120  00 25 01 00 00 37 01 00 00 43 01 00 00 00 4C 01   .%...7...C....L.
00000130  00 00 5D 01 00 00 00 61 72 6D 66 6C 61 6B 2E 33   ..]....armflak.3
00000140  64 6F 00 39 2A 00 00 7B 14 00 00 01 61 72 6D 66   do.9*..{....armf
00000150  6C 61 6B 5F 64 65 61 64 2E 33 64 6F 00 C9 34 00   lak_dead.3do..4.
00000160  00 1A 11 00 00 01 73 63 72 69 70 74 73 00 01 00   ......scripts...
00000170  00 00 76 01 00 00 7F 01 00 00 8B 01 00 00 00 41   ..v............A
00000180  52 4D 46 4C 41 4B 2E 43 4F 42 00 67 3F 00 00 E4   RMFLAK.COB.g?...
00000190  09 00 00 01 75 6E 69 74 70 69 63 73 00 01 00 00   ....unitpics....
000001A0  00 A5 01 00 00 AE 01 00 00 BA 01 00 00 00 41 52   ..............AR
000001B0  4D 46 4C 41 4B 2E 50 43 58 00 B4 42 00 00 91 25   MFLAK.PCX..B...%
000001C0  00 00 01 75 6E 69 74 73 00 01 00 00 00 D1 01 00   ...units........
000001D0  00 DA 01 00 00 E6 01 00 00 00 41 52 4D 46 4C 41   ..........ARMFLA
000001E0  4B 2E 46 42 49 00 89 63 00 00 39 05 00 00 01 77   K.FBI..c..9....w
000001F0  65 61 70 6F 6E 73 00 01 00 00 00 FF 01 00 00 08   eapons..........
00000200  02 00 00 1B 02 00 00 00 61 72 6D 66 6C 61 6B 5F   ........armflak_
00000210  77 65 61 70 6F 6E 2E 74 64 66 00 2D 67 00 00 42   weapon.tdf.-g..B
00000220  02 00 00 01 

Let's get started...

00000010  00 00 00 00 08 00 00 00 1C 00 00 00 64 00 00 00   ............d...
                      ^^^^^^^^^^^ ^^^^^^^^^^^

At offset 0x14, you see the number 0x8.  This is the number of
entries in the directory.  Grabbing the next 32-bit number at
offset 0x18, you get 0x1C.  This is the offset of a list of directory entries.
In this case, there are 8 entries in the list.  The format of an entry is:

typedef struct _HPIEntry {
  long NameOffset;      /* points to the file name */
  long DirDataOffset;   /* points to directory data */
  char Flag;            /* file flag */
} HPIEntry;

NameOffset

Pointer to the file name.  This is a 0-terminated string of varying length.

DirDataOffset

Pointer to the directory data for the file.  The actual data varies depending
on whether it's a file or a subdirectory.

Flag
If this is 1, the entry is a subdirectory.  If it's 0, it's a file.

Looking at offset 0x1C, we see:

00000010                                      64 00 00 00   ............d...
00000020  6A 00 00 00 01 97 00 00 00 A0 00 00 00 01 C6 00   j...............
00000030  00 00 CF 00 00 00 01 13 01 00 00 1D 01 00 00 01   ................
00000040  66 01 00 00 6E 01 00 00 01 94 01 00 00 9D 01 00   f...n...........
00000050  00 01 C3 01 00 00 C9 01 00 00 01 EF 01 00 00 F7   ................
00000060  01 00 00 01

The 8 entries are:
0x064, 0x06A, 1
0x097, 0x0A0, 1
0x0C6, 0x0CF, 1
0x113, 0x11D, 1
0x166, 0x16E, 1
0x194, 0x19D, 1
0x1C3, 0x1C9, 1
0x1EF, 0x1F7, 1

Let's look at the first entry.  The Flag is 1, so it's a 
subdirectory.  At offset 0x64, we see:

00000060  01 00 00 01 61 6E 69 6D 73 00 01 00 00 00 72 00   ....anims.....r.
                      ^^^^^^^^^^^^^^^^^
or 'anims'.  This is the name.  Since this is a subdirectory,
offset 0x6A contains the number of entries in the subdirectory,
followed by a pointer to the first entry.  This is exactly like
the count/pointer pair at 0x14 that got us started.  Think recursion.

00000060  01 00 00 01 61 6E 69 6D 73 00 01 00 00 00 72 00   ....anims.....r.
                                        ^^^^^^^^^^^ ^^^^^ 
00000070  00 00 7B 00 00 00 8E 00 00 00 00 61 72 6D 66 6C   ..{........armfl
          ^^^^^

The number at offset 0x6A is a 1, indicating that there's only 1
file in this subdirectory. 0x6E contains the offset of the first
(and only) entry in the subdirectory, which is:

0x7B, 0x8E, 0

The 0 indicates that this is a file.  Looking at offset 0x7B, we see:

00000070  00 00 7B 00 00 00 8E 00 00 00 00 61 72 6D 66 6C   ..{........armfl
                                           ^^^^^^^^^^^^^^
00000080  61 6B 5F 67 61 64 67 65 74 2E 67 61 66 00 24 02   ak_gadget.gaf.$.
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

or 'armflak_gadget.gaf'.  This is the name of the first (and only) file
in the 'anims' subdirectory.  Since this is a file, the data at offset
0x8E is a little different.					   

There are 3 items here instead if one:

typedef struct _HPIFileData {
  long DataOffset;      /* starting offset of the file */
  long FileSize;        /* size of the decompressed file */
  char Flag;            /* file flag */
} HPIEntry;

DataOffset

This is the offset in the HPI file that this file starts at.

FileSize

This is the decompressed file size.  When you extract the file, it
should be this many bytes long.

Flag

If this is 1, the file is compressed with LZ77 compression.
If it's 2, it's compressed with ZLIb compression.
If it's 0, it's not compressed at all.  This is the format used by the
unit viewer.

00000080  61 6B 5F 67 61 64 67 65 74 2E 67 61 66 00 24 02   ak_gadget.gaf.$.
                                                    ^^^^^
00000090  00 00 D8 2D 00 00 01 64 6F 77 6E 6C 6F 61 64 00   ...-...download.
          ^^^^^ ^^^^^^^^^^^ ^^

Looking at offset 0x8E, we see that the three items are:

0x224, 0x2DD8, 1

If you recall, the directory size was 0x224 bytes.  This says the file
starts at the first offset after the directory, which makes sense and means
we're interpreting things correctly.  This also says that the extracted 
file should be 0x2DD8 (or 11,736) bytes long.

At this point, we know enough to actually traverse the directory tree in the
HPI file.  Here's a recursive pseudocode function to do it.  The initial call
to it would be 'TraverseTree(".", Header.Start)'.

TraverseTree(string ParentName, int offset)

Entries = Directory[offset]
EntryOffset = Directory[offset+4]

for count = 1 to Entries
   NameOffset = Directory[EntryOffset]
   DataOffset = Directory[EntryOffset+4]
   Flag = Directory[EntryOffset+8]

   Name = ParentName+"\"+Directory[NameOffset]

   print "Processing ",Name

   if Flag = 1
     TraverseTree(Name, DataOffset) <- recursion!
   else
     ProcessFile(Name, DataOffset)
   End If
   
   EntryOffset = EntryOffset + 9
Next Count

If you code this up in your language of choice and run it, it
should print something like this: (if you haven't guessed already,
the file I'm using as an example is the "Arm Flakker" unit's
aflakker.ufo file)

.\anims
.\anims\armflak_gadget.gaf
.\download
.\download\ARMFLAK.TDF
.\features
.\features\corpses
.\features\corpses\armflak_dead.tdf
.\objects3d
.\objects3d\armflak.3do
.\objects3d\armflak_dead.3do
.\scripts
.\scripts\ARMFLAK.COB
.\unitpics
.\unitpics\ARMFLAK.PCX
.\units
.\units\ARMFLAK.FBI
.\weapons
.\weapons\armflak_weapon.tdf

At this point, I urge you to go look at that directory hex dump and
traverse the thing by hand until it makes sense.

I can hear you now.  "What the heck is that 'ProcessFile' function?"
It decodes the file.  I'll explain in a bit.

But first, here's a list of the various files in this HPI file, and
their starting offsets.

If you don't understand where I got the starting offsets,
go reread the directory hex dump until you do.

anims\armflak_gadget.gaf             0x0224
download\ARMFLAK.TDF                 0x2861
features\corpses\armflak_dead.tdf    0x28E3
objects3d\armflak.3do                0x2A39
objects3d\armflak_dead.3do           0x34C9
scripts\ARMFLAK.COB                  0x3F67
unitpics\ARMFLAK.PCX                 0x42B4
units\ARMFLAK.FBI                    0x6389
weapons\armflak_weapon.tdf           0x672D

Because it's a short file, and because it decodes to readable
text, I'm going to use the ARMFLAK.TDF file as the example.

If the file was not compressed at all, then the file is just inserted
into the HPI file as one big chunk.

But if it is...

This is where the -REAL- fun begins.  I'm going to take it slow
here because I'm still half figuring it out myself (writing this
has actually made me realize some things that I hadn't before).

When the file was compressed, it was broken up into chunks of 64K (65536)
bytes each, plus one more chunk to hold anything left over.  Each chunk
was then compressed.  Note that some chunks are larger when compressed
than decompressed, which means that some compressed chunks can be larger
than 64K.

The total number of chunks in the file can be obtained by
the following formula:

chunks = Entry.FileSize / 65536
if (Entry.FileSize mod 65536) <> 0
  chunks = chunks + 1

The offset in the directory points to a list of 32-bit numbers
that are the compressed sizes of each compressed chunk of data.

Following the list of sizes are the actual compressed chunks
of data.

In this HPI file, each file has only one chunk, but the totala1.hpi file 
contains some files with a dozen or so, and the hpi files on the CDs have 
files in them with over a hundred.

Going to offset 0x2861, we read in a chunk o'data and decrypt it
to find this:

00002860     7E 00 00 00 53 51 53 48 02 01 01 6B 00 00 00   .~...SQSH...k...
00002870  01 01 00 00 FE 36 00 00 20 5B 51 49 4E 55 3C 0E   .....6.. [QINU<.
00002880  64 64 94 5D 49 5D 11 14 29 7B D5 26 18 55 6E 75   dd.]I]..){.&.Unu
00002890  64 54 34 41 79 6C 9C 71 81 83 8B 3B 59 49 CB 4D   dT4Ayl.q...;YI.M
000028A0  43 D1 42 54 9A A5 A8 AA AF B0 B8 64 61 AC 6D B0   C.BT.......da.m.
000028B0  B1 82 72 34 79 B8 B0 BD DD A3 82 81 E8 86 AC 89   ..r4y...........
000028C0  98 92 C2 CF 98 EB 9D E0 56 BF A2 6F AB 5F A8 96   ........V..o._..
000028D0  B5 C3 9F B8 EB B9 BE 7D 4F C7 5F CE 2F D1 4C D1   .......}O._./.L.
000028E0  D0 D2 90 

The decompressed file size of ARMFLAK.TDF is 257 bytes.  This
tells us that there's only one chunk.  The size of this chunk
is 0x7E bytes.  The chunk itself immediately follows.

Each chunk looks like this:

typedef struct _HPIChunk {
  long Marker;            /* always 0x48535153 (SQSH) */
  char Unknown1;
  char CompMethod;        /* 1=LZ77, 2=ZLib */
  char Encrypt;           /* is the block encrypted? */          
  long CompressedSize;    /* the length of the compressed data */
  long DecompressedSize;  /* the length of the decompressed data */
  long Checksum;          /* Checksum */
  char data[];            /* 'CompressedSize' bytes of data */
} HPIChunk;

Marker

This is the start-of-chunk marker, and is always 0x48535153 (ASCII 'SQSH').

Unknown1
I know not what this is for.  It's always 0x02.
Maybe some sort of version number?

CompMethod
This is the compression method.  It's 1 for LZ77, 2 for ZLib.

Encrypt
This tells whether the block is encrypted a second time.  See below.

CompressedSize

This is the size of the compressed data in the chunk.  0x6B bytes.

DecompressedSize

This is the size of the decompressed data in the chunk.  0x101 bytes.

Checksum

This is a checksum of the data.  It's merely the sum of all the bytes of
data (treated as unsigned numbers) added together.

data

The actual compressed data in the chunk.  CompressedSize (0x6B) bytes of it.

Let's look at the data.

00002870                          20 5B 51 49 4E 55 3C 0E   .....6.. [QINU<.
00002880  64 64 94 5D 49 5D 11 14 29 7B D5 26 18 55 6E 75   dd.]I]..){.&.Unu
00002890  64 54 34 41 79 6C 9C 71 81 83 8B 3B 59 49 CB 4D   dT4Ayl.q...;YI.M
000028A0  43 D1 42 54 9A A5 A8 AA AF B0 B8 64 61 AC 6D B0   C.BT.......da.m.
000028B0  B1 82 72 34 79 B8 B0 BD DD A3 82 81 E8 86 AC 89   ..r4y...........
000028C0  98 92 C2 CF 98 EB 9D E0 56 BF A2 6F AB 5F A8 96   ........V..o._..
000028D0  B5 C3 9F B8 EB B9 BE 7D 4F C7 5F CE 2F D1 4C D1   .......}O._./.L.
000028E0  D0 D2 90 

Doesn't look like much, does it.  That's because (YOU GUESSED IT!) it's
encrypted yet again!  Note: the checksum is calculated BEFORE this 
decryption.  

The 'Encrypt' field in the HPIChunk header is set to 1 to indicate
that this decryption needs to be done.

To decrypt, do this (more pseudocode):

for x = 0 to CompressedSize-1
  data[x] = (data[x] - x) XOR x
next x

This gives us:

00002870                          20 5B 4D 45 4E 55 30 00            [MENU0.
00002880  54 52 80 59 31 5D 0D 0A 09 7B D1 00 10 55 4E 49   TR.Y1]...{...UNI
00002890  54 22 00 3D 41 52 60 4D 41 43 4B 3B 11 01 83 01   T".=AR`MACK;....
000028A0  33 81 32 02 42 55 54 54 4F 4E B4 02 19 42 01 4E   3.2.BUTTON...B.N
000028B0  41 70 02 C2 01 46 4C 41 DD 23 02 7D E0 04 20 05   Ap...FLA.#.}.. .
000028C0  18 00 32 CF 00 D3 01 DE 56 3F 02 4F 03 5F 04 68   ..2.....V?.O._.h
000028D0  05 33 1F 06 D3 01 3E 41 8F 07 9F 08 AF 09 80 0D   .3....>A........
000028E0  00 00 4C                                          ..L
     
Woohoo!  Look!  Readable word fragments!  But remember, the chunk is still
compressed.

In this case, the block is compressed with LZ77, since CompMethod is 1.

The compression algorithm is a very basic sliding window compression scheme
from the LZ77 family using a 4095 byte history and matches from 2 to 17
bytes long.

The first byte is kind of a "tag" byte which determines if the next eight
pieces of data are literal bytes or history matches.  Starting with the
least-significant bit, this tag byte is scanned to figure out what to do.

When the current bit is a zero, the next byte of the input is transfered
directly to the output and added to end of the history buffer.

When the current bit is a one, the next two bytes taken from the input are
used as a offset/length pair.  The upper 12 bits are the offset into the
history buffer and the lower 4 bits are the length.  If the offset is
zero, the end of the input data has been reached and the decompressor
simply exits.

Since we can assume that there will be no matches with a length of zero
or only one byte, any match is a mimimum of two bytes so we just add two
to the length which extends our range from 0-15 to 2-17 bytes.

The match is then copied from the history buffer to the output and also added
onto the end of the history buffer to keep it in sync with the output.

When all eight bits of the tag byte have been used, the mask is reset and
the next tag byte is loaded.

Here is some decompress code:


int Decompress(char *out, char *in, int len)
{
/*
 Decompress buffer "in" of size "len" into buffer "out" (previously 
 allocated) returns the number of decompressed bytes.
*/

  int x;
  int outbufptr;
  int mask;
  int tag;
  int inptr;
  int outptr;
  int count;
  int done;
  char Window[4096];
  int inbufptr;

  for (x = 0; x < len; x++) {
    in[x] = (in[x] - x) ^ x;
  }

  done = FALSE;

  inptr = 0;
  outptr = 0;
  outbufptr = 1;
  mask = 1;
  tag = in[inptr++];
	
  while (!done) {
    if ((mask & tag) == 0) {
      out[outptr++] = in[inptr];
      Window[outbufptr] = in[inptr];
      outbufptr = (outbufptr + 1) & 0xFFF;
      inptr++;
    }
    else {
      count = *((unsigned short *) (in+inptr));
      inptr += 2;
      inbufptr = count >> 4;
      if (inbufptr == 0)
        return outptr;
      else {
        count = (count & 0x0f) + 2;
        if (count >= 0) {
          for (x = 0; x < count; x++) {
            out[outptr++] = Window[inbufptr];
            Window[outbufptr] = Window[inbufptr];
            inbufptr = (inbufptr + 1) & 0xFFF;
            outbufptr = (outbufptr + 1) & 0xFFF;
          }
        }
      }
    }
    mask *= 2;
    if (mask & 0x0100) {
      mask = 1;
      tag = in[inptr++];
    }
  }
  return outptr;
}

When fed the data, the routine spits out:

00000000  5B 4D 45 4E 55 45 4E 54 52 59 31 5D 0D 0A 09 7B   [MENUENTRY1]...{
00000010  0D 0A 09 55 4E 49 54 4D 45 4E 55 3D 41 52 4D 41   ...UNITMENU=ARMA
00000020  43 4B 3B 0D 0A 09 4D 45 4E 55 3D 33 3B 0D 0A 09   CK;...MENU=3;...
00000030  42 55 54 54 4F 4E 3D 33 3B 0D 0A 09 55 4E 49 54   BUTTON=3;...UNIT
00000040  4E 41 4D 45 3D 41 52 4D 46 4C 41 4B 3B 0D 0A 09   NAME=ARMFLAK;...
00000050  7D 0D 0A 0D 0A 5B 4D 45 4E 55 45 4E 54 52 59 32   }....[MENUENTRY2
00000060  5D 0D 0A 09 7B 0D 0A 09 55 4E 49 54 4D 45 4E 55   ]...{...UNITMENU
00000070  3D 41 52 4D 41 43 56 3B 0D 0A 09 4D 45 4E 55 3D   =ARMACV;...MENU=
00000080  33 3B 0D 0A 09 42 55 54 54 4F 4E 3D 33 3B 0D 0A   3;...BUTTON=3;..
00000090  09 55 4E 49 54 4E 41 4D 45 3D 41 52 4D 46 4C 41   .UNITNAME=ARMFLA
000000A0  4B 3B 0D 0A 09 7D 0D 0A 0D 0A 5B 4D 45 4E 55 45   K;...}....[MENUE
000000B0  4E 54 52 59 33 5D 0D 0A 09 7B 0D 0A 09 55 4E 49   NTRY3]...{...UNI
000000C0  54 4D 45 4E 55 3D 41 52 4D 41 43 41 3B 0D 0A 09   TMENU=ARMACA;...
000000D0  4D 45 4E 55 3D 33 3B 0D 0A 09 42 55 54 54 4F 4E   MENU=3;...BUTTON
000000E0  3D 33 3B 0D 0A 09 55 4E 49 54 4E 41 4D 45 3D 41   =3;...UNITNAME=A
000000F0  52 4D 46 4C 41 4B 3B 0D 0A 09 7D 0D 0A 0D 0A 0D   RMFLAK;...}..... 
00000100  0A                                                .               

Yay!  Clear decoded text.  Write this chunk out, and go get the next one.
When there are no more chunks, close the file, and go process the next one.

To recompress, do something like the following:

       WHILE look ahead buffer is not empty
             find a match in the window to previously output data
             IF match length > minimum match length
                     output reference pair
                     move the window match length to the right
             ELSE
                     output window first data item
                     move the window one to the right
             ENDIF
       END

If CompMethod is 2, use ZLib compression to decompress the block.  You can get the zlib
source code from the zlib home page at http://www.cdrom.com/pub/infozip/zlib/ 

From here, you've got enough data to proceed on your own.  Good luck!

Like I said, if you have any questions, let me know.

Joe D
joed@cws.org