Encoding Algorithm

The Huffman encoding algorithm is as follows:

1.
Construct the encoding tree:
(a)
Count the frequency of each byte value in the input (note that there are 28 = 256 possible byte values).
(b)
Construct an initial binary tree for each byte value, weighted by its frequency.
(c)
Repeat until there is only a single tree remaining: combine the two trees with the smallest weights into a new tree, with weight equal to the sum of that of the two combined trees.
2.
Construct the encoding table: for each byte value in the input, determine its code by tracing from the leaf node representing that value to the root of the tree. (Note that this will give you the code backwards, and you will need to reverse it.)
3.
Write out the compressed data:
(a)
Write out the encoding tree (it is necessary to decode the file later). See below for a technique to do this.
(b)
Write out the count of characters in the input.
(c)
Write out the compressed data.


Charles Stewart
10/2/1998