Data Compression


Survey Formats


Compression Methods

Data Formats

Huffman Code




Dynamic Huffman Code

Adaptive Huffman Code



Table Code Tree

Update Procedure

New Symbols

Update Code Tree






Algorithm Adaptive Huffman Code

A number of rules and conventions is required for the algorithm described hereinafter. It bases on a procedure starting with an empty tree and introducing a special control character. No preventive assumptions about the code distribution will be made.

Control character for symbols that are not part of the tree
A control character is defined in addition to the original set of symbols. This control character identifies symbols that, at run-time, do not belong to the current code tree. In the literature this control character is denoted as a NYA or NYT (Not Yet Available or Transmitted).

Weight of a node.
Any node has an attribute called the weight of the node. The weight of a leaf node is equal to the frequency with which the corresponding symbol had been coded so far. The weight of the interior nodes is equal to the sum of the two subordinated nodes. The control character NYA gets the weight 0.

Structure of the Code Table
All nodes are sorted according to their weight in ascending order. The NYA node always provides the lowest node in the hierarchy.

Sibling Property
Any pair of nodes that are neighbours in the hierarchy refer to the same parent node (node 2n-1 and 2n -> 1 & 2, 3 & 4, 5 & 6, etc.). The parent node is always ordered at a higher level in the hierarchy because its weight results from the sum of the subordinated nodes. This is denoted as sibling property.

The node with the highest weight is placed at the highest level in the hierarchy and forms the current root of the tree.

Intitial Tree
The initial tree only contains the NYA node with the weight 0.

Data format for uncoded symbols
In the simplest case symbols which are not contained in the code tree are encoded linearly (e.g. with 8 bit from a set of 256 symbols). A better compression efficiency could be achieved, if the data format will be adapted to the number of remaining symbols. A variety of options are available to optimize the code length. A suitable procedure will be presented guaranteeing full utilization of the range of values.

Maximum number of nodes
Assuming a set of n symbols the total number of nodes results in 2n-1 (leaf and interior nodes). Accordingly the maximum number of nodes is 511 if the standard unit is the byte.

All nodes of identical weight will be summarized logically to a block. Nodes which are part of a block are always neighbouring in the code table representing the code tree.

 <   ^   > 

Adaptive Huffman Code Pros and Cons Table representing the Code Tree