Data Compression


Survey Formats


Compression Methods

Data Formats

Arithmetic Coding (AC)

Principle of the AC

Dividing into Intervals

Assignment to Codes


Codes for 2 Symbols

Scheme Intervals "aaaa"

Scheme Intervals "abcd"

General Algorithm

AC versus Huffman

Data with high Redundancy

Adaptive AC





Principle of the Arithmetic Coding

The entire data set is represented by a single rational number, which is placed between 0 and 1. This range is divided into sub-intervals each representing a certain symbol. The number of sub-intervals is identical to the number of symbols in the current set of symbols and the size is proportional to their probability of appearance. For each symbol in the original data a new interval division takes place, on the basis of the last sub-interval.

Example Sub-Interval:

   Probability  Initial Intervals
a  0.5   (50%)   [0.0; 0.5)
b  0.3   (30%)   [0.5; 0.8)
c  0.2   (20%)   [0.8; 1.0)
Size Current Interval: 1.0

  Start      'b'              'a'
a [0.0; 0.5)     [0.50; 0.65) --> [0.500; 0.575)
b [0.5; 0.8) --> [0.65; 0.74)     [0.575; 0.620)
c [0.8; 1.0)     [0.74; 0.80)     [0.620; 0.650)
Size Current Interval:  0.30              0.150

Starting with the initial sub-division the symbol 'b' is coded with a number, which is greater or equal than 0.5 and less than 0.8. If the symbol 'b' would stand alone, any number from this interval could be selected.

The encoding of the following symbol bases on the current interval [0.5; 0.8), which will be divided again into sub-intervals. If the second symbol is an 'a', then any number could be coded from the interval [0.50; 0.65). With any further symbol the last current interval have to be divided again and again, the number of significant digits of the code word increases continuously.

Even if the accuracy of the number is increased permanently, it is not necessary to process the entire one. With the help of suitable algorithms parts that are no longer relevant can be eliminated.

 <   ^   > 

Arithmetic Coding Arithmetic Coding Dividing into Intervals