Data Compression


Criteria

Survey Formats

Basics

Compression Methods

Data Formats


Arithmetic Coding (AC)

Principle of the AC

Dividing into Intervals

Assignment to Codes

Sub-Intervals

Codes for 2 Symbols

Scheme Intervals "aaaa"

Scheme Intervals "abcd"

General Algorithm

AC versus Huffman

Data with high Redundancy

Adaptive AC

Implementations


Glossary

Index


Download


www.BinaryEssence.com

Dividing into Intervals


On the basis of a well-known alphabet the probability of all symbols has to be determined and converted into intervals. The size of the interval depends linearly on the symbol's probability. If this is 50% for example, then the associated sub-interval covers the half of the current interval. Usually the initial interval is [0: 1) for the encoding of the first symbol:


Example sub-division of the interval [0.0; 1.0)


     Frequency
Symbol      Probability   Sub-Interval
    a    4      0.4        [0.0; 0.4)
    b    2      0.2        [0.4; 0.6)
    c    2      0.2        [0.6; 0.8)
    d    1      0.1        [0.8; 0.9)
    e    1      0.1        [0.9; 1.0)
  Sum   10      1.0        [0.0; 1.0)

Example sub-division of the interval [0.9; 1.0)


     Frequency
Symbol      Probability   Sub-Interval
    a    4      0.4        [0.90; 0.94)
    b    2      0.2        [0.94; 0.96)
    c    2      0.2        [0.96; 0.98)
    d    1      0.1        [0.98; 0.99)
    e    1      0.1        [0.99; 1.00)

Example sub-division of the interval [0.99; 1.00)


     Frequency
Symbol      Probability   Sub-Interval
    a    4      0.4        [0.990; 0.994)
    b    2      0.2        [0.994; 0.996)
    c    2      0.2        [0.996; 0.998)
    d    1      0.1        [0.998; 0.999)
    e    1      0.1        [0.999; 1.000)

Accordingly a rational number in the range of 0.999 to < 1.000 represents the string "eee".


 <   ^   > 

Principle of the Arithmetic Coding Principle of the Arithmetic Coding Assignment of Intervals to Codes