Data Compression


Criteria

Survey Formats

Basics

Compression Methods

Shannon-Fano

Huffman

Lempel-Ziv (LZ)

Arithmetic Coding

Run Length Encoding

Burrows-Wheeler (BWT)

Encoding

Transformed Data

Decoding

Characteristics

Applications

Implementations

Data Formats


Glossary

Index


Download


www.BinaryEssence.com

BWT: Encoding


The transformation is made by a matrix, in which a block of the original data is arranged. The first row contains the unchanged original data. Each row following represent a rotation to the left of the preceding row:


Matrix for "abracadabra":


      0 1 2 3 4 5 6 7 8 9 10

   0  a b r a c a d a b r a a
       / / / / / / / / / / /
   1  b r a c a d a b r a a
   2  r a c a d a b r a a b
   3  a c a d a b r a a b r
   4  c a d a b r a a b r a
   5  a d a b r a a b r a c
   6  d a b r a a b r a c a
   7  a b r a a b r a c a d
   8  b r a a b r a c a d a
   9  r a a b r a c a d a b
  10  a a b r a c a d a b r

Afterwards this matrix will be sorted "lexicographically", i.e. in the order commonly used by encyclopedias. The most significant symbol is represented by the first column in this example:


      0 1 2 3 4 5 6 7 8 9 10

   0  a a b r a c a d a b r
   1  a b r a a b r a c a d
   2  a b r a c a d a b r a
   3  a d a b r a a b r a c
   4  a c a d a b r a a b r
   5  b r a a b r a c a d a
   6  b r a c a d a b r a a
   7  d a b r a a b r a c a
   8  c a d a b r a a b r a
   9  r a a b r a c a d a b
  10  r a c a d a b r a a b

The last column of the matrix and information about the row originally containing source data will be stored. In this example the row 2 represents the original string "abracadabra". The Burrows-Wheeler-Transformation result in the following data stream:


    r d a c r a a a a b b 2

The BWT exists in several variants which insignificantly differ from each other but are identical from the principle. Here the algorithm is described following the original publication by Michael Burrows and David Wheeler.


 <   ^   > 

Burrows-Wheeler-Transformation (BWT) Burrows-Wheeler-Transformation (BWT) BWT: Characteristics of Transformed Data