BZIP2: Compression

The core algorithm of BZIP2 is the Burrows-Wheeler transformation (BWT), which converts the original data into a form better suited to the subsequent coding stage. The current version uses Huffman coding for that stage.
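The transformation can be illustrated with a small Python sketch. It is a naive textbook version that sorts all rotations of the input and keeps the last column; the function names `bwt_forward` and `bwt_inverse` are chosen here for illustration, and the real BZIP2 implementation uses a far more efficient sorting strategy.

```python
def bwt_forward(data: bytes) -> tuple[bytes, int]:
    """Naive BWT: sort all rotations, return the last column and the
    row index at which the original input appears."""
    n = len(data)
    if n == 0:
        return b"", 0
    doubled = data + data  # lets us slice any rotation without wrapping
    # Sort rotation start offsets by the rotation each one denotes.
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last_column = bytes(doubled[i + n - 1] for i in order)
    return last_column, order.index(0)


def bwt_inverse(last_column: bytes, index: int) -> bytes:
    """Invert the naive BWT using the first-column/last-column mapping."""
    n = len(last_column)
    if n == 0:
        return b""
    # A stable sort of positions by byte value links each entry of the
    # (sorted) first column to the same occurrence in the last column.
    nxt = sorted(range(n), key=lambda i: last_column[i])
    out = bytearray()
    row = index
    for _ in range(n):
        row = nxt[row]
        out.append(last_column[row])
    return bytes(out)


if __name__ == "__main__":
    transformed, idx = bwt_forward(b"banana")
    print(transformed, idx)
    print(bwt_inverse(transformed, idx))
```

The output tends to group identical symbols together, which is what makes the following coding stage effective.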

In earlier phases of development, the more efficient arithmetic coding was used instead. Arithmetic coding was restricted by patents, however, which made it poorly suited to open source projects. BZIP2 also includes a run length encoding (RLE) stage, which is now considered obsolete.
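As a reminder of what run length encoding does, here is a generic Python sketch that collapses runs of equal bytes into (value, count) pairs. It only illustrates the general idea; it is not the exact scheme or byte layout used inside BZIP2, and the names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Expand (value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)


if __name__ == "__main__":
    pairs = rle_encode(b"aaabccdddd")
    print(pairs)
    print(rle_decode(pairs))
```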

The compression ratio achievable with BZIP2 is considerably better than that of conventional formats such as Deflate or Deflate64™ (GZIP, ZIP), and slightly below that of the best current methods (PPM, LZMA).
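A rough way to check this ranking on one's own data is to compress the same input with the Deflate, BZIP2 and LZMA bindings in the Python standard library. This is only a sketch: the helper name `compressed_sizes` is made up here, PPM is left out because the standard library has no PPM codec, and the results depend heavily on the input.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes for each codec."""
    return {
        "deflate (zlib)": len(zlib.compress(data, 9)),
        "bzip2": len(bz2.compress(data, 9)),
        "lzma (xz)": len(lzma.compress(data)),
    }


if __name__ == "__main__":
    # Assumption: any reasonably large file of your own can be used instead.
    sample = b"The quick brown fox jumps over the lazy dog. " * 5000
    for name, size in compressed_sizes(sample).items():
        print(f"{name:15s} {size:8d} bytes")
```

Highly repetitive test data such as the sample above is not representative; real text or binary files give a more meaningful comparison.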

Given the memory capacities available today, the memory needed for compression, less than 8 MByte, is relatively small, although it is clearly larger than for Deflate. It does not, however, reach the massive memory requirements of LZMA and especially PPM.
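The figure of just under 8 MByte matches the rule of thumb given in the bzip2 manual page: roughly 400k plus 8 times the block size for compression, and 100k plus 4 times the block size for decompression. The following sketch simply evaluates that rule for each block size setting (1 to 9, in units of 100k); the function name is hypothetical and the values are estimates, not measurements.

```python
def bzip2_memory_estimate_k(level: int) -> tuple[int, int]:
    """Estimated (compression, decompression) memory in 'k' units,
    following the rule of thumb from the bzip2 manual page."""
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    block_k = 100 * level            # block size in units of 1k
    compress_k = 400 + 8 * block_k   # 400k + 8 x block size
    decompress_k = 100 + 4 * block_k # 100k + 4 x block size
    return compress_k, decompress_k


if __name__ == "__main__":
    for level in range(1, 10):
        comp, decomp = bzip2_memory_estimate_k(level)
        print(f"level {level}: compress ~{comp}k, decompress ~{decomp}k")
```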

The algorithm processes the data in blocks that are completely independent of each other. The block size can be set between 100,000 and 900,000 bytes (levels 1-9); the default is 900,000 bytes. Smaller block sizes reduce the memory requirements.
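In Python's standard `bz2` module the block size is selected through the `compresslevel` argument (1 to 9). The chosen level is recorded as an ASCII digit in the fourth byte of the stream header, after the `BZh` signature, so it can be read back from a compressed stream. The helper `block_size_from_header` below is a hypothetical name used only for this sketch.

```python
import bz2


def block_size_from_header(stream: bytes) -> int:
    """Read the block size (in bytes) from the header of a bzip2 stream."""
    if len(stream) < 4 or stream[:3] != b"BZh":
        raise ValueError("not a bzip2 stream")
    level = stream[3] - ord("0")  # header digit '1'..'9'
    if not 1 <= level <= 9:
        raise ValueError("invalid block size digit")
    return level * 100_000


if __name__ == "__main__":
    data = b"example data " * 50_000
    for level in (1, 5, 9):
        packed = bz2.compress(data, compresslevel=level)
        print(level, block_size_from_header(packed), len(packed))
```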

Compared with other methods, the processing speed is in line with the compression achieved: slower than Deflate, faster than PPM. Decoding, however, is more than twice as fast as encoding (in marked contrast to PPM).
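The difference between encoding and decoding speed can be measured with a few lines of Python. This is a minimal sketch using the standard `bz2` module; `time_bzip2` is a hypothetical helper, and the timings will vary with the machine and the input data.

```python
import bz2
import time


def time_bzip2(data: bytes, level: int = 9) -> tuple[float, float]:
    """Return (encode_seconds, decode_seconds) for one round trip."""
    start = time.perf_counter()
    packed = bz2.compress(data, level)
    middle = time.perf_counter()
    restored = bz2.decompress(packed)
    end = time.perf_counter()
    if restored != data:
        raise RuntimeError("round trip failed")
    return middle - start, end - middle


if __name__ == "__main__":
    # Assumption: replace this sample with a file of your own for realistic numbers.
    sample = bytes(range(256)) * 20_000
    encode_s, decode_s = time_bzip2(sample)
    print(f"encode: {encode_s:.3f} s, decode: {decode_s:.3f} s")
```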


Burrows-Wheeler-Transformation (BWT)

Huffman Coding

Adaptive Huffman Coding

Arithmetic Coding

Run Length Encoding

File Format: GZIP

File Format: ZIP

External Links:

BZIP2, LIBBZIP2 (at Red Hat Sourceware)
