Data Compression


Criteria

Survey Formats

Basics

Compression Methods

Shannon-Fano

Huffman

Lempel-Ziv (LZ)

LZ77

LZSS

LZ78

LZW

Arithmetic Coding

Run Length Encoding

Burrows-Wheeler (BWT)

Implementations

Data Formats


Glossary

Index


Download


www.BinaryEssence.com

Lempel-Ziv-Storer-Szymanski (LZSS)


The LZSS coding procedure was invented by James A. Storer and Thomas G. Szymanski in 1982. Theoretical and practical considerations were published in their article "Data Compression via Textual Substitution". The compression method bases on the work of Abraham Lempel and Jacob Ziv known as LZ77.


LZSS encodes data as a string consisting of the original data and pointers to a dictionary. The already coded data are serving as a dictionary like in LZ77. The access to former contents has the form [address + sequence length].


Example:

   abracadabra
   abracad(1,4)

Only sequences exceeding a certain minimum length will be coded, if the substitution offers a compression effect. That is the main difference to LZ77 that always uses the form [address + sequence length + deviating symbol], also if this enlarges the compressed data.


In pratice LZSS requires a flag or special character differentiating between uncompressed contents and pointers. This negatively influences the compression rate.


The original publication describes different coding schemes varying in its implementation. But the fundamental mechanism bases on the same principles.


 <   ^   > 

Lempel-Ziv Coding (LZ) Lempel-Ziv-77 (LZ77) Lempel-Ziv-78 (LZ78)