Data Compression


www.BinaryEssence.com

Redundancy


Any part of a data set that does not contribute to transporting information can be regarded as redundancy (from Latin, meaning profusion or abundance). Simple examples of redundant data are:

  • consecutively recurring characters
  • recurring sequences of characters
  • an unnecessarily large range of values, e.g. for numerical variables
  • constant symbol length (not matched to symbol frequency)

All of the examples mentioned above offer the opportunity to use alternative forms of representation without affecting the information. The only requirement is that the receiver is able to interpret the alternative format.
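As an illustration of the first example in the list, consecutively recurring characters, the following is a minimal run-length encoding sketch. The function names `rle_encode` and `rle_decode` and the sample string are assumptions made for this illustration; the sketch shows one possible alternative representation, not a particular file format.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original string from (character, count) pairs."""
    return "".join(ch * count for ch, count in pairs)

sample = "aaaabbbcca"  # hypothetical input with recurring characters
encoded = rle_encode(sample)
print(encoded)
# The receiver can restore the data as long as it knows the pair format.
print(rle_decode(encoded) == sample)
```

The encoded form only pays off when runs are long; for data without recurring characters the list of pairs is larger than the original.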


Other types of redundancy can be derived from the rules for data representation. The probability of a particular character depends on its surrounding context. If a text contains the string redundanc, the following character will probably be a y or an e. Pixel graphics show an equivalent dependency: the probability that adjoining pixels have a totally different colour is relatively low.
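The context dependency described above can be made visible by counting which characters follow a given string in a sample text. The sketch below is an assumption-laden illustration: the function `next_char_distribution` and the tiny sample text are hypothetical, and a realistic estimate would need a much larger corpus.

```python
from collections import Counter

def next_char_distribution(text, context):
    """Estimate P(next character | context) by counting occurrences in text."""
    counts = Counter()
    start = text.find(context)
    while start != -1:
        nxt = start + len(context)
        if nxt < len(text):
            counts[text[nxt]] += 1
        start = text.find(context, start + 1)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}

# Hypothetical sample text, chosen only to illustrate the counting.
sample = "redundancy redundancy redundance redundancy"
print(next_char_distribution(sample, "redundanc"))
```

A compression method that knows such a distribution can spend fewer bits on the likely continuation and more on the unlikely ones.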


Redundant parts of the contents are not necessarily useless. In case of errors, redundancy reduces the probability that large amounts of information will be lost. However, common procedures are available that add a specific form of redundancy for error detection (EDC) or error correction (ECC) purposes. This improves error tolerance more effectively than unspecific redundancy.
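One of the simplest forms of deliberately added redundancy is a single even-parity bit. The sketch below is only an illustration under that assumption; the helper names `add_parity` and `parity_ok` and the sample bits are hypothetical. A single parity bit detects any odd number of flipped bits but cannot correct them, so it is an example of error detection rather than error correction.

```python
def add_parity(bits):
    """Append one even-parity bit so the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(word):
    """Return True if the word still contains an even number of 1s."""
    return sum(word) % 2 == 0

data = [1, 0, 1, 1, 0, 0, 1]  # hypothetical 7-bit payload
word = add_parity(data)
print(parity_ok(word))        # intact word passes the check

corrupted = word.copy()
corrupted[2] ^= 1             # simulate a single-bit transmission error
print(parity_ok(corrupted))   # the check now fails
```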

