Probability


Probability is a value that quantifies how often a certain message occurs. It is the number of occurrences of this message divided by the total number of messages.


                          Events_m
  Probability:  P(m) =  -----------
                           Events

The probability ranges between 0 (the message m never appears) and 1 (the information source sends only this message). The probabilities of all messages sum to 1 (apart from rounding differences).


  Range of Values:  0 <=  P(m)  <=  1

  Sum of Probabilities:  Σ P(m)  =  1      (summed over all messages m = 1 ... max)
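
The following minimal sketch (Python; the message stream is a hypothetical example) estimates the probabilities by counting occurrences and checks the range and sum conditions stated above.

  from collections import Counter

  # Hypothetical message stream emitted by an information source
  stream = "AABACABDAABA"

  counts = Counter(stream)              # occurrences of each message
  total = len(stream)                   # total number of messages

  # P(m) = occurrences of message m / total number of messages
  probability = {m: n / total for m, n in counts.items()}

  for m, p in sorted(probability.items()):
      assert 0.0 <= p <= 1.0            # range of values
      print(f"P({m}) = {p:.3f}")

  print("Sum:", sum(probability.values()))   # 1 (apart from rounding)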

If an information source sends, for example, 4 different messages with equal distribution, every message has a probability of 1/4 = 0.25. Data represented by such a uniform distribution cannot be compressed by common coding procedures like Huffman coding or arithmetic coding. In contrast, strong differences in the messages' probabilities offer large opportunities for data compression.
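
As an illustration, the following Python sketch (the skewed distribution is an assumed example) computes the entropy, i.e. the average number of bits per message that coders like Huffman or arithmetic coding can approach. For the uniform case it equals the plain 2-bit code length, so nothing is gained; a skewed distribution leaves room for compression.

  import math

  def entropy(probabilities):
      # Average information content in bits per message
      return -sum(p * math.log2(p) for p in probabilities if p > 0)

  uniform = [0.25, 0.25, 0.25, 0.25]   # 4 equally probable messages
  skewed  = [0.70, 0.15, 0.10, 0.05]   # strongly differing probabilities (assumed example)

  print("uniform:", entropy(uniform), "bits/message")  # 2.0 -> no gain over a fixed 2-bit code
  print("skewed: ", entropy(skewed),  "bits/message")  # about 1.32 -> compression possible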


Strictly speaking, this description is not mathematically exact: it defines the observed relative frequency of a message rather than a probability in the strict sense. For the purposes intended here, however, it is sufficient.

