www.BinaryEssence.com

Set of Symbols


A set of symbols is the pool from which the symbols of the content are drawn. It comprises all the different symbols that may be used.


If each symbol is represented by one byte, the set contains at most 256 different symbols. This is the number of values a byte can take (0 to 255). Common 8-bit character sets, such as the ANSI code pages, provide 256 code positions; plain ASCII is a 7-bit code and defines only 128 characters.


Examples of common sets of symbols:


lowercase letters

    {a, b, c, d, e, f, g, ..., z}

deoxyribonucleic acid (DNA)

    {A,       C,        G,       T}
     Adenine, Cytosine, Guanine, Thymine

Boolean values

    {0,     1}
    {false, true}
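
The size of a set determines how many bits a fixed-length code needs per symbol, just as 256 symbols correspond to one byte. The following is a minimal sketch of that relation for the example sets above; the function name `bits_per_symbol` and the variable names are chosen for illustration only and do not come from the original text.

```python
import string


def bits_per_symbol(symbol_set):
    """Smallest number of bits that gives every symbol its own fixed-length code.

    Assumption for this sketch: a set with a single symbol still uses 1 bit.
    """
    n = len(symbol_set)
    if n == 0:
        raise ValueError("the set of symbols must not be empty")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return max(1, (n - 1).bit_length())


lowercase_letters = set(string.ascii_lowercase)  # {a, b, c, ..., z}
dna_bases = set("ACGT")                          # {A, C, G, T}
boolean_values = {0, 1}                          # {0, 1}
byte_values = set(range(256))                    # 0 to 255

for name, symbols in [
    ("lowercase letters", lowercase_letters),
    ("DNA", dna_bases),
    ("Boolean values", boolean_values),
    ("byte values", byte_values),
]:
    print(f"{name}: {len(symbols)} symbols, {bits_per_symbol(symbols)} bits each")
```

A smaller set therefore allows a shorter fixed-length code: storing DNA bases one per byte spends 8 bits on a symbol that needs only 2.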


How many symbols of the set are actually used, and how probable each of them is, largely determines how well a compression procedure performs.
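
Both quantities can be measured directly on a given piece of content. The sketch below counts the distinct symbols that occur and estimates each symbol's probability as its relative frequency; the function name `symbol_statistics` and the sample string are illustrative assumptions, not part of the original text.

```python
from collections import Counter


def symbol_statistics(content):
    """Map each symbol that occurs in content to its relative frequency."""
    counts = Counter(content)
    total = len(content)
    # Empty content yields an empty mapping, so no division by zero occurs.
    return {symbol: count / total for symbol, count in counts.items()}


# Hypothetical sample content over the DNA set {A, C, G, T}.
sample = "GATTACA"
stats = symbol_statistics(sample)

print("symbols used:", len(stats))
for symbol in sorted(stats):
    print(f"  {symbol}: {stats[symbol]:.3f}")
```

Applied to byte-oriented data, the same function shows how many of the 256 possible values occur at all and how unevenly they are distributed.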

