Data Compression




www.BinaryEssence.com

Symbol


In the context of data compression and information theory, a symbol is the smallest unit into which data is structured. Any set of data is a sequence of such symbols. The type and structure of a symbol depend on the implementation.


Most procedures use the byte as the basic unit. This follows from conventional computer architecture, in which the byte is the smallest addressable unit. For larger units, common systems use multiples of a byte (e.g. 2-byte or 4-byte integer values). Applications designed for a specific type of content use more complex data structures, but internally they are byte-oriented too.
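As a minimal sketch of this idea, the same data can be read as a sequence of one-byte symbols or as a sequence of 2-byte symbols. The sample data, the variable names, and the choice of little-endian byte order are assumptions made for illustration only.

```python
import struct

# Hypothetical sample data: four bytes.
data = b"ABBA"

# Byte-oriented view: every byte is one symbol with a value in 0..255.
byte_symbols = list(data)

# 2-byte view: every pair of bytes is one symbol, read here as an
# unsigned 16-bit integer in little-endian order (an assumption).
word_symbols = [value for (value,) in struct.iter_unpack("<H", data)]

print(byte_symbols)
print(word_symbols)
```

The underlying bytes are identical in both cases. Only the interpretation of what counts as one symbol changes, and with it the size of the alphabet (256 possible values versus 65536).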


For uncompressed data, fixed-length symbols are usually used, e.g. one byte each. Most lossless compression procedures convert them into variable-length representations. In this way the length assigned to a symbol reflects its probability: frequent symbols get short representations, rare symbols get longer ones.
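The relation between probability and length can be sketched as follows. The example estimates each symbol's probability from its relative frequency in the data (an assumption; a real coder may use a different model) and computes the information content -log2(p) in bits, which is the theoretical length a variable-length code would aim for. The function name and the sample data are hypothetical.

```python
import math
from collections import Counter

def ideal_code_lengths(data: bytes) -> dict:
    """Map each byte value to -log2(p), with p its relative frequency."""
    counts = Counter(data)
    total = len(data)
    return {symbol: -math.log2(n / total) for symbol, n in counts.items()}

# Hypothetical sample: 'A' is frequent, 'B' and 'C' are rare.
data = b"AAAAAABC"
lengths = ideal_code_lengths(data)

for symbol, bits in sorted(lengths.items()):
    print(chr(symbol), round(bits, 3), "bits (fixed-length: 8 bits)")
```

In this sample the frequent symbol needs well under one bit in theory, while each rare symbol needs three bits. Both are below the eight bits of the fixed-length byte representation. Practical codes with whole-bit lengths can only approximate these values.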

