Data Compression


Reference Data Sets

The task of a reference data set is to demonstrate the compression performance of different data formats and applications as objectively as possible. For that purpose it must cover the potential spectrum of use accurately and remain independent of any particular implementation.

Most data formats and applications respond to the original data to differing degrees. The results depend heavily on local redundancy distributions, distances between redundant parts, recurring symbols, the set of symbols used, and many other characteristics. Any suitable file collection should therefore consist of files whose contents reflect this variety.
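The dependence on data characteristics can be illustrated with a minimal sketch. It assumes Python's standard zlib module as a stand-in for "an application"; the helper names order0_entropy and profile and the two sample inputs are hypothetical and not part of any corpus. Both inputs use the same symbol set with identical symbol frequencies and differ only in where the repeated parts lie.

```python
import math
import random
import zlib
from collections import Counter


def order0_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, treating every byte as an independent symbol."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def profile(data: bytes) -> dict:
    """Collect a few characteristics of the input plus the zlib compression ratio."""
    compressed = zlib.compress(data, 9)
    return {
        "alphabet_size": len(set(data)),          # the set of symbols used
        "order0_entropy": order0_entropy(data),   # symbol frequency distribution
        "zlib_ratio": len(compressed) / len(data),
    }


# Hypothetical input 1: all 256 byte values, repeated with a short, regular distance.
periodic = bytes(range(256)) * 64

# Hypothetical input 2: exactly the same bytes, but in a pseudo-random order,
# so the symbol statistics are unchanged while the repeated runs are destroyed.
shuffled = bytearray(periodic)
random.Random(0).shuffle(shuffled)
shuffled = bytes(shuffled)

periodic_profile = profile(periodic)
shuffled_profile = profile(shuffled)

if __name__ == "__main__":
    for name, p in (("periodic", periodic_profile), ("shuffled", shuffled_profile)):
        print(name, p)
```

Both inputs report the same alphabet size and the same order-0 entropy, yet zlib shrinks the periodic input to a small fraction of its size and gains almost nothing on the shuffled one. A collection containing only one of these kinds of data would give a misleading picture of a compressor.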

The most common file collections are:

Calgary Corpus

Canterbury Corpus
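
A file collection of this kind might be used as in the following sketch, which reports compressed bits per input byte for each file and codec. It assumes the corpus files have already been unpacked into a local directory; the function names load_collection and bits_per_byte and the choice of the standard-library codecs zlib, bz2 and lzma are illustrative assumptions, not the tools used for any published comparison.

```python
import bz2
import lzma
import zlib
from pathlib import Path

# Codecs from the Python standard library, used here only as examples.
CODECS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data),
}


def load_collection(directory: str) -> dict:
    """Read every regular file in the directory into memory, keyed by file name."""
    return {
        path.name: path.read_bytes()
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


def bits_per_byte(files: dict) -> dict:
    """Return {file name: {codec name: compressed bits per input byte}}."""
    table = {}
    for name, data in files.items():
        if not data:
            continue  # skip empty files to avoid dividing by zero
        table[name] = {
            codec: 8 * len(compress(data)) / len(data)
            for codec, compress in CODECS.items()
        }
    return table


if __name__ == "__main__":
    # "corpus" is a hypothetical directory holding the unpacked files.
    for name, row in bits_per_byte(load_collection("corpus")).items():
        print(name, {codec: round(value, 3) for codec, value in row.items()})
```

Reporting results per file rather than as a single total keeps visible how differently each codec responds to each kind of content.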


External Links:

Download University of Calgary (FTP)
