Characterizing Source Data for Data Compression

Author Message
george
Posted: Sat Feb 05, 2005 4:01 am
Guest
A data compression algorithm in use today typically consists of a number
of sub-algorithms arranged as blocks. Each block's algorithm produces
data suited to the next block, and this continues until the desired
compression is achieved.

"However, the flow path has to be modified for a different type of
source data, or depending on the result of a sub-block; sometimes
certain blocks are not selected at all."

Consider a frequency count of the symbols present in the source data:
the count is the same irrespective of how the symbols are arranged
(i.e., even if the locations of the data are changed, the occurrence
counts stay the same).
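
The point about frequency counts can be checked with a minimal Python sketch. The sample string, the fixed shuffle seed, and the helper name `order0_entropy` are illustrative assumptions, not part of the original post. It shows that a shuffled copy has the same symbol counts, and so the same order-0 entropy, as the original:

```python
import random
from collections import Counter
from math import log2

def order0_entropy(data: bytes) -> float:
    """Shannon entropy in bits per symbol, computed from symbol counts only."""
    n = len(data)
    return -sum(c / n * log2(c / n) for c in Counter(data).values())

original = b"abracadabra abracadabra"

# Rearrange the same bytes; the seed is fixed only to make the run repeatable.
shuffled_list = list(original)
random.Random(0).shuffle(shuffled_list)
shuffled = bytes(shuffled_list)

same_counts = Counter(original) == Counter(shuffled)
entropy_original = order0_entropy(original)
entropy_shuffled = order0_entropy(shuffled)

print(same_counts, entropy_original, entropy_shuffled)
```

Because the count ignores position, any statistic built only on it cannot tell a highly structured arrangement from a random one.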

In lossless compression, where preserving the arrangement is essential,
an algorithm may still change the order of the data to expose
redundancy in the source, and then restore the original arrangement in
the reverse phase (i.e., decompression).
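
One well-known transform of this kind is the Burrows-Wheeler transform. The post does not name it, so it is used here only as an example. The sketch below is a deliberately naive version (sort all rotations) meant for short strings; the function names and the `"\0"` end marker are assumptions made for illustration:

```python
def bwt(text: str, end: str = "\0") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if end in text:
        raise ValueError("end marker must not occur in the input")
    s = text + end
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last_column: str, end: str = "\0") -> str:
    """Rebuild the original text by repeatedly prepending and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the marker is the original text plus the marker.
    original = next(row for row in table if row.endswith(end))
    return original[:-1]

transformed = bwt("banana")
restored = inverse_bwt(transformed)
print(repr(transformed), restored)
```

The transformed string holds exactly the same symbols as the input plus the marker, only reordered so that equal symbols tend to sit together, and the inverse restores the original order exactly.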

Questions:
1. How may the source data be characterized?
2. How may the source data be characterized against these algorithms, to
choose which algorithm to use first?
3. How should the sub-algorithms be chosen to achieve the best compression?
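
As one crude way to approach question 1 (a sketch under stated assumptions, not an established selection procedure), the order-0 entropy of the data can be compared with its order-1 conditional entropy. A large gap suggests that symbols depend strongly on their neighbours, so a modelling or transform stage may pay off before entropy coding. The helper names and the sample input are hypothetical:

```python
from collections import Counter
from math import log2

def order0_entropy(data: bytes) -> float:
    """Bits per symbol when each symbol is modelled independently."""
    n = len(data)
    return -sum(c / n * log2(c / n) for c in Counter(data).values())

def order1_entropy(data: bytes) -> float:
    """Conditional entropy H(X_i | X_{i-1}) in bits per symbol."""
    pairs = Counter(zip(data, data[1:]))   # (previous, current) counts
    contexts = Counter(data[:-1])          # how often each previous symbol occurs
    total = len(data) - 1
    return -sum(c / total * log2(c / contexts[prev])
                for (prev, _cur), c in pairs.items())

# Every byte value occurs equally often, yet each byte is fully
# determined by the one before it.
structured = bytes(range(256)) * 4

h0 = order0_entropy(structured)
h1 = order1_entropy(structured)
print(h0, h1)
```

Here the symbol counts alone make the data look incompressible, while the order-1 figure shows it is entirely predictable from context.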
 
george
Posted: Mon Feb 07, 2005 3:42 am
Guest
I agree with your statement.
However, the probability distribution of the source data may not fall
into the Normal distribution, for which entropy coders work best.

To achieve such a distribution, the data is manipulated in some way,
such as by a transform, by prediction, or by some other method.

"I am interested in finding out when the data should be given to a
transform-based algorithm such as a wavelet transform, and when to a
prediction-based one; in both, of course, the transform coefficients or
the prediction errors follow a Normal distribution."

Finally, once the data follows the Normal distribution, it is given to
an entropy coder such as an arithmetic coder or range coder, and
compression is achieved.
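
The prediction step can be illustrated with the simplest predictor, "the next sample equals the previous one" (delta coding). This is a minimal sketch: the ramp signal and the function names are assumptions chosen for illustration. The residuals take far fewer distinct values than the samples, so their order-0 entropy is lower, and the step is exactly reversible:

```python
from collections import Counter
from math import log2

def order0_entropy(symbols) -> float:
    """Bits per symbol, computed from the counts of the given sequence."""
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in Counter(symbols).values())

def delta_encode(samples):
    """Replace each sample by its difference from the previous sample."""
    previous = 0
    residuals = []
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals

def delta_decode(residuals):
    """Undo delta_encode by accumulating the residuals."""
    total = 0
    samples = []
    for residual in residuals:
        total += residual
        samples.append(total)
    return samples

samples = [i // 3 for i in range(300)]   # slowly rising ramp
residuals = delta_encode(samples)

print(order0_entropy(samples), order0_entropy(residuals))
```

What helps the later entropy coder in this sketch is that the residuals are concentrated on a few values; the shape of that distribution is whatever the predictor leaves behind.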

I hope now the objective of the post is clear.

George Wilson
 
Matt Mahoney
Posted: Mon Feb 07, 2005 11:23 pm
Guest
george wrote:
[quote:0a56887427]I agree with your statement.
However the probability distribution of source data may not fall into
the Normal distribution, for which Entropy coders work the best.
[/quote:0a56887427]
An entropy coder does not work better with a Normal distribution. It
works with any distribution. An arithmetic coder comes within 1 bit of
the Shannon limit. This part of the problem is solved. The hard part
is estimating the distribution accurately.
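
A small sketch of why the estimate matters: an ideal entropy coder spends about -log2 p(symbol) bits per symbol under whatever model it is given. The data, the two models, and the function name below are illustrative assumptions. The same skewed, decidedly non-Normal data costs far fewer bits under a model that matches its statistics than under a uniform one:

```python
from collections import Counter
from math import log2

def ideal_code_length(data: bytes, model: dict) -> float:
    """Total bits an ideal entropy coder would spend: sum of -log2 p(symbol)."""
    return sum(-log2(model[symbol]) for symbol in data)

# Heavily skewed source: 90 'a', 9 'b', 1 'c'.
data = b"a" * 90 + b"b" * 9 + b"c"

# Model that matches the data's own symbol frequencies.
exact_model = {s: c / len(data) for s, c in Counter(data).items()}
# Model that wrongly assumes all three symbols are equally likely.
uniform_model = {s: 1 / 3 for s in exact_model}

bits_exact = ideal_code_length(data, exact_model)
bits_uniform = ideal_code_length(data, uniform_model)
print(bits_exact, bits_uniform)
```

The coder is the same in both cases; only the probability estimate changes, and with it the output size.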

-- Matt Mahoney
 
 
All times are GMT - 5 Hours