Main Page | Report this Page
Science Forum Index  »  Compression Forum  »  gzlib, zlib, jzilb
Page 1 of 1    

gzlib, zlib, jzilb

Author Message
Popai
Posted: Tue Feb 01, 2005 9:35 am
Guest
Deflate algorithm is described in RFC 1951 - DEFLATE Compressed Data
Format Specification version 1.3.
(http://www.gzip.org/zlib/rfc-deflate.html).

Hash is used to speed-up search of repeating patterns into the sliding
window.
Here is a small fragment from the RFC you might be interested.

" The compressor uses a chained hash table to find duplicated strings,
using a hash function that operates on 3-byte sequences. At any
given point during compression, let XYZ be the next 3 input bytes to
be examined (not necessarily all different, of course). First, the
compressor examines the hash chain for XYZ. If the chain is empty,
the compressor simply writes out X as a literal byte and advances
one
byte in the input. If the hash chain is not empty, indicating that
the sequence XYZ (or, if we are unlucky, some other 3 bytes with the
same hash function value) has occurred recently, the compressor
compares all strings on the XYZ hash chain with the actual input
data
sequence starting at the current point, and selects the longest
match.

The compressor searches the hash chains starting with the most
recent
strings, to favor small distances and thus take advantage of the
Huffman encoding. The hash chains are singly linked. There are no
deletions from the hash chains; the algorithm simply discards
matches
that are too old. To avoid a worst-case situation, very long hash
chains are arbitrarily truncated at a certain length, determined by
a
run-time parameter.
"


Best regards,
Popai
 
 
Page 1 of 1    
All times are GMT - 5 Hours
The time now is Fri Dec 11, 2009 7:49 am