"Compression" from Paul Pierce on 2007-11-28 (public-exi@w3.org from November 2007)

From: Paul Pierce <prp@teleport.com>
Date: 28 Nov 2007 01:25:02
To: "W3C EXI Public" <public-exi@w3.org>
Message-ID: <20071127T160820Z.1.prp@teleport.com>

I have two comments about the proposed EXI compression mechanism.

First, the method of determining block size is too loose. The main reason to block data before compression is to place a bound on memory usage, which is very useful even for systems that are not memory constrained. The encoder and decoder must be able to determine how much memory might be used for a block, and for efficiency most blocks should use most or all of that memory. In the proposed mechanism, the block size is fixed in terms of variable-sized entities (AT and CH values), so while it's possible in principle to determine an upper bound on memory usage for any block, it seems likely that in many cases every block in a given document would use only a small fraction of the maximum. Block size should be specified in bytes (octets).
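To illustrate the difference, here is a minimal Python sketch that splits a list of values into blocks two ways: by value count and by byte budget. The function names, the sample values, and the 512-byte budget are all made up for illustration and are not taken from the EXI draft. The sketch also assumes each value is already a byte string, and it lets a single oversized value occupy a block of its own.

```python
def blocks_by_count(values, max_values):
    """Split into blocks holding at most max_values values each."""
    for i in range(0, len(values), max_values):
        yield values[i:i + max_values]


def blocks_by_bytes(values, max_bytes):
    """Split into blocks whose total payload is at most max_bytes.

    Assumption: a value longer than max_bytes is emitted alone in its
    own block, so the bound holds only when no single value exceeds it.
    """
    block, size = [], 0
    for v in values:
        # Close the current block if adding v would exceed the budget.
        if block and size + len(v) > max_bytes:
            yield block
            block, size = [], 0
        block.append(v)
        size += len(v)
    if block:
        yield block


# Hypothetical AT/CH values of very uneven length.
values = [b"x" * n for n in (1, 2, 3, 500, 1, 2, 400, 3)]

count_blocks = list(blocks_by_count(values, 4))
byte_blocks = list(blocks_by_bytes(values, 512))

for name, blocks in (("by count", count_blocks), ("by bytes", byte_blocks)):
    print(name, [sum(len(v) for v in b) for b in blocks])
```

With count-based blocking the memory needed per block depends entirely on how long the values happen to be, while the byte-based splitter keeps every block close to, and never above, the stated budget for values that fit within it.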

Second, it's not clear that there is any advantage in compressing streams separately. The DEFLATE algorithm adapts, to some extent, to changing conditions in the data stream. It would be simpler to rearrange each data block into channels (in the proper order) and then run the whole block through DEFLATE.
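The two alternatives can be compared with a short Python sketch using the standard zlib module. The channel contents and helper names below are hypothetical, and the sketch makes no claim about which approach yields smaller output for real EXI documents; it only shows that both are easy to try on the same data.

```python
import zlib


def deflate_raw(data: bytes) -> bytes:
    """Compress to a raw DEFLATE stream (negative wbits omits the zlib header)."""
    c = zlib.compressobj(9, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()


def inflate_raw(data: bytes) -> bytes:
    """Decompress a raw DEFLATE stream."""
    return zlib.decompress(data, -15)


def compress_separately(channels):
    """One DEFLATE stream per channel."""
    return [deflate_raw(ch) for ch in channels]


def compress_together(channels):
    """Concatenate the channels in order, then one DEFLATE stream for the block."""
    return deflate_raw(b"".join(channels))


# Hypothetical channels for one block: dates in one, item names in another.
channels = [
    b"".join(b"2007-11-%02d" % d for d in range(1, 29)),
    b"".join(b"item%d" % i for i in range(200)),
]

separate = compress_separately(channels)
together = compress_together(channels)

print("separate streams:", sum(len(s) for s in separate), "bytes")
print("single stream:   ", len(together), "bytes")
```

Running both paths on representative documents would show whether separate streams buy anything over a single pass through DEFLATE.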

Paul Pierce

Received on Wednesday, 28 November 2007 01:57:51 UTC