- From: Roberto Peon <grmocg@gmail.com>
- Date: Thu, 17 Oct 2013 11:11:48 -0700
- To: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNc+EQwZ_d0+wtMuZP6tojbJcs_sciR+BXJThCXk4dN_=A@mail.gmail.com>
On Thu, Oct 17, 2013 at 7:17 AM, Ilari Liusvaara < ilari.liusvaara@elisanet.fi> wrote: > On Wed, Oct 16, 2013 at 12:20:08PM -0700, Roberto Peon wrote: > > The first cut of Humman encoding is now included in the compression > > document. > > > > Please take a look and shout about the parts that you find > confusing/could > > be better expressed. > > - Lengths are marked as 8+. Are those 8 bit prefix or 0 bit prefix > (IIRC, those were 0 bit in some past versions)? > Do you mean the lengths in the huffman table? > - Is encountering a name/value without End-Of-String an error? - Is encountering a name/value with more bytes after EOS an error? > * Appears to be string delimiter for values? > We have a few options here. 1) Represent length of huffman-encoded strings as *bits* in the length field I've done this, it works, but it bloats the encoding. 2) Represent length of huffman-encoded strings as bytes and a) include an EOF (or EOS :) ) terminal to indicate when the last valid bit was read b) pad the last byte with bits from one of the 8+ bit symbols c) assume that the last character successfully decoded within the bytes is the last one intended, regardless of choice of a/b 2b suffers if we don't have characters like this, though we could manufacture one (perhaps EOF/EOS?). Hmm.. this could work better than what we have now as it would free up some a lower-bit-length code for an element of application data. 2c is probably a good idea regardless, as it means that the EOF/EOS is never going to require adding additional bytes to the encoded data. How about 2b+2c then (I'll need to regenerate the symbol tables, but that shouldn't affect anyone)? -=R > > -Ilari >
Received on Thursday, 17 October 2013 18:12:15 UTC