- From: Fred Akalin <akalin@google.com>
- Date: Thu, 17 Oct 2013 18:56:54 -0700
- To: Roberto Peon <grmocg@gmail.com>
- Cc: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CANUYc_RKHWDvpoZmgOUKsgj3UrF2YuHNojccnebs2pUM6h7Y7w@mail.gmail.com>
An alternative is to mandate the padding character, e.g. say that it must be the least code lexicographically that has >= 8 (not >= 7) bits. Or just arbitrarily pick one and christen it the EOS bit, too. Not sure if it's worth saving an 8+ code, though. On Thu, Oct 17, 2013 at 5:29 PM, Roberto Peon <grmocg@gmail.com> wrote: > I know, I could instead state that the padding must be any code >= 7 bits, > but then implementations have to choose which character is to be used for > this, and that might vary based on the frequency table. Bleh. > > By using EOS, we guarantee that when the table changes, then the symbol > one encodes as padding is always the same symbol. > > Do you want to reopen the issue so we discuss it at Vancouver, or do you > find that persuasive? > > -=R > > > On Thu, Oct 17, 2013 at 5:14 PM, Ilari Liusvaara < > ilari.liusvaara@elisanet.fi> wrote: > >> On Thu, Oct 17, 2013 at 02:32:19PM -0700, Roberto Peon wrote: >> > On Thu, Oct 17, 2013 at 2:15 PM, Fred Akalin <akalin@google.com> wrote: >> > > >> > > - I may be missing something, but I'm not sure why we need a string >> > > literal to be both length-delimited and have an end marker. I'd >> prefer just >> > > having the length and assigning the short encoding for EOS to >> something >> > > else. >> > > >> > >> > Since length is in bytes instead of bits, but huffman encoded things are >> > bit based, you need either to represent length in bits (which makes the >> > lengths bigger), or you need to ensure that the padding used to get to >> the >> > next byte boundary at the end of the string cannot be interpreted as a >> > valid huffman-code. >> >> You don't need EOS code for this. >> >> This is because there are more than 128 source symbols and target alphabet >> is binary, and thus there exists 8+ bit code. Prefix of that can act as >> padding, it can't be mixed up with valid symbol. >> >> -Ilari >> > >
Received on Friday, 18 October 2013 01:57:23 UTC