- From: Roberto Peon <grmocg@gmail.com>
- Date: Thu, 17 Oct 2013 17:45:18 -0700
- To: Fred Akalin <akalin@google.com>
- Cc: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNfinXx6DxZucc=XkD42fscJq4WB_WRrzY9GznLZztQ71A@mail.gmail.com>
In other words, we can construct the code table such that padding with zeroes is always acceptable, but it requires care in the construction of the code table. -=R On Thu, Oct 17, 2013 at 5:44 PM, Roberto Peon <grmocg@gmail.com> wrote: > That is probably what would happen. > There is an issue (in github already) about inverting the prefix sense > (i.e. 1->0 and vice versa) so that padding is zeros, however the part one > must be very careful about is that this is not guaranteed to happen for any > entry. > -=R > > > On Thu, Oct 17, 2013 at 5:38 PM, Fred Akalin <akalin@google.com> wrote: > >> Random proposal, not sure if it's feasible. >> >> Would it be possible to explicitly map EOS to '0000000', and therefore we >> can just say to pad with 0s until the next byte boundary? >> >> >> On Thu, Oct 17, 2013 at 5:29 PM, Roberto Peon <grmocg@gmail.com> wrote: >> >>> I know, I could instead state that the padding must be any code >= 7 >>> bits, but then implementations have to choose which character is to be used >>> for this, and that might vary based on the frequency table. Bleh. >>> >>> By using EOS, we guarantee that when the table changes, then the symbol >>> one encodes as padding is always the same symbol. >>> >>> Do you want to reopen the issue so we discuss it at Vancouver, or do you >>> find that persuasive? >>> >>> -=R >>> >>> >>> On Thu, Oct 17, 2013 at 5:14 PM, Ilari Liusvaara < >>> ilari.liusvaara@elisanet.fi> wrote: >>> >>>> On Thu, Oct 17, 2013 at 02:32:19PM -0700, Roberto Peon wrote: >>>> > On Thu, Oct 17, 2013 at 2:15 PM, Fred Akalin <akalin@google.com> >>>> wrote: >>>> > > >>>> > > - I may be missing something, but I'm not sure why we need a string >>>> > > literal to be both length-delimited and have an end marker. I'd >>>> prefer just >>>> > > having the length and assigning the short encoding for EOS to >>>> something >>>> > > else. >>>> > > >>>> > >>>> > Since length is in bytes instead of bits, but huffman encoded things >>>> are >>>> > bit based, you need either to represent length in bits (which makes >>>> the >>>> > lengths bigger), or you need to ensure that the padding used to get >>>> to the >>>> > next byte boundary at the end of the string cannot be interpreted as a >>>> > valid huffman-code. >>>> >>>> You don't need EOS code for this. >>>> >>>> This is because there are more than 128 source symbols and target >>>> alphabet >>>> is binary, and thus there exists 8+ bit code. Prefix of that can act as >>>> padding, it can't be mixed up with valid symbol. >>>> >>>> -Ilari >>>> >>> >>> >> >
Received on Friday, 18 October 2013 00:45:46 UTC