Re: Understanding how HPAC draft-02 works from Tatsuhiro Tsujikawa on 2013-09-06 (ietf-http-wg@w3.org from July to September 2013)

From: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
Date: Fri, 6 Sep 2013 23:14:17 +0900
To: Roberto Peon <grmocg@gmail.com>
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <CAPyZ6=JNY-AgdR-S-LXNLHo_A11b8bbVcof-idZD-dhGHDXYFQ@mail.gmail.com>
I simplified my previous HPACK draft-03 encoder algorithm a bit and made it
a one-pass encoder.
The key point is that we don't have to emit the toggle-offs first. After the
whole current header set has been processed, the entries that are in the
reference set but are marked neither "emitted" nor "common-header" are
removed from it. To handle eviction of a common header, emit 2 indexed
representations for it just before the removal.
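
To illustrate why 2 indexed representations are enough to emit an entry
that is already in the reference set, here is a minimal decoder sketch in
Python. It reflects my reading of the draft-03 toggle semantics and only
handles indexed representations. The name decode_block and the data layout
(a list for the header table, a set of indices for the reference set) are
made up for this illustration.

```python
def decode_block(header_table, reference_set, indices):
    """Decode a header block made only of indexed representations.

    header_table:  list of (name, value) pairs (illustrative layout)
    reference_set: set of header table indices, updated in place
    indices:       the indexed representations, in wire order
    """
    emitted_now = set()   # entries explicitly emitted in this block
    out = []
    for idx in indices:
        if idx in reference_set:
            # Entry already referenced: toggle it off, emit nothing.
            reference_set.discard(idx)
        else:
            # Entry not referenced: add it and emit the header.
            reference_set.add(idx)
            emitted_now.add(idx)
            out.append(header_table[idx])
    # End of block: entries still in the reference set that were not
    # emitted explicitly are emitted implicitly.
    for idx in sorted(reference_set - emitted_now):
        out.append(header_table[idx])
    return out
```

With a single entry at index 0 that is already in the reference set, an
empty block emits it once (implicitly), the sequence 0, 0 also emits it
once (explicitly), and 0, 0, 0, 0 emits it twice.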

1. For each name/value pair in the current header set:

1.1. If the name/value pair is present in the header table, and the
     corresponding entry in the header table is NOT in the
     reference set, add the entry to the reference set and encode
     it as an indexed representation. Mark the entry "emitted".

1.2. If the name/value pair is present in the header table, and the
     corresponding entry in the header table is in the reference
     set:

1.2.1. If the entry is marked as "common-header", then this is
       the 2nd occurrence of the same indexed representation. To
       encode this name/value pair, we have to encode 4 indexed
       representations: 2 for the 1st occurrence (the name/value
       pair processed in 1.2.3.), and another 2 for the current
       name/value pair. Unmark the entry "common-header" and
       mark it "emitted".

1.2.2. If the entry is marked as "emitted", then this is a further
       occurrence of the same indexed representation. This time,
       we just encode 2 indexed representations.

1.2.3. Otherwise, just mark the entry "common-header" and do not
       encode it at the moment.

1.3. If the name/value pair is not present in the header table,
     the encoder encodes it as a literal representation.
     On eviction or substitution, if the entry to be removed is
     in the reference set and marked as "common-header", encode
     it as 2 indexed representations before the removal. On
     removal, the entry is also removed from the reference set.

2. For each entry in the reference set: if the entry is marked
   neither "emitted" nor "common-header", remove it from the
   reference set and encode its index as an indexed
   representation.

3. After the whole current header set has been processed, unmark
   all entries in the header table.
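
As a rough illustration, steps 1 through 3 can be sketched in Python as
below. This is only a sketch under several assumptions that are not part
of the algorithm text: the class and field names (Entry, Encoder,
max_entries) are made up, the table limit is counted in entries rather
than bytes, every literal is indexed by appending it to the table (no
substitution), a newly indexed entry joins the reference set and counts as
"emitted", and the oldest entry sits at index 0 and is evicted first. The
output is a list of abstract operations, not wire bytes.

```python
class Entry:
    def __init__(self, name, value):
        self.name, self.value = name, value
        self.in_refset = False
        self.mark = None  # None, "emitted" or "common-header"


class Encoder:
    def __init__(self, max_entries=4):
        self.table = []  # header table, oldest entry first (assumption)
        self.max_entries = max_entries

    def _find(self, name, value):
        for i, e in enumerate(self.table):
            if (e.name, e.value) == (name, value):
                return i, e
        return None, None

    def encode(self, headers):
        ops = []
        for name, value in headers:                   # step 1
            i, e = self._find(name, value)
            if e is None:                             # step 1.3
                if len(self.table) == self.max_entries:
                    victim = self.table[0]
                    if victim.in_refset and victim.mark == "common-header":
                        # Emit the common header just before it is evicted.
                        ops += [("indexed", 0)] * 2
                    self.table.pop(0)
                ops.append(("literal", name, value))
                new = Entry(name, value)
                new.in_refset, new.mark = True, "emitted"
                self.table.append(new)
            elif not e.in_refset:                     # step 1.1
                e.in_refset, e.mark = True, "emitted"
                ops.append(("indexed", i))
            elif e.mark == "common-header":           # step 1.2.1
                ops += [("indexed", i)] * 4
                e.mark = "emitted"
            elif e.mark == "emitted":                 # step 1.2.2
                ops += [("indexed", i)] * 2
            else:                                     # step 1.2.3
                e.mark = "common-header"
        for i, e in enumerate(self.table):            # step 2
            if e.in_refset and e.mark is None:
                e.in_refset = False
                ops.append(("indexed", i))
        for e in self.table:                          # step 3
            e.mark = None
        return ops
```

For example, with max_entries=2, encoding [a, b] produces two literals. A
following header set [a, a] produces four indexed representations for a
(step 1.2.1) and then one for b, which toggles b off (step 2).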

Best regards,

Tatsuhiro Tsujikawa


On Sat, Aug 24, 2013 at 10:56 PM, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> wrote:

>
> On Sat, Aug 24, 2013 at 7:06 PM, Roberto Peon <grmocg@gmail.com> wrote:
>
>> Correct, which is why emitting references first is an easy optimization :)
>>
>> You should take a look at the pseudo-code here:
>> http://tools.ietf.org/html/draft-rpeon-httpbis-header-compression-03#section-10
>> This is for the delta2 encoding scheme, which is a little different, but
>> the approach to dealing with this issue is in there, and is not so bad.
>>
>>
> Thank you for the pointer. It is very insightful. So the encoder should do
> a bit more processing to make the
> decoder simple.
>
> For the record, here is the revised algorithm, which does not require
> header emissions on eviction in decoding, so it conforms to the draft.
> The key point is that, when encoding, if a "common-header" entry is
> removed from the header table, we keep track of it and later encode it
> just like the other name/value pairs.
>
> 0. keep_set is initialized as empty set.
>
> 1. For each entry in the reference set, check that it is present
>    in the current header set. If it is not, encode it as indexed
>    representation and remove it from the reference set.
>
> 2. For each entry in the reference set, check that it is present
>    in the current header set. If it is present, mark the entry
>    as "common-header" and remove the matching name/value pair
>    from current header set (if multiple name/value pairs are
>    matched, only one of them is removed from the current header
>    set).
>
> 3. Encode the remaining name/value pairs in the current header set. For
>    each name/value pair:
>
> 3.1. If name/value pair is present in the header table, and the
>      corresponding entry in the header table is NOT in the
>      reference set, add the entry to the reference set and encode
>      it as indexed representation. Mark the entry "emitted".
>
> 3.2. If name/value pair is present in the header table, and the
>      corresponding entry in the header table is in the reference
>      set: If the entry is marked as "common-header", then this is
>      the 2nd occurrence of the same indexed representation. To
>      encode this name/value pair, we have to encode 4 indexed
>      representations: 2 for the 1st one (which was removed in step
>      2), and another 2 for the current name/value pair.
>      Unmark the entry "common-header" and mark it "emitted".
>
>      If the entry is marked as "emitted", then this is a further
>      occurrence of the same indexed representation. This time,
>      we just encode 2 indexed representations.
>
> 3.3. Otherwise, encoder encodes name/value pair as literal
>      representation.  On eviction or substitution, if the removed
>      entry is in the reference set, it is removed from the
>      reference set. If the removed entry is marked
>      as "common-header", add it to keep_set.
>
> 4. For the name/value pair of each entry in keep_set, do the same
>    processing described in 3.1 through 3.3. keep_set may be
>    updated during the iteration.
>
> 5. After the whole current header set has been processed, unmark
>    all entries in the header table.
>
>
Received on Friday, 6 September 2013 14:15:04 UTC