
Re: HPACK problems (was http/2 & hpack protocol review)

From: Cory Benfield <cory@lukasa.co.uk>
Date: Thu, 8 May 2014 16:24:57 +0100
Message-ID: <CAH_hAJEfD5k-=dZao0rPsor340aiVZyEsraMOxySftr9LNf6rg@mail.gmail.com>
To: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Cc: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, Daniel Stenberg <daniel@haxx.se>, James M Snell <jasnell@gmail.com>, "K.Morgan@iaea.org" <K.Morgan@iaea.org>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "C.Brunhuber@iaea.org" <C.Brunhuber@iaea.org>
On 7 May 2014 10:35, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> wrote:
> The intent is to allow duplicates in the header set. In an ideal world, it would
> be an 'actual set', but unfortunately, in my experimentation for building a header
> compression mechanism, I found several occurrences of real-world message
> headers containing duplicates. To support these use cases, HPACK has to
> allow for duplicates in the header set. I'm going to update the definition to
> make things clear.
> On the other hand, the reference set is an 'actual set': it contains references
> to entries of the header table, and must not contain the same reference multiple
> times. However, it may contain two references resolving to the same header
> field, if this header field is contained in several entries of the header table.

Fair enough: these didn't map to my expectations when reading the
spec, but they're obviously both totally reasonable ways to define
these terms. There are several moving pieces here and a fair bit of
subtlety, but I'm sure I can come up with something acceptable. I'm
looking forward to an updated spec so I can rewrite my entire
implementation again. =)

> The trick to encode a duplicate header field is to encode it first as a literal,
> adding it to the header table and to the reference set, then to encode it twice
> as an index: the first index removing it from the reference set, and the second
> index adding it again to the reference set and to the encoded collection of
> headers.
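To make sure I follow, here's a toy sketch of that trick using hypothetical data structures (not any real HPACK library's API): the header table as a list of (name, value) entries, the reference set as a set of table indices, and `emitted` standing in for the decoded output.

```python
header_table = []      # dynamic header table: list of (name, value) entries
reference_set = set()  # indices into header_table currently referenced
emitted = []           # header fields produced for the current header set

def encode_literal(name, value):
    """Literal with incremental indexing: add to the table, reference it, emit it."""
    header_table.append((name, value))
    index = len(header_table) - 1
    reference_set.add(index)
    emitted.append((name, value))
    return index

def encode_indexed(index):
    """Indexed representation: toggles membership in the reference set."""
    if index in reference_set:
        reference_set.remove(index)   # first index: drop the reference
    else:
        reference_set.add(index)      # second index: re-add it and emit again
        emitted.append(header_table[index])

# Emitting the same header field twice in one header set:
i = encode_literal("x-custom", "a")   # literal: table + reference set + output
encode_indexed(i)                     # removes it from the reference set
encode_indexed(i)                     # adds it back and emits a second copy

assert emitted == [("x-custom", "a"), ("x-custom", "a")]
```

So a single duplicated field costs one literal plus two indexed representations, which is what prompts my question below.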

This seems like exactly the kind of behaviour that leads to someone
suggesting a performance optimisation, and I'd love to bikeshed this
for a moment if you'd allow me. Is there any reason that HPACK
couldn't mandate that duplicate headers be forbidden in the same
header set? We have the ability to join the duplicates together into a
single header (with their values joined by null bytes), so it's in
principle possible for HPACK encoders to make this transformation.
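The transformation I have in mind is something like the following (a hypothetical helper, not anything from the spec or an existing encoder): collapse duplicate names before handing the header set to HPACK, joining their values with NUL bytes.

```python
from collections import OrderedDict

def coalesce(headers):
    """Collapse duplicate header names, joining values with NUL bytes,
    while preserving first-occurrence order of the names."""
    merged = OrderedDict()
    for name, value in headers:
        if name in merged:
            merged[name] = merged[name] + "\0" + value
        else:
            merged[name] = value
    return list(merged.items())

headers = [("accept", "text/html"), ("x-tag", "a"), ("x-tag", "b")]
assert coalesce(headers) == [("accept", "text/html"), ("x-tag", "a\x00b")]
```

With that in place, an encoder never needs the remove-then-re-add indexing dance at all.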

The only argument I can see is that 'streaming' HPACK encoders (those
that don't have the full set of headers available to them) aren't able
to spot this optimisation up-front, and so can't make it (and
therefore can't comply). I don't really feel like we need to
accommodate such encoders, for two reasons. Firstly, I'm pretty sure
that no service generates so many headers that a HPACK encoder
couldn't deal with them in one go. Secondly, the ability to emit
headers straight away is almost totally unhelpful given that, even for
the largest of header sets, it's unlikely that HPACK encoding would
take more than a few tens of milliseconds, well under an RTT.

I accept that I'm wearing blinders here, because I deal with users who
always know what headers they're going to apply, so please tell me
what I'm missing.
Received on Thursday, 8 May 2014 15:25:25 UTC