Re: UTF-8 or ASCII Header Names? from James M Snell on 2013-08-14 (ietf-http-wg@w3.org from July to September 2013)

From: James M Snell <jasnell@gmail.com>
Date: Tue, 13 Aug 2013 19:34:18 -0700
To: Roberto Peon <grmocg@gmail.com>
Cc: Fred Akalin <akalin@google.com>, HTTP Working Group <ietf-http-wg@w3.org>, Martin Thomson <martin.thomson@gmail.com>
Message-ID: <CABP7Rbc2B+S-kn9Z4pO6DaCz9JSeFVn4yqL4h5wuNbs2Wx3kpA@mail.gmail.com>

It may have been agreed to at the interim, but the spec has not yet
been updated to reflect that. The header compression draft, for
instance, only says that header field names are encoded as "literal
strings" which are defined as being UTF-8.  The drafts need to be
fixed to reflect what has been decided.
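
To make the constraint concrete, here is a minimal sketch of a check
against the grammar quoted below (header-name = [":"] 1*header-char).
The function name and the regular expression are my own illustration,
not text from either draft, and it assumes the name has already been
decoded to a string.

```python
import re

# Assumption: this character class is transcribed by hand from the
# header-char ABNF quoted in this thread; it is not normative text.
#   header-char = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" /
#                 "." / "^" / "_" / "`" / "|" / "~" / DIGIT / LOWERALPHA
#   header-name = [":"] 1*header-char
_HEADER_NAME = re.compile(r":?[!#$%&'*+\-.^_`|~0-9a-z]+")


def is_valid_header_name(name: str) -> bool:
    """Return True if name matches the proposed lower-case grammar.

    Hypothetical helper: no transformation is applied, so an
    upper-case or non-ASCII name is simply rejected.
    """
    return _HEADER_NAME.fullmatch(name) is not None
```

Under this sketch "content-type" and ":method" are accepted, while
"Content-Type", ":" on its own, and any name containing non-ASCII
characters are rejected rather than silently rewritten.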

On Tue, Aug 13, 2013 at 7:31 PM, Roberto Peon <grmocg@gmail.com> wrote:
> I thought we agreed about this already at the interim?
>
> Header keys are a subset of us-ascii as defined in the 1.1 spec, and are
> case insensitive as defined in the 1.1 spec.
>
> The compressor spec will/would then include a MUST transform keys to
> lowercase.
>
> -=R
>
> On Aug 13, 2013 4:03 PM, "Fred Akalin" <akalin@google.com> wrote:
>>
>> That opens another can of worms, which is Unicode string comparison.
>>
>> If not ASCII, I'd rather have header values be arbitrary octet strings and
>> for string equality to be byte-wise; then you can put UTF-8 in there if you
>> wish.
>>
>> On Tue, Aug 13, 2013 at 3:39 PM, James M Snell <jasnell@gmail.com> wrote:
>>>
>>> -1, that's certainly not the recommendation I was making.
>>>
>>> Header field names ought not be UTF-8... Allowing UTF-8 in header
>>> field values is extremely valuable.
>>>
>>> - James
>>>
>>> On Tue, Aug 13, 2013 at 3:36 PM, Fred Akalin <akalin@google.com> wrote:
>>> > I'm definitely for removing any reference to UTF-8 in the header
>>> > compression spec, if only to avoid the giant can of worms it
>>> > introduces with lower-casing.
>>> >
>>> >
>>> > On Tue, Aug 13, 2013 at 3:26 PM, James M Snell <jasnell@gmail.com> wrote:
>>> >>
>>> >> Thanks for catching the missing ":" ... and yes, [":"] 1*header-char
>>> >> is a much better choice.
>>> >>
>>> >> -1 to adding any "nuance" or transformations, however. Let's be clear
>>> >> and strict about this: an HTTP/2 header field name ought to always
>>> >> match... period.
>>> >>
>>> >>     LOWERALPHA = %x61-7A
>>> >>     header-char = "!" / "#" / "$" / "%" / "&" / "'" /
>>> >>                   "*" / "+" / "-" / "." / "^" / "_" /
>>> >>                   "`" / "|" / "~" / DIGIT / LOWERALPHA
>>> >>     header-name = [":"] 1*header-char
>>> >>
>>> >> We don't need any other options or "nuance" here.
>>> >>
>>> >> - James
>>> >>
>>> >>
>>> >> On Tue, Aug 13, 2013 at 3:20 PM, Martin Thomson <martin.thomson@gmail.com> wrote:
>>> >> > On 13 August 2013 23:08, James M Snell <jasnell@gmail.com> wrote:
>>> >> >> Recommend that we specify in both the HTTP/2 and Header Compression
>>> >> >> spec that header names MUST conform to:
>>> >> >>
>>> >> >>     LOWERALPHA = %x61-7A
>>> >> >>     header-name = "!" / "#" / "$" / "%" / "&" / "'" /
>>> >> >>                   "*" / "+" / "-" / "." / "^" / "_" /
>>> >> >>                   "`" / "|" / "~" / DIGIT / LOWERALPHA
>>> >> >>
>>> >> >> Which is the all-lower-case equivalent to the header-name
>>> >> >> definition currently in httpbis.
>>> >> >
>>> >> > Actually, it's:
>>> >> >     LOWERALPHA = %x61-7A
>>> >> >     header-char = "!" / "#" / "$" / "%" / "&" / "'" /
>>> >> >                   "*" / "+" / "-" / "." / "^" / "_" /
>>> >> >                   "`" / "|" / "~" / DIGIT / LOWERALPHA
>>> >> >     header-name = (":" / header-char) *header-char
>>> >> >
>>> >> > though this might be better:
>>> >> >     header-name = [":"] 1*header-char
>>> >> >
>>> >> > and if we're feeling especially generous:
>>> >> >     header-name = 1*(":" / header-char)
>>> >> >
>>> >> > This sounds reasonable - though I think that this needs to be a
>>> >> > little more nuanced.  Header compression might describe a
>>> >> > transformation that produces the limited set of values as
>>> >> > described above, but the *input* to header compression needs to
>>> >> > be a valid HTTP header (or a special HTTP/2.0 :-header).
>>> >>
>>> >
>>
>>
>
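
As an aside on the lower-casing "can of worms" mentioned in the quoted
thread: a small sketch of why ASCII-only lower-casing plus byte-wise
comparison is the simpler rule. The helper names here are hypothetical
and used only for illustration.

```python
def ascii_lower(name: bytes) -> bytes:
    """Lower-case only ASCII A-Z; every other octet is left untouched.

    bytes.lower() in Python maps ASCII letters only, so the result has
    the same length as the input.
    """
    return name.lower()


def names_equal(a: bytes, b: bytes) -> bool:
    """Byte-wise equality, with no normalization or case folding."""
    return a == b


# Unicode-aware lower-casing on decoded text is less predictable:
kelvin = "\u212a"           # KELVIN SIGN, not the ASCII letter "K"
dotted_i = "\u0130"         # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Two different code points collapse to the same lower-case letter.
collides = kelvin.lower() == "K".lower()

# One code point becomes two after lower-casing.
grows = len(dotted_i.lower()) == 2
```

With octet strings and ASCII-only lower-casing neither effect can
occur: the Kelvin sign's UTF-8 encoding stays distinct from "k", and
the length of a name never changes.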

Received on Wednesday, 14 August 2013 02:35:05 UTC