Re: UTF-8 or ASCII Header Names? from Fred Akalin on 2013-08-13 (ietf-http-wg@w3.org from July to September 2013)

From: Fred Akalin <akalin@google.com>
Date: Tue, 13 Aug 2013 16:01:39 -0700
To: James M Snell <jasnell@gmail.com>
Cc: Martin Thomson <martin.thomson@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <CANUYc_R3c5RSEiBaYtLN1+wCDX7kQ8rbUYBrrPSRBs=EZvc_tg@mail.gmail.com>

That opens another can of worms, which is Unicode string comparison.

If not ASCII, I'd rather have header values be arbitrary octet strings and
for string equality to be byte-wise; then you can put UTF-8 in there if you
wish.
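
To illustrate the difference, here is a minimal Python sketch. The helper names bytewise_equal and unicode_equal are made up for this example, and NFC normalization is only one of several possible Unicode comparison rules, not something any draft specifies. Two UTF-8 spellings of the same visible character compare unequal byte-wise, but equal once a normalization step is introduced:

```python
import unicodedata

# Two spellings of "e with acute accent":
# precomposed U+00E9 versus "e" followed by combining acute U+0301.
composed = "\u00e9"
decomposed = "e\u0301"


def bytewise_equal(a: bytes, b: bytes) -> bool:
    """Treat values as opaque octet strings: no decoding, no case folding."""
    return a == b


def unicode_equal(a: str, b: str) -> bool:
    """One possible Unicode-aware rule (an assumption): NFC-normalize, then compare."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed_bytes = composed.encode("utf-8")      # two octets
decomposed_bytes = decomposed.encode("utf-8")  # three octets

print(bytewise_equal(composed_bytes, decomposed_bytes))
print(unicode_equal(composed, decomposed))
```

With byte-wise equality the protocol never has to pick a normalization form; any such policy is left to whoever chooses to put UTF-8 in the value.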

On Tue, Aug 13, 2013 at 3:39 PM, James M Snell <jasnell@gmail.com> wrote:

> -1, that's certainly not the recommendation I was making.
>
> Header field names ought not be UTF-8... Allowing UTF-8 in header
> field values is extremely valuable.
>
> - James
>
> On Tue, Aug 13, 2013 at 3:36 PM, Fred Akalin <akalin@google.com> wrote:
> > I'm definitely for removing any reference to UTF-8 in the header
> > compression spec, if only to avoid the giant can of worms it introduces
> > with lower-casing.
> >
> >
> > On Tue, Aug 13, 2013 at 3:26 PM, James M Snell <jasnell@gmail.com> wrote:
> >>
> >> Thanks for catching the missing ":" ... and yes, [":"] 1*header-char
> >> is a much better choice.
> >>
> >> -1 to adding any "nuance" or transformations, however. Let's be clear
> >> and strict about this: an HTTP/2 header field name ought to always
> >> match... period.
> >>
> >>     LOWERALPHA = %x61-7A
> >>     header-char = "!" / "#" / "$" / "%" / "&" / "'" /
> >>                   "*" / "+" / "-" / "." / "^" / "_" /
> >>                   "`" / "|" / "~" / DIGIT / LOWERALPHA
> >>     header-name = [":"] 1*header-char
> >>
> >> We don't need any other options or "nuance" here.
> >>
> >> - James
> >>
> >>
> >> On Tue, Aug 13, 2013 at 3:20 PM, Martin Thomson
> >> <martin.thomson@gmail.com> wrote:
> >> > On 13 August 2013 23:08, James M Snell <jasnell@gmail.com> wrote:
> >> >> Recommend that we specify in both the HTTP/2 and Header Compression
> >> >> spec that header names MUST conform to:
> >> >>
> >> >>     LOWERALPHA = %x61-7A
> >> >>     header-name = "!" / "#" / "$" / "%" / "&" / "'" /
> >> >>                   "*" / "+" / "-" / "." / "^" / "_" /
> >> >>                   "`" / "|" / "~" / DIGIT / LOWERALPHA
> >> >>
> >> >> Which is the all-lower-case equivalent to the header-name definition
> >> >> currently in httpbis.
> >> >
> >> > Actually, it's:
> >> >     LOWERALPHA = %x61-7A
> >> >     header-char = "!" / "#" / "$" / "%" / "&" / "'" /
> >> >                   "*" / "+" / "-" / "." / "^" / "_" /
> >> >                   "`" / "|" / "~" / DIGIT / LOWERALPHA
> >> >     header-name = (":" / header-char) *header-char
> >> >
> >> > though this might be better:
> >> >     header-name = [":"] 1*header-char
> >> >
> >> > and if we're feeling especially generous:
> >> >     header-name = 1*(":" / header-char)
> >> >
> >> > This sounds reasonable - though I think that this needs to be a little
> >> > more nuanced.  Header compression might describe a transformation that
> >> > produces the limited set of values as described above, but the *input*
> >> > to header compression needs to be a valid HTTP header (or a special
> >> > HTTP/2.0 :-header).
> >>
> >
>
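
For reference, the strict grammar quoted above (header-name = [":"] 1*header-char, with header-char limited to the listed punctuation, DIGIT, and LOWERALPHA) can be checked with a short Python sketch. The names HEADER_NAME and is_valid_header_name are hypothetical, and working on bytes rather than decoded text is an assumption made here to keep the check purely octet-based:

```python
import re

# header-char = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
#               "^" / "_" / "`" / "|" / "~" / DIGIT / LOWERALPHA
# header-name = [":"] 1*header-char
HEADER_NAME = re.compile(rb":?[!#$%&'*+\-.^_`|~0-9a-z]+")


def is_valid_header_name(name: bytes) -> bool:
    """Return True if the octets match the strict lower-case grammar exactly."""
    return HEADER_NAME.fullmatch(name) is not None


for candidate in (b"content-type", b":method", b"Content-Type", b":", b"a:b"):
    print(candidate, is_valid_header_name(candidate))
```

Because the check is a plain match with no lower-casing or other transformation, an upper-case name is simply rejected rather than rewritten.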

Received on Tuesday, 13 August 2013 23:02:06 UTC