Re: UTF-8 or ASCII Header Names?

From: Roberto Peon <grmocg@gmail.com>
Date: Fri, 16 Aug 2013 08:44:46 -0700
Message-ID: <CAP+FsNcKr1CGdtYhG3r-Nn5qOhVFX7JLWj8rtdSG1CLXKOsQmw@mail.gmail.com>
To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Cc: Fred Akalin <akalin@google.com>, James Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, Martin Thomson <martin.thomson@gmail.com>
The compressor could be aware of the content of fields, but right now, with
the sole exception of the cookie field, it has no need to be, so why
impose what is unnecessary and potentially harmful?

Why should the compressor be required to do utf-8 normalization?

Such requirements belong at the http/1 spec level, or perhaps in the http2
spec, but not in the compressor.

The keys should be ASCII, and the values bytes. This allows for utf-8 to be
compressed as well, though compression efficiency will go down if the input
isn't normalized AND if there is a lot of variation in the encodings of a
particular string.
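
To illustrate the normalization point, here is a minimal sketch using
Python's standard unicodedata module. The helper name header_value_bytes
is hypothetical and is not part of any HTTP/2 draft or compressor. It
encodes the same visible string in two Unicode normalization forms and
shows that the resulting octets differ, so a compressor that treats
values as opaque bytes would see two distinct values.

```python
import unicodedata


def header_value_bytes(text: str, form: str) -> bytes:
    """Normalize text to the given Unicode form, then encode it as UTF-8.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize(form, text).encode("utf-8")


name = "D\u00fcrst"  # "Dürst", written with an escape to keep the source ASCII

nfc = header_value_bytes(name, "NFC")  # u-umlaut as one code point, U+00FC
nfd = header_value_bytes(name, "NFD")  # "u" followed by combining U+0308

# The two values look identical to a reader but differ on the wire,
# so a byte-wise comparison treats them as different strings.
print(nfc)
print(nfd)
print(nfc == nfd)
```

If senders agree on one form, repeated values stay byte-identical, which
is the case where treating values as plain bytes costs nothing.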

On Aug 16, 2013 5:31 AM, Martin J. Dürst <duerst@it.aoyama.ac.jp> wrote:

> On 2013/08/14 8:01, Fred Akalin wrote:
>> That opens another can of worms, which is Unicode string comparison.
>> If not ASCII, I'd rather have header values be arbitrary octet strings and
>> for string equality to be byte-wise; then you can put UTF-8 in there if
>> you wish.
> The semantics of comparisons of header field values will depend on the
> specific header field. A field with a date is compared differently from a
> field with e.g. a domain name or a URI or whatever else.
> And using UTF-8 doesn't exclude using bitwise equality, if that makes
> sense for some header.
> But saying data is binary and then having each header have to define how
> to use characters on top of that is really a bad idea. Just make it UTF-8,
> and be done with it.
> Regards,   Martin.
Received on Friday, 16 August 2013 15:45:13 UTC
