Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range from Martin Thomson on 2016-12-14 (ietf-http-wg@w3.org from October to December 2016)

From: Martin Thomson <martin.thomson@gmail.com>
Date: Wed, 14 Dec 2016 22:37:45 +1100
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Julian Reschke <julian.reschke@gmx.de>, Alexey Melnikov <alexey.melnikov@isode.com>, Matthew Kerwin <matthew@kerwin.net.au>, Kari Hurtta <hurtta-ietf@elmme-mailer.org>, Ilari Liusvaara <ilariliusvaara@welho.com>, HTTP working group mailing list <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
Message-ID: <CABkgnnWzOhkznH2HzweNegYo4dDHE+DT0PM=eCSvVr+-Wkup1A@mail.gmail.com>

On 14 December 2016 at 21:51, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> Well, UTF-8 would also go through HPACK, but by eye-ball it seems
> that it would be more efficient.

If the text is still mostly ASCII, you can probably Huffman encode it,
but if it is mostly non-ASCII you need to watch out: a three-octet
UTF-8 encoded code point turns into 82 bits in the worst case and 58
bits in the best case (both of those octet sequences are invalid
UTF-8, so maybe real text never hits either extreme).
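
To put those figures in proportion, here is a rough sketch of the
arithmetic. It takes the 58-bit and 82-bit numbers from the paragraph
above as given rather than recomputing them from the HPACK Huffman
table. The names HUFFMAN_BITS_BEST, HUFFMAN_BITS_WORST and expansion
are made up for illustration, and the euro sign is just an arbitrary
three-octet sample.

```python
# Figures quoted in the message above; NOT recomputed from the HPACK table.
HUFFMAN_BITS_BEST = 58
HUFFMAN_BITS_WORST = 82


def expansion(huffman_bits: int, octets: int) -> float:
    """Ratio of the Huffman-coded size to the raw size of `octets` octets."""
    return huffman_bits / (octets * 8)


# Code points U+0800 through U+FFFF (surrogates excluded) take three
# octets in UTF-8; U+20AC EURO SIGN is one of them.
sample = "\u20ac"
raw_octets = len(sample.encode("utf-8"))

best = expansion(HUFFMAN_BITS_BEST, raw_octets)
worst = expansion(HUFFMAN_BITS_WORST, raw_octets)

print(f"raw: {raw_octets} octets = {raw_octets * 8} bits")
print(f"Huffman coded: {best:.2f}x to {worst:.2f}x the raw size")
```

On those numbers, Huffman coding a three-octet code point costs more
than twice the 24 bits of the literal octets, so sending such strings
without Huffman coding looks like the cheaper choice.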

I can't remember, is there actually a good reason why we can't just
start shoving UTF-8 in header fields?  I mean, h2 is probably OK with
this.

Received on Wednesday, 14 December 2016 11:38:17 UTC