Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Wed, 14 Dec 2016 07:42:08 +0000
To: Kari Hurtta <hurtta-ietf@elmme-mailer.org>
cc: Ilari Liusvaara <ilariliusvaara@welho.com>, HTTP working group mailing list <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
Message-ID: <36792.1481701328@critter.freebsd.dk>
--------
In message <201612140628.uBE6SO3L025885@shell.siilo.fmi.fi>, Kari Hurtta writes:

>I think that one escape sequence is more sane than something like
>\uD834\uDD1E  for one unicode codepoint.
>
>> Any suggestions ?
>
>Ilari Liusvaara told that 10FFFD is the last codepoint. So 6
>hex digits is sufficient.

I'm totally agnostic on this one, but would lean toward doing it
like JSON, per Occam's Razor.
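
As a small illustration of what "like JSON" means for a codepoint outside the BMP, here is a sketch using Python's standard json module. The choice of Python is only for illustration, and U+1D11E is simply the codepoint behind the \uD834\uDD1E example quoted above.

```python
import json

# U+1D11E (MUSICAL SYMBOL G CLEF), the codepoint behind \uD834\uDD1E.
clef = "\U0001D11E"

# JSON only has 4-hex-digit \u escapes, so a codepoint above U+FFFF is
# written as a UTF-16 surrogate pair (two escapes for one codepoint).
escaped = json.dumps(clef)  # ensure_ascii=True is the default
decoded = json.loads('"\\uD834\\uDD1E"')

# Derive the surrogate pair by hand to show where D834/DD1E come from.
offset = ord(clef) - 0x10000
high = 0xD800 + (offset >> 10)
low = 0xDC00 + (offset & 0x3FF)

print(escaped, f"{high:04X} {low:04X}", decoded == clef)
```

So the JSON approach costs two escapes (12 characters) for such codepoints, while a 6-hex-digit escape would always be a single sequence.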

If we do something different, does the HPACK-Huffman efficiency matter?

>	( "\" "X" 6*HEXDIG )

HPACK: 19 + 8 + 6 * 5.625-ish = 61-ish bits
(lowercase 'x' would save a bit)

>	 ( "\" "X" 1*6HEXDIG "#" )

HPACK: 19 + 8 + 3-ish * 5.625-ish + 12 = 56-ish bits
(lowercase 'x' would save a bit)

>	 ( "\" "#" 1*6HEXDIG "#" )

HPACK: 19 + 12 + 3-ish * 5.625-ish + 12 = 60-ish bits

	( "\" "u" 4*HEXDIG )

HPACK: 19 + 6 + 4 * 5.625-ish = 47-ish bits
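
The arithmetic above can be reproduced with a short sketch. The bit lengths are the HPACK Huffman code lengths from RFC 7541 Appendix B. The function names are made up for illustration, and the 5.625 average assumes lowercase hex digits drawn uniformly, which is an assumption and not something real header values guarantee. Note that "#" is 12 bits in that table, so the "\X...#" form comes out near 56 bits with three hex digits.

```python
# Huffman code lengths in bits for the literal characters used in the
# candidate escapes, as listed in RFC 7541 Appendix B.
LITERAL_BITS = {"\\": 19, "X": 8, "x": 7, "u": 6, "#": 12}

# Lowercase hex digits: "0", "1", "2", "a", "c", "e" are 5 bits, the rest 6.
HEX_BITS = {d: (5 if d in "012ace" else 6) for d in "0123456789abcdef"}
AVG_HEX_BITS = sum(HEX_BITS.values()) / len(HEX_BITS)  # 90 / 16 = 5.625

ALL_BITS = {**LITERAL_BITS, **HEX_BITS}


def expected_bits(literals: str, hex_digits: float) -> float:
    """Average cost: fixed literal characters plus uniformly random hex digits."""
    return sum(LITERAL_BITS[c] for c in literals) + hex_digits * AVG_HEX_BITS


def exact_bits(escape: str) -> int:
    """Exact cost of one concrete escape string (lowercase hex digits only)."""
    return sum(ALL_BITS[c] for c in escape)


# (label, literal characters, assumed number of hex digits)
candidates = [
    ('"\\" "X" 6*HEXDIG', "\\X", 6),
    ('"\\" "X" 1*6HEXDIG "#"', "\\X#", 3),
    ('"\\" "#" 1*6HEXDIG "#"', "\\##", 3),
    ('"\\" "u" 4*HEXDIG', "\\u", 4),
]
for label, literals, digits in candidates:
    print(f"{label:26s} {expected_bits(literals, digits):7.3f} bits")
```

The backslash alone accounts for 19 bits in every variant, so the differences between the candidates come almost entirely from the marker characters and the digit count.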

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
Received on Wednesday, 14 December 2016 07:42:45 UTC