
draft-ietf-httpbis-header-structure-00, unicode range

From: Kari Hurtta <hurtta-ietf@elmme-mailer.org>
Date: Tue, 13 Dec 2016 19:33:26 +0200 (EET)
To: HTTP working group mailing list <ietf-http-wg@w3.org>
CC: Poul-Henning Kamp <phk@varnish-cache.org>, Kari Hurtta <hurtta-ietf@elmme-mailer.org>
Message-Id: <20161213173327.C1F7D1714B@welho-filter2.welho.com>

2.  Definition of HTTP Header Common Structure
https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-00#section-2

|     unicode_string = * unicode_codepoint
|             # XXX: Is there a place to import this from ?
|             # Unrestricted unicode, because there is no sane
|             # way to restrict or otherwise make unicode "safe".

What is the range of unicode_codepoint?

The next section seems to imply that it is only plane 0
(range 0x0000 - 0xFFFF).
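
For reference, Unicode code points run from 0x0000 to 0x10FFFF across
17 planes, and plane 0 is the 0x0000 - 0xFFFF range mentioned above.
A minimal Python sketch of that arithmetic (the helper name plane_of
is made up for illustration and does not come from the draft):

```python
MAX_CODEPOINT = 0x10FFFF  # highest Unicode code point


def plane_of(codepoint: int) -> int:
    """Return the Unicode plane (0-16) that holds the code point."""
    if not 0 <= codepoint <= MAX_CODEPOINT:
        raise ValueError("not a Unicode code point")
    return codepoint >> 16  # each plane spans 0x10000 code points


print(plane_of(0x00E4))   # 0, inside plane 0
print(plane_of(0x1F600))  # 1, outside plane 0
```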

3.  HTTP/1 Serialization of HTTP Header Common Structure
https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-00#section-3

|     h1_unicode_string = DQUOTE *(
|                         ( "\" DQUOTE )
|                         ( "\" "\" ) /
|                         ( "\" "u" 4*HEXDIG ) /
|                         0x20-21 /
|                         0x23-5B /
|                         0x5D-7E /
|                         UTF8-2 /
|                         UTF8-3 /
|                         UTF8-4
|                         ) DQUOTE
|     # This is UTF8 with HTTP1 unfriendly codepoints
|     # (00-1f, 7f) neutered with \uXXXX escapes.


Here the "\u" escape, read as four hex digits, seems to limit
unicode to plane 0 (0x0000 - 0xFFFF).

Or are unicode values > 0xFFFF
encoded with surrogates (values 0xD800 - 0xDFFF)?
(That is, is UCS-2 or UTF-16 used?)
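
To make the question concrete: if the "\u" escape is read as exactly
four hex digits, a code point above 0xFFFF could only be escaped as a
UTF-16 surrogate pair. The Python sketch below shows that arithmetic.
The helper escape_codepoint is hypothetical, and the surrogate-pair
behaviour is an assumption, not something the draft states.

```python
def escape_codepoint(codepoint: int) -> str:
    """Escape one code point in \\uXXXX form.

    Assumption (not from the draft): code points above 0xFFFF are
    written as a UTF-16 surrogate pair, i.e. two \\uXXXX escapes.
    """
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= codepoint <= 0xDFFF:
        raise ValueError("surrogate values are not characters")
    if codepoint <= 0xFFFF:
        return "\\u%04X" % codepoint       # plane 0: one escape
    offset = codepoint - 0x10000           # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return "\\u%04X\\u%04X" % (high, low)


print(escape_codepoint(0x007F))   # \u007F
print(escape_codepoint(0x1F600))  # \uD83D\uDE00
```

The grammar also lists UTF8-4, so such a code point could presumably
appear unescaped as a four-byte UTF-8 sequence instead.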

/ Kari Hurtta
Received on Tuesday, 13 December 2016 17:34:02 UTC