
json-string for HTTP header field parameter values

From: Manger, James H <James.H.Manger@team.telstra.com>
Date: Mon, 31 Oct 2011 00:07:22 +1100
To: httpbis Group <ietf-http-wg@w3.org>
Message-ID: <255B9BB34FB7D647A506DC292726F6E11291D37E94@WSMSG3153V.srv.dir.telstra.com>
HTTP currently uses token and quoted-string for various header field parameter values, and recommends these syntaxes for new headers. However, neither supports Unicode, which isn't really acceptable today.

I would like to recommend the JSON string syntax for new header field parameter values. JSON is very widely used on the web, particularly by protocols built on HTTP. There are JSON implementations for basically every computer language. JSON supports the full range of Unicode characters. Developers love it.

A JSON string: is enclosed in double quotes; uses \" and \\ to represent " and \; defines six other backslash escapes (\/, \b, \f, \n, \r, \t); and allows \uXXXX as an escape sequence for any Unicode character [json.org, RFC4627]. An HTTP header profile of JSON string would require any chars outside the printable ASCII set to be escaped.
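
To illustrate the profile described above, here is a minimal Python sketch. The function names to_json_string and from_json_string are hypothetical, chosen for this example only. It relies on the standard json module, whose ensure_ascii option escapes every character outside printable ASCII; it is not a proposed normative encoder.

```python
import json


def to_json_string(value: str) -> str:
    """Serialize value as a JSON string that contains only printable ASCII."""
    # ensure_ascii=True escapes anything outside U+0020..U+007E
    # (control characters, DEL and all non-ASCII) as \n, \t, ... or \uXXXX.
    return json.dumps(value, ensure_ascii=True)


def from_json_string(text: str) -> str:
    """Parse a JSON string back into the Unicode value it represents."""
    value = json.loads(text)
    if not isinstance(value, str):
        # A parameter value under this profile must be a string,
        # not a number, array or object.
        raise ValueError("not a JSON string")
    return value
```

For example, to_json_string applied to the word "collègues" gives "coll\u00e8gues" (including the surrounding double quotes), and from_json_string reverses it.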

RFC5987 "Character Set and Language Encoding for HTTP Header Field Parameters" already offers one way to represent any Unicode string in an HTTP header parameter value, e.g. foo*=UTF-8''coll%C3%A8gues. However, this is not very appealing when defining a new parameter. HTTPbis-p2 already recommends that new parameters allow the token and quoted-string syntaxes, so supporting RFC5987 for Unicode means implementations have to support 2 parameter names (foo and foo*), 3 syntaxes, and 2 escaping mechanisms (\x in quoted-string, and %xx in RFC5987) -- all for a brand new parameter. Yuck.
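
For comparison, a rough Python sketch of the two serializations of the same value. The helper names rfc5987_param and json_string_param are made up for this example. The RFC5987 form here always uses UTF-8 with an empty language tag, and it percent-encodes more characters than RFC5987 strictly requires, which is allowed.

```python
import json
from urllib.parse import quote


def rfc5987_param(name: str, value: str) -> str:
    """Build name*=UTF-8''<percent-encoded UTF-8 bytes> (empty language tag)."""
    # safe="" also encodes a few characters that RFC5987 permits unescaped;
    # over-encoding is harmless for this comparison.
    return f"{name}*=UTF-8''{quote(value, safe='', encoding='utf-8')}"


def json_string_param(name: str, value: str) -> str:
    """Build name="<JSON string>" with non-ASCII characters as \\uXXXX."""
    return f"{name}={json.dumps(value, ensure_ascii=True)}"
```

With the value "collègues", the first helper yields foo*=UTF-8''coll%C3%A8gues and the second yields foo="coll\u00e8gues". The second needs only one parameter name and one escaping mechanism.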

I think the considerations for new headers (issue #231), and advice on defining auth scheme parameters (issue #320), should consider how to support Unicode parameter values -- and json-string would be a good way to do that.

P.S. json-string could also work in practice in places where quoted-string is defined (such as for parameters of new authentication schemes), since no actual quoted-string value will ever have escaped 'u' as '\u' so '\uXXXX' could be safely interpreted as per JSON instead of as 'uXXXX' as per quoted-string rules.
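
A sketch of what such a lenient parser might look like in Python. The name unquote_lenient is hypothetical. It assumes the only change from quoted-string rules is the treatment of \uXXXX, so \n still yields the letter n as in quoted-string (not a newline as in JSON), and it does no other validation of the input.

```python
import re

# Either \uXXXX (four hex digits) or a backslash followed by any one character.
_ESCAPE = re.compile(r'\\u([0-9A-Fa-f]{4})|\\(.)', re.DOTALL)


def unquote_lenient(quoted: str) -> str:
    """Unescape a quoted-string, reading \\uXXXX as a JSON-style escape."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("value is not enclosed in double quotes")

    def replace(match: re.Match) -> str:
        if match.group(1) is not None:
            # JSON rule: \uXXXX names a UTF-16 code unit.
            return chr(int(match.group(1), 16))
        # quoted-string rule: a backslash just quotes the next character.
        return match.group(2)

    text = _ESCAPE.sub(replace, quoted[1:-1])
    # Combine surrogate pairs such as \ud83d\ude00 into a single character.
    return text.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass"
    )
```

An escaped backslash followed by a literal u (that is, \\u00e8) is consumed as a quoted pair first, so it stays the six characters \u00e8 rather than being decoded.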

James Manger
Received on Sunday, 30 October 2011 13:07:58 UTC
