- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Tue, 17 Jul 2012 18:38:05 +0200
- To: Robert Brewer <fumanchu@aminus.org>
- CC: James M Snell <jasnell@gmail.com>, Amos Jeffries <squid3@treenet.co.nz>, ietf-http-wg@w3.org
On 2012-07-17 17:57, Robert Brewer wrote:
> Julian Reschke wrote:
>> On 2012-07-17 16:48, James M Snell wrote:
>>> Tunneling 1.1 traffic via 2.0 would likely be the easy part; it's the
>>
>> Not even that. Given an HTTP/1.1 message containing non-ASCII octets in
>> header field value, you simply don't know what unicode characters to
>> map them to.
>>
>> This is not theoretical; some UAs process UTF-8 in Content-Disposition,
>> some use the installation's locale character set.
>>
>> Yes, this is a mess, but it's not clear to me how to break out of it
>> without breaking *some* setups that currently "work".
>>
>>> ...
>>> The one thing we need to determine is: how critical is the ability to
>>> support seamless down-level conversion from 2.0 to 1.1 within a request?
>>> Is it acceptable for us to say that while 2.0 can be used to transport
>>> 1.1 messages, the reverse is not possible.
>>> ...
>>
>> So how do you transport a 1.1 message inside 2.0 if it contains
>> non-ASCII? Treat the header field value as binary?
>
> Just to share a field note: The Python web community dealt with this
> exact problem recently with the advent of Python 3, which elevated
> Unicode quite a bit and exposed this problem more clearly to many. The
> chosen solution was to take the bytes-of-unknown-encoding and decode
> them as ISO-8859-1 (which at least won't error on any byte sequence),
> and leave that mess for a higher layer (which presumably would have
> more context) to re-encode/decode if they liked. Not a perfect solution
> but better than nothing.

That's also what APIs like XMLHTTPRequest and the servlet API are doing.

Best regards, Julian
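As a rough illustration of the ISO-8859-1 approach described in the quoted field note, here is a minimal Python 3 sketch. The function names `decode_header_value` and `reinterpret`, and the sample Content-Disposition value, are hypothetical and chosen only for this example; they are not taken from any particular framework or specification.

```python
def decode_header_value(raw: bytes) -> str:
    """Decode header octets of unknown encoding as ISO-8859-1.

    Every byte 0x00-0xFF maps to exactly one code point, so this
    never raises and the original octets can always be recovered.
    """
    return raw.decode("iso-8859-1")


def reinterpret(value: str, encoding: str = "utf-8") -> str:
    """Higher-layer fix-up: recover the octets, then decode them
    with an encoding the caller has reason to believe is correct."""
    return value.encode("iso-8859-1").decode(encoding)


# Hypothetical header value that a sender actually encoded as UTF-8.
raw = 'attachment; filename="caf\u00e9.txt"'.encode("utf-8")

text = decode_header_value(raw)

# The lossless round trip is what makes the scheme workable.
assert text.encode("iso-8859-1") == raw

# A layer with more context can reinterpret the value.
fixed = reinterpret(text, "utf-8")
```

The lower layer never has to guess: it hands up a string that preserves the original octets one-to-one, and only a layer that knows (or assumes) the real encoding performs the second decode. If that assumption is wrong, `reinterpret` can still raise `UnicodeDecodeError` or produce mojibake, which is the "not a perfect solution" part.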
Received on Tuesday, 17 July 2012 16:39:05 UTC