- From: Martin Nilsson <nilsson@opera.com>
- Date: Sat, 09 Feb 2013 15:12:24 +0100
- To: ietf-http-wg@w3.org
On Sat, 09 Feb 2013 00:53:10 +0100, Mark Nottingham <mnot@mnot.net> wrote:

> My .02 -
>
> RFC2616 implies that the range of characters available in headers is
> ISO-8859-1 (while tilting the table heavily towards ASCII), and we've
> clarified that in bis to recommend ASCII, while telling implementations
> to handle anything else as opaque bytes.
>
> However, on the wire in HTTP/1, some bits are sent as UTF-8 (in
> particular, the request-URI, from one or two browsers).
>

I don't see a reason not to UTF-8 encode all text fields. HTTP/1 forced a lot of heuristic code that tried to figure out how things were transformed on the way, and heuristics for decoders are bad. Though, as the world has moved to UTF-8, saying "opaque bytes" means UTF-8 in practice for everyone anyway.

The problem is fields that ideally should be binary, say a hash for ETag. UTF-8 encoding would add 50% size there.

Creating a static Huffman code for the ASCII part of Unicode shouldn't be a problem, as long as there is a prefix for non-ASCII bytes.

/Martin Nilsson

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
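A rough illustration of the 50% figure for binary fields, as a sketch under one assumption that the message does not spell out: each raw octet is mapped to the code point of the same value (U+0000 to U+00FF) and then UTF-8 encoded. Octets below 0x80 stay one byte and octets from 0x80 up become two bytes, so uniformly distributed binary data grows by about half. The helper name `utf8_size_of_octets` and the sample input are hypothetical.

```python
import hashlib


def utf8_size_of_octets(raw: bytes) -> int:
    """Size in bytes after mapping each octet to U+0000..U+00FF and UTF-8 encoding.

    Assumption for illustration: octet value N becomes code point U+00NN,
    which is exactly what decoding as Latin-1 does.
    """
    return len(raw.decode("latin-1").encode("utf-8"))


# Every octet value once: 128 values take 1 byte, 128 values take 2 bytes.
all_octets = bytes(range(256))
print(utf8_size_of_octets(all_octets) / len(all_octets))  # 1.5

# A hash such as one might place in an ETag; the exact growth depends on
# how many of its octets happen to be >= 0x80.
digest = hashlib.sha256(b"example entity body").digest()
print(len(digest), utf8_size_of_octets(digest))
```

Pure ASCII text is unaffected by this mapping, which is why the overhead only matters for fields that are binary by nature.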
Received on Saturday, 9 February 2013 14:12:54 UTC