- From: Poul-Henning Kamp <phk@phk.freebsd.dk>
- Date: Thu, 01 Oct 2015 06:33:06 +0000
- To: Mark Nottingham <mnot@mnot.net>
- cc: HTTP Working Group <ietf-http-wg@w3.org>
--------
In message <D107F92F-F930-44AE-945A-9170389DFCC4@mnot.net>, Mark Nottingham writes:

>We're belatedly adopting this; Julian asked for a breather while he
>finished other work, and now he's ready to commence.

I think adopting the draft is a good idea.

But I find some bits of the low level mechanics proposed troublesome.

For instance it worries me a lot to use '*' as magic marker in fields
which are historically thrown around fast and loose in all sorts of
programming environments where it may or may not be a meta-character.

Can we find a less overloaded, preferably non-meta, character?

If we can find two less overloaded characters, one can indicate
UTF-8, and the other that the charset is explicitly specified.

Judging from experience, these headers are going to vary a lot, so
if we can shave 5 characters off their length in the usual case,
that's a tangible benefit.

Something like:

UTF-8 implied:

    foo: bar; title<='en'%C2%A3%20rates

Charset explicitly specified:

    foo: bar; title>=iso-8859-1'en'%A3%20rates

(Where I'm not specifically proposing '<' or '>' but merely using
them for the example.)

But going even further: I have a hard time coming up with a credible
(i.e. non-demented) scenario for having multiple different charsets
in the same header.

Therefore I would prefer to put the charset at the front of the
headers:

UTF-8 implied:

    foo: = bar; title='en'%C2%A3%20rates

Charset explicitly specified:

    foo: =iso-8859-1= bar; title='en'%A3%20rates

Some advantages:

* Very likely to break in the majority of code which doesn't
  understand the new convention. (ref: "Postel Was Wrong")

* Header compression algorithms can be smart about it.

* Charset can be converted transparently by proxies, servers,
  frameworks etc.

And we can go even further if we want to:

If the header contains a charset spec (as above), the rest of the
header can use all byte values from the range [0x20-0xff] and %xx
encoding/decoding SHALL NOT be performed.
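To make the header-level charset idea concrete, here is a rough Python sketch of how a recipient might read the two "charset at the front" examples. It is only an illustration under assumptions the message does not settle: the function names split_charset_prefix and decode_params are hypothetical, the rule that "=" followed by whitespace means UTF-8 is a guess at how the two forms would be told apart, and it still applies %xx decoding (it does not model the final "no %xx at all" variant). It works on the field value only, i.e. the text after "foo:".

```python
from urllib.parse import unquote_to_bytes


def split_charset_prefix(field_value):
    """Split a field value into (charset, rest).

    Assumed rule (not settled in the message above): a leading "=" followed
    by whitespace means UTF-8 is implied; "=<name>=" names the charset
    explicitly; no leading "=" means the header does not use the convention.
    """
    value = field_value.strip()
    if not value.startswith("="):
        return None, value
    if len(value) == 1 or value[1].isspace():
        return "utf-8", value[1:].strip()
    charset, sep, rest = value[1:].partition("=")
    if not sep:
        raise ValueError("unterminated charset prefix")
    return charset.strip().lower(), rest.strip()


def decode_params(field_value):
    """Return (charset, {param: decoded text}) for 'lang'-tagged parameters."""
    charset, rest = split_charset_prefix(field_value)
    decoded = {}
    if charset is None:
        return charset, decoded
    # Skip the first ";"-separated item ("bar" in the examples).
    for part in rest.split(";")[1:]:
        name, sep, raw = part.strip().partition("=")
        if not sep or not raw.startswith("'"):
            continue  # not in the 'lang'value form; skipped in this sketch
        _lang, quote, encoded = raw[1:].partition("'")
        if not quote:
            continue  # language tag not terminated
        # Percent-decode to bytes, then interpret them in the header's charset.
        decoded[name] = unquote_to_bytes(encoded).decode(charset)
    return charset, decoded
```

Under these assumptions both example field values yield the same title text, once via UTF-8 and once via ISO-8859-1, which is the property that would let an intermediary convert between charsets transparently.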
-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Received on Thursday, 1 October 2015 06:33:46 UTC