- From: Mark Nottingham <mnot@mnot.net>
- Date: Mon, 20 Aug 2007 13:40:36 +1000
- To: Martin Duerst <duerst@it.aoyama.ac.jp>
- Cc: Julian Reschke <julian.reschke@gmx.de>, Paul Hoffman <phoffman@imc.org>, Apps Discuss <discuss@apps.ietf.org>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>, "Richard Ishida" <ishida@w3.org>, Felix Sasaki <fsasaki@w3.org>
On 10/06/2007, at 6:05 PM, Martin Duerst wrote:

> - RFC 2616 prescribes that headers containing non-ASCII have to use
>   either iso-8859-1 or RFC 2047. This is unnecessarily complex and
>   not necessarily followed. At the least, new extensions should be
>   allowed to specify that UTF-8 is used.

My .02: I'm concerned about allowing UTF-8; it may break existing implementations.

I'd like to see the text just require that the actual character set be 8859-1, but allow individual extensions to nominate encodings *like* 2047, without being restricted to it. For example, the encoding specified in 3987 is appropriate for URIs.

However, it *has* to be explicit; I've heard of some people reading this requirement and thinking that they need to check *every* header for 2047 encoding.

So, I think this means:

1) Change

   "Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 [14]."

   to

   "Words of *TEXT MUST NOT contain characters from character sets other than ISO-8859-1 [22]."

and,

2) Identify headers that may have non-8859-1 content and explicitly say how to encode them (IRI, 2047, whatever; the existing ones will have to be 2047, I believe), modifying their BNF to suit.

3) When we document extensibility, require new headers to nominate any encoding explicitly.

--
Mark Nottingham   http://www.mnot.net/
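As a rough illustration of the proposal in this message (not part of the original mail), the sketch below treats every field value as ISO-8859-1 and applies RFC 2047 decoding only to headers that explicitly nominate it, rather than checking every header. The registry `NOMINATED_ENCODINGS`, the header name `x-example-title`, and both function names are hypothetical and chosen only for the example; the decoding uses Python's standard `email.header` module.

```python
from email.header import decode_header, make_header

# Hypothetical registry (an assumption for this sketch): header name in
# lowercase mapped to the encoding that header explicitly nominates.
NOMINATED_ENCODINGS = {
    "x-example-title": "rfc2047",
}


def is_latin1_text(value: str) -> bool:
    """Return True if every character of value is representable in ISO-8859-1."""
    try:
        value.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


def decode_field(name: str, raw: bytes) -> str:
    """Decode a raw field value, honouring only explicitly nominated encodings."""
    # ISO-8859-1 maps every byte to a character, so this step cannot fail.
    text = raw.decode("iso-8859-1")
    if NOMINATED_ENCODINGS.get(name.lower()) == "rfc2047":
        # Only headers that nominate RFC 2047 are scanned for encoded-words.
        return str(make_header(decode_header(text)))
    # Every other header is left alone, even if it looks like an encoded-word.
    return text
```

With this arrangement a recipient never has to guess: `decode_field("X-Example-Title", b"=?UTF-8?B?w6k=?=")` decodes the encoded-word, while the same bytes in a header that nominates nothing are returned untouched.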
Received on Monday, 20 August 2007 03:40:59 UTC