- From: Jamie Lokier <jamie@shareable.org>
- Date: Fri, 28 Mar 2008 12:42:14 -0700
- To: "Mark Nottingham" <mnot@mnot.net>
- Cc: "Robert Brewer" <fumanchu@aminus.org>, "Martin Duerst" <duerst@it.aoyama.ac.jp>, "Roy T. Fielding" <fielding@gbiv.com>, "HTTP Working Group" <ietf-http-wg@w3.org>
Mark Nottingham wrote:
> Concretely, our options at this point are:
>
> 1) Change the character encoding on the wire to UTF-8
> 2) Leave the character encoding on the wire at ISO-8859-1, document
>    existing TEXT instances' encoding requirements on top of that, and
>    a) Require new headers that need i18n content to specify RFC2047, or
>    b) Require new headers that need i18n content to specify *some*
>       encoding into ISO-8859-1 using character escapes (which explicitly MAY
>       be RFC2047).

An issue I have with RFC2047 is that it seems to imply that every "proper" implementation of an HTTP receiver which does something with received TEXT (such as display it) needs to have a _large_ table of known character set names and conversion routines.

With email this is unavoidable due to history, but it seems silly for an HTTP receiver to need it.

Since RFC2047 isn't (currently) seen in practice in HTTP, if RFC2047 continues to be recommended for TEXT, may I suggest that it be recommended to designate _only_ the "utf-8", "iso-8859-1" and "us-ascii" character sets in RFC2047 encodings in HTTP?

That way, at least, HTTP receivers which aim for a complete, conformant implementation, and which expect to do something as simple as decoding and showing received TEXT, will be complete by decoding just those character sets.

-- Jamie
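To illustrate how small a receiver could be under the restriction suggested above, here is a rough sketch in Python. It is not taken from any existing HTTP implementation. The names decode_text, ALLOWED_CHARSETS and ENCODED_WORD are made up for this example, and the choice to pass through unchanged any encoded-word it cannot handle is an assumption, not something the proposal specifies. It also ignores the RFC2047 rule about dropping whitespace between adjacent encoded-words.

```python
import base64
import binascii
import quopri
import re

# The three character sets named in the suggestion above.
ALLOWED_CHARSETS = {"utf-8", "iso-8859-1", "us-ascii"}

# RFC2047 encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def decode_text(value: str) -> str:
    """Decode encoded-words in a header value, for the allowed charsets only."""

    def replace(match: "re.Match[str]") -> str:
        charset, encoding, payload = match.groups()
        charset = charset.lower()
        if charset not in ALLOWED_CHARSETS:
            # Assumption: leave unsupported encoded-words as they were received.
            return match.group(0)
        try:
            if encoding.lower() == "b":
                raw = base64.b64decode(payload, validate=True)
            else:
                # header=True makes "_" decode to a space, as "Q" requires.
                raw = quopri.decodestring(payload.encode("ascii"), header=True)
            return raw.decode(charset)
        except (binascii.Error, UnicodeError, ValueError):
            # Malformed payload: also passed through unchanged.
            return match.group(0)

    return ENCODED_WORD.sub(replace, value)
```

For example, decode_text("=?utf-8?q?caf=C3=A9?=") gives "café", while an encoded-word naming "koi8-r" comes back exactly as received. No table of character set names is needed beyond the three entries.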
Received on Friday, 28 March 2008 19:43:41 UTC