- From: Jamie Lokier <jamie@shareable.org>
- Date: Fri, 28 Mar 2008 14:05:18 +0000
- To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Cc: ietf-http-wg@w3.org
Frank Ellermann wrote:
> > An issue I have with RFC2047 is it seems to imply every "proper"
> > implementation of an HTTP receiver, which does something with received
> > TEXT (such as display it), needs to have a _large_ table of known
> > character set names and conversion routines.
>
> No, by design MIME works if the other side has no clue what it is.
>
> It would then see gibberish like =?us-ascii*en-GB?Q?hello_world?=
> and not know that this is an odd way to say "hello world". It is
> not forced to know what "encoded words" are, and if it knows this
> it is not forced to support each and every charset.

You are sort of making my point for me.

If it's acceptable that a receiver sees gibberish text when it doesn't
understand the particular encoding, and it's acceptable that it doesn't
support each and every charset...

Is there a problem with transmitting binary UTF-8? It's just an "odd
way to say" some i18n text. Some receivers will decode it as intended;
some will show gibberish. How is that different from your example?

> If some communities use koi8-r or some older charset popular in
> JP this is the same issue as in ordinary Web browsers or e-mail:
>
> I cannot read Cyrl or Jpan scripts, so it is irrelevant from my POV
> how that is encoded.

I thought the IETF was moving to recommend UTF-8 wherever possible
nowadays?

(Though I gather some people are still unhappy with Unicode, since it
doesn't distinguish some ideographs that are drawn differently in
different languages, and thus prefer not to use UTF-8.)

-- Jamie
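[A minimal Python sketch of the two behaviours under discussion: decoding an
RFC 2047 encoded word versus reading raw UTF-8 bytes through the wrong
charset. The sample strings and the Latin-1 misreading are illustrative
assumptions; the language-tagged form =?us-ascii*en-GB?...?= from the mail
is not handled by every decoder, so the plain form is used here.]

    from email.header import decode_header

    # A receiver that knows the RFC 2047 scheme can recover the text;
    # one that doesn't simply displays the encoded word as-is.
    for chunk, charset in decode_header("=?utf-8?Q?hello_world?="):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        print(chunk)                    # -> hello world

    # Raw UTF-8 on the wire: a receiver that decodes it as intended
    # sees the text; one that assumes Latin-1 sees gibberish, which is
    # the same failure mode as an unrecognised encoded word.
    text = "héllo wörld"                # illustrative i18n text
    wire = text.encode("utf-8")         # bytes actually transmitted
    print(wire.decode("utf-8"))         # intended: héllo wörld
    print(wire.decode("latin-1"))       # gibberish: hÃ©llo wÃ¶rld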
Received on Friday, 28 March 2008 14:05:56 UTC