- From: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>
- Date: Sat, 16 May 1998 16:23:57 +0900
- To: Chris Newman <Chris.Newman@INNOSOFT.COM>
- Cc: "Martin J. Duerst" <duerst@w3.org>, ietf-charsets@ISI.EDU, murata@fxis.fujixerox.co.jp, Tatsuo_Kobayashi@justsystem.co.jp
Although I agree that I18N of e-mail should begin with UTF-8, I believe that UTF-16 is the future of the WWW (XML, HTML, and HTTP).

UTF-8 XML documents are very often parsed incorrectly. If the charset parameter of text/xml is absent or incorrect, a UTF-8 XML document is likely to be misparsed, and XML parsers do not always detect that the charset is wrong. Thus, corrupted data will be stored in databases, and WWW agents will receive and return corrupted data or even fail completely.

In contrast, UTF-16 is exempt from such data corruption: because of the BOM and the many 00 bytes, a UTF-16 XML document will either parse correctly or not parse at all. Furthermore, error recovery is very reliable.

I think that UTF-8 provides a good migration path from ASCII-only and that UTF-16 provides a very good start for new protocols or data formats.

In my opinion, the HTTP people did a very good job in lifting unnecessary restrictions on text/*. I hope that future protocols will do the same.

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
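
The contrast drawn in the message between a mislabelled UTF-8 document and a UTF-16 document can be sketched in Python. This is only an illustration under stated assumptions: the sample string, the ISO-8859-1 mislabel, and the helper names `decode_mislabelled` and `sniff_utf16` are hypothetical, and plain byte decoding stands in for a real XML parser.

```python
import codecs

# Hypothetical sample document containing non-ASCII characters.
SAMPLE = '<?xml version="1.0"?><name>村田</name>'


def decode_mislabelled(data: bytes, label: str):
    """Decode data using the given charset label.

    Returns the decoded text, or None if the bytes are invalid for that label.
    """
    try:
        return data.decode(label)
    except UnicodeDecodeError:
        return None


def sniff_utf16(data: bytes):
    """Return 'utf-16-be' or 'utf-16-le' if data starts with a UTF-16 BOM, else None."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    return None


utf8_bytes = SAMPLE.encode("utf-8")
utf16_bytes = codecs.BOM_UTF16_BE + SAMPLE.encode("utf-16-be")

# UTF-8 bytes labelled as ISO-8859-1 decode without any error,
# but the resulting text differs from the original (silent corruption).
corrupted = decode_mislabelled(utf8_bytes, "iso-8859-1")
print(corrupted is not None and corrupted != SAMPLE)

# UTF-16 bytes labelled as UTF-8 fail outright: the BOM byte 0xFE
# is never valid in UTF-8, so decoding raises instead of corrupting.
print(decode_mislabelled(utf16_bytes, "utf-8"))

# The BOM identifies the encoding independently of any charset label.
encoding = sniff_utf16(utf16_bytes)
print(encoding, utf16_bytes[2:].decode(encoding) == SAMPLE)
```

Under these assumptions, the mislabelled UTF-8 input is accepted and yields wrong text, while the mislabelled UTF-16 input is rejected and can still be recovered from its BOM.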
Received on Monday, 18 May 1998 15:08:35 UTC