- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Fri, 25 Feb 2005 08:24:03 -0500
- To: public-xml-binary-comments@w3.org
The statement that "it is impossible for a format to perform identically in terms for instance of compactness or processing efficiency for a language that can be entirely captured using a single byte per character and for one that requires a multi-byte encoding" is untrue. It is certainly possible to provide equally compact and efficient data for languages like English and languages like Chinese. To do so, simply choose an encoding form such as UTF-32 that does not preference one over the other. Such an encoding is suboptimal for English, but it would absolutely have the characteristic that English and Chinese would be treated equally efficiently.

The point of human language neutrality is precisely to avoid preferencing one language or script over another. This would make UTF-8 an inappropriate choice here. UTF-32 is the most neutral, but as a practical matter, I suspect no one would be too peeved by UTF-16, and that's probably the most reasonable compromise for textual data.

-- 
Elliotte Rusty Harold elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
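
For illustration only, here is a minimal Python sketch of the size comparison behind the argument above. The sample strings and the helper name bytes_per_char are assumptions made for this example and are not part of the original message. The sketch encodes a short English and a short Chinese string without a byte order mark and reports the average number of bytes per code point in each encoding form.

```python
# Illustrative sample strings (assumed for this sketch, not from the message).
samples = {
    "English": "Hello, world",
    "Chinese": "你好，世界",
}

def bytes_per_char(text, encoding):
    """Average encoded bytes per code point, with no byte order mark."""
    return len(text.encode(encoding)) / len(text)

# The "-le" codec variants are used so that no BOM is added to the output.
for name, text in samples.items():
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(f"{name:8} {enc:10} {bytes_per_char(text, enc):.1f} bytes/char")
```

For these particular samples, UTF-8 costs 1 byte per character for the English text and 3 for the Chinese text, while UTF-16 costs 2 and UTF-32 costs 4 for both. Characters outside the Basic Multilingual Plane take 4 bytes in UTF-16, so the UTF-16 parity holds only for text that stays within the BMP.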
Received on Friday, 25 February 2005 13:24:06 UTC