- From: Joseph Kesselman <keshlam@us.ibm.com>
- Date: Thu, 9 Oct 2003 15:33:26 -0400
- To: Martin Duerst <duerst@w3.org>
- Cc: Francois Yergeau <FYergeau@alis.com>, Johnny Stenback <jst@w3c.jstenback.com>, "'w3c-i18n-ig@w3.org'" <w3c-i18n-ig@w3.org>, "'www-dom@w3.org'" <www-dom@w3.org>, www-dom-request@w3.org
> I wonder how the DOM is able
> to make the distinction between little-endian and big-endian versions
> of UTF-16.

The XML Recommendation gives specific suggestions on how to guess encodings when reading from a byte stream -- use the Byte Order Mark if available, otherwise use the <? at the start of the XML Declaration/Text Declaration if one exists, otherwise make the best guess you can, and if it's wrong that's the user's fault for not giving you a better set of hints to work with.

The XML Rec doesn't suggest how to select which of these to use when writing out. If your serializer generates a BOM and/or a <?xml?> declaration with the encoding correctly specified, you should be fine.

This doesn't strike me as being more of a problem for the DOM than it is for anyone else...

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk
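
For illustration only, the detection order described in the message (BOM first, then the byte pattern of the leading <?, then a fallback guess) might be sketched in Python roughly as below. The function names `sniff_xml_encoding` and `serialize_utf16` are hypothetical and are not part of the DOM or any XML API. The sketch assumes only UTF-8 and UTF-16 input; the UTF-32 and EBCDIC cases that the XML Recommendation also discusses are left out, and a real parser would go on to read the encoding name from the declaration itself.

```python
import codecs


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML byte stream from its first few bytes.

    Hypothetical helper: handles UTF-8 and UTF-16 only.
    """
    # 1. A Byte Order Mark, if present, settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"

    # 2. No BOM: look at how the leading "<?" of the declaration is laid out.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le"

    # 3. Otherwise make a best guess; UTF-8 is the default when nothing
    #    else is declared.
    return "utf-8"


def serialize_utf16(text: str, little_endian: bool = True) -> bytes:
    """Write text as UTF-16 with an explicit BOM so readers can tell the byte order."""
    if little_endian:
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
```

Feeding the output of `serialize_utf16` back into `sniff_xml_encoding` should recover the byte order that was written, which is the round trip the message says a serializer needs to get right.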
Received on Thursday, 9 October 2003 15:47:40 UTC