- From: Miles Sabin <msabin@cromwellmedia.co.uk>
- Date: Tue, 2 Mar 1999 15:51:43 -0000
- To: "'xml-editor@w3.org'" <xml-editor@w3.org>
Hi,

John Cowan suggested that I forward the following to you for
consideration as an erratum for XML 1.0.

Cheers,

Miles

Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England

-----Original Message-----
From: Miles Sabin
Sent: 02 March 1999 11:59 am
To: 'xml-dev@ic.ac.uk'
Subject: Encoding detection again ...

I've been browsing through the archives for an answer to this
question, but I haven't been able to find anything that gives a
completely unambiguous answer ...

Appendix F of the spec says that given a document starting with the
4 octet sequence,

  00 3C 00 3F

I'm to infer BOM-less big-endian UTF-16, and given a document
starting with,

  3C 00 3F 00

I'm to infer BOM-less little-endian UTF-16.

What I want to know is: why could these sequences not equally
represent (respectively) big-endian UCS-2 or little-endian UCS-2?

In other words, surely these octet sequences are ambiguous, and hence
the encoding should be resolved definitively with either,

  <?xml version="1.0" encoding="UTF-16"?>

or,

  <?xml version="1.0" encoding="ISO-10646-UCS-2"?>

or an appropriate MIME header, i.e.,

  Content-type: text/xml; charset="utf-16"

or,

  Content-type: text/xml; charset="ISO-10646-UCS-2"

Just so there's no confusion ... I'm assuming:

1. Unicode == UTF-16

2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's support for
   characters outside the BMP).
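
A minimal sketch of the ambiguity described above, in Python. The function name and the encoding labels it returns are hypothetical and are not taken from the XML 1.0 spec. It assumes only what the message states: that the two 4-octet prefixes are "<?" in 16-bit big-endian and little-endian code units, which is the same octet sequence whether the units are read as UTF-16 or as UCS-2, since both characters are in the BMP.

```python
from typing import List


def candidate_encodings(prefix: bytes) -> List[str]:
    """Return the encodings consistent with the first four octets
    of a BOM-less document (hypothetical helper, for illustration only)."""
    head = prefix[:4]
    if head == b"\x00\x3c\x00\x3f":
        # "<?" as big-endian 16-bit units: UTF-16 or UCS-2 fit equally well.
        return ["UTF-16BE", "UCS-2BE"]
    if head == b"\x3c\x00\x3f\x00":
        # "<?" as little-endian 16-bit units: again both fit.
        return ["UTF-16LE", "UCS-2LE"]
    # Other prefixes (UTF-8, EBCDIC, BOM cases, ...) are not covered here.
    return []


# Python ships no separate UCS-2 codec, so the UTF-16 codecs are used to
# show which octets "<?" produces in each byte order.
big_endian = "<?".encode("utf-16-be")
little_endian = "<?".encode("utf-16-le")

print(candidate_encodings(big_endian))
print(candidate_encodings(little_endian))
```

With more than one candidate returned, the first four octets alone cannot settle the question; the encoding declaration or the MIME charset parameter would have to.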
Received on Tuesday, 2 March 1999 10:59:19 UTC