
UTF-16 and Byte Order Mark

From: Dieter Köhler <d.k@philo.de>
Date: Tue, 19 Sep 2006 18:53:51 +0200
Message-Id: <5.2.1.1.0.20060919183347.020b8e50@pop3.philo.de>
To: xml-editor@w3.org

Appendix F.1 of the XML specification gives examples of how to detect
the encoding of an entity automatically from the first characters of an
XML encoding declaration when no byte order mark is present.  These
examples include UTF-16BE and UTF-16LE.  However, section 4.3.3 says that
entities encoded in UTF-16 MUST begin with a byte order mark.
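
To make the conflict concrete, here is a minimal Python sketch of the kind
of detection Appendix F.1 describes.  It is only an illustration: the
function name and the returned labels are my own invention, the byte
patterns are the ones I read from the appendix, and the UCS-4 and EBCDIC
cases are left out.

```python
def sniff_xml_encoding(prefix: bytes) -> str:
    """Guess an encoding family from the first bytes of an XML entity.

    Simplified sketch: the UCS-4 and EBCDIC cases are omitted, so UCS-4
    input is not classified correctly here.
    """
    # Cases with a byte order mark.
    if prefix.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 with BOM"
    if prefix.startswith(b"\xfe\xff"):
        return "UTF-16BE with BOM"
    if prefix.startswith(b"\xff\xfe"):
        return "UTF-16LE with BOM"
    # Cases without a byte order mark: look for "<?" in 16-bit units,
    # or "<?xm" in 8-bit units.
    if prefix.startswith(b"\x00\x3c\x00\x3f"):
        return "16-bit big-endian without BOM"
    if prefix.startswith(b"\x3c\x00\x3f\x00"):
        return "16-bit little-endian without BOM"
    if prefix.startswith(b"\x3c\x3f\x78\x6d"):
        return "8-bit ASCII-compatible, read the encoding declaration"
    # Anything else: UTF-8 without an encoding declaration is assumed.
    return "UTF-8 assumed"
```

An entity that starts with "<?xml" encoded as UTF-16BE or UTF-16LE and has
no byte order mark ends up in one of the two 16-bit branches.  Those are
the cases that seem to be ruled out by the MUST in section 4.3.3.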

In light of these examples, the intention of the specification seems to
be to require a UTF-16 byte order mark only when no XML declaration is
used.  Is this interpretation of the specification correct?

If the answer is "yes", I would suggest starting the second paragraph of
sect. 4.3.3 with: "In the absence of a text declaration (or an XML
declaration respectively) entities encoded in UTF-16 MUST ..."

If the answer is "no", I would suggest removing the two examples in
question from Appendix F.1 and adding an appropriate warning.



Dr. Dieter Köhler, M.A.
Wissenschaftlicher Assistent
Institut für Philosophie und
Studienzentrum Multimedia
Universität Karlsruhe (TH)

University address:
Institut für Philosophie der
Universität Karlsruhe (TH)
D-76128 Karlsruhe
GERMANY
Phone:       +49-(0)-721-608-2149
Direct Line: +49-(0)-721-608-7743
Fax:         +49-(0)-721-608-3084
Received on Tuesday, 19 September 2006 16:56:38 GMT
