UTF-16 and Byte Order Mark

Appendix F.1 of the XML specification gives examples of how to automatically 
detect the encoding of an entity from the first characters of an XML 
encoding declaration when there is no byte order mark.  These examples 
include UTF-16BE and UTF-16LE.  However, section 4.3.3 says that entities 
encoded in UTF-16 MUST begin with a byte order mark.
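
To make the two cases concrete, here is a minimal Python sketch of the 
UTF-16 part of the detection described in Appendix F.1.  The function name 
and the returned labels are hypothetical, chosen only for illustration; the 
byte patterns are the byte order mark U+FEFF and the characters "<?" as 
encoded in UTF-16.

```python
from typing import Optional


def sniff_utf16(prefix: bytes) -> Optional[str]:
    """Guess a UTF-16 variant from the first bytes of an entity.

    Hypothetical helper: covers only the UTF-16 rows of Appendix F.1.
    Returns None when the prefix matches none of them.
    """
    first4 = prefix[:4]

    # A UCS-4 byte order mark starts with the same two bytes as a
    # UTF-16 one, so rule it out before testing for UTF-16.
    if first4 in (b"\x00\x00\xfe\xff", b"\xff\xfe\x00\x00"):
        return None

    # Case 1: the entity begins with a byte order mark (U+FEFF).
    if prefix[:2] == b"\xfe\xff":
        return "UTF-16BE (BOM)"
    if prefix[:2] == b"\xff\xfe":
        return "UTF-16LE (BOM)"

    # Case 2: no byte order mark; the entity begins with "<?".
    if first4 == b"\x00<\x00?":
        return "UTF-16BE (no BOM)"
    if first4 == b"<\x00?\x00":
        return "UTF-16LE (no BOM)"

    return None
```

For example, sniff_utf16('<?xml version="1.0"?>'.encode("utf-16-be")) 
falls into the second case, which is exactly the case that section 4.3.3 
seems to forbid.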

In light of these examples, it seems that the specification intends to 
require a UTF-16 byte order mark only when no XML declaration is used.  Is 
this interpretation of the specification correct?

If the answer is "yes", I would suggest starting the second paragraph of 
section 4.3.3 with: "In the absence of a text declaration (or an XML 
declaration respectively) entities encoded in UTF-16 MUST ..."

If the answer is "no", I would suggest removing the two offending 
examples from Appendix F.1 and adding an appropriate warning.



Dr. Dieter Köhler, M.A.
Wissenschaftlicher Assistent
Institut für Philosophie und
Studienzentrum Multimedia
Universität Karlsruhe (TH)

University address:
Institut für Philosophie der
Universität Karlsruhe (TH)
D-76128 Karlsruhe
GERMANY
Phone:       +49-(0)-721-608-2149
Direct Line: +49-(0)-721-608-7743
Fax:         +49-(0)-721-608-3084

Received on Tuesday, 19 September 2006 16:56:38 UTC