
[Bug 12062] UTF-8 BOM should not be forbidden in Polyglot Markup

From: <bugzilla@jessica.w3.org>
Date: Fri, 04 Mar 2011 23:00:43 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1PvdzH-0005jo-1f@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12062

--- Comment #9 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-03-04 23:00:41 UTC ---
(In reply to comment #8)

> For comment 6, ISSUE 3 the draft now says:
> ]]
> When polyglot markup uses UTF-16, it must include a BOM indicating
> little-endian UTF-16 or big-endian UTF-16, per XML, Character Encoding in
> Entities. [XML10]
> [[
> 
> With a link to Character Encoding in Entities from XML 1.0.

I must admit an error: you were correct in using "the BOM".

The BOM is not a "mark" but a character: U+FEFF. And since it is a
character, it is of course also a character 'of the specific encoding' used.
Therefore it should not be necessary to add 'of the specific encoding',
in contrast to what I said earlier.
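
To illustrate the point that the BOM is one and the same character in every encoding, here is a minimal Python sketch. The helper name bom_bytes is made up for this example and is not taken from any spec; the sketch only shows which bytes U+FEFF becomes when serialized.

```python
# U+FEFF is a single character; only its byte serialization differs
# from one encoding to the next.
BOM = "\ufeff"


def bom_bytes(encoding: str) -> bytes:
    """Return the bytes that U+FEFF becomes in the given encoding.

    Hypothetical helper, for illustration only.
    """
    return BOM.encode(encoding)


for name in ("utf-8", "utf-16-le", "utf-16-be"):
    # Prints: utf-8 efbbbf / utf-16-le fffe / utf-16-be feff
    print(name, bom_bytes(name).hex())
```

The explicit "-le" and "-be" codec names are used so that Python does not prepend a BOM of its own during encoding.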

Three consequences/proposals thereof:

1) Revert "a BOM" to "the BOM", like you had it.
2) Delete the "of the specific encoding" from 
     ]]By using the BOM of the specific encoding.[[
3) I would also suggest simplifying the statement about how to use the UTF-16
encoding: while it is correct that one must use the BOM to indicate little/big
endianness, I think you can just state that one must use the BOM, and nothing
more. Thus I propose to change the above

]]  When polyglot markup uses UTF-16, it must include a BOM indicating
 little-endian UTF-16 or big-endian UTF-16, per XML, Character Encoding in
 Entities. [XML10]  [[
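
As an aside on why the endianness clause adds little: a consumer can read the byte order directly off the BOM. Here is a rough Python sketch, assuming the input is known to be UTF-16; the function name utf16_byte_order is hypothetical and not from any spec.

```python
import codecs


def utf16_byte_order(data: bytes):
    """Return 'big', 'little', or None when no UTF-16 BOM is present.

    Assumes the caller already knows the data is UTF-16; a UTF-32
    little-endian BOM starts with the same two bytes as UTF-16LE.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    return None


sample = "\ufeff<html/>".encode("utf-16-be")
print(utf16_byte_order(sample))  # prints "big"
```

So requiring the BOM already implies that the byte order is indicated.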

Received on Friday, 4 March 2011 23:00:44 GMT
