- From: <bugzilla@jessica.w3.org>
- Date: Sat, 10 Nov 2012 15:47:02 +0000
- To: public-html@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=19931

Priority: P2
Bug ID: 19931
Keywords: externalComments, NE
CC: eliotgra@microsoft.com, mike@w3.org, public-html-wg-issue-tracking@w3.org, public-html@w3.org
Assignee: eliotgra@microsoft.com
Summary: Should not prefer byte order mark with UTF-8
QA Contact: public-html-bugzilla@w3.org
Severity: normal
Classification: Unclassified
OS: All
Reporter: bugz.ate.my.horse@cam.n0b.org
Hardware: All
Status: NEW
Version: unspecified
Component: pre-LC1 HTML/XHTML Compat. Authoring Guide (ed: Eliot Graff)
Product: HTML WG

In the section "Specifying a Document's Character Encoding", it is stated that polyglot markup uses UTF-8. It then says that the preferred way to indicate this encoding is with a Byte Order Mark (BOM).

I feel this is not advisable, for the following reasons:

* UTF-8 does not require a BOM [3].
* A BOM could cause problems with applications (apparently MSIE does or did have a problem) and programming languages (apparently including Java [4][5]).
* A BOM causes otherwise valid ASCII to stop being ASCII.

As such, I would swap the preferred method for indicating UTF-8 inside the document and add a note about using the BOM:

* By using <meta charset="UTF-8"/> (the HTML encoding declaration) (preferred).
* By using the Byte Order Mark (BOM) character (could cause problems in some situations).

References:
[1] https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
[2] https://en.wikipedia.org/wiki/UTF-8#Byte_order_mark
[3] http://www.unicode.org/faq/utf_bom.html#bom5
[4] http://bugs.sun.com/view_bug.do?bug_id=6378911
[5] http://bugs.sun.com/view_bug.do?bug_id=4508058

-- 
You are receiving this mail because:
You are on the CC list for the bug.
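
To illustrate the point that a BOM makes otherwise valid ASCII stop being ASCII, here is a minimal Python sketch. The sample markup string and the helper name `is_ascii` are assumptions made for illustration and do not come from the bug report or the authoring guide.

```python
import codecs

# Hypothetical document content made up only of ASCII characters.
markup = '<meta charset="UTF-8"/>'

plain = markup.encode("utf-8")         # UTF-8 without a BOM
with_bom = markup.encode("utf-8-sig")  # UTF-8 with the 3-byte BOM prepended


def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


# The BOM is the byte sequence EF BB BF at the start of the file.
print(with_bom[:3] == codecs.BOM_UTF8)

# Without the BOM the bytes are pure ASCII; with it they are not.
print(is_ascii(plain), is_ascii(with_bom))

# A decoder that does not strip the BOM sees U+FEFF as the first character,
# which is one way BOM-unaware tools can trip over it.
print(with_bom.decode("utf-8")[0] == "\ufeff")

# A BOM-aware decoder recovers the original text.
print(with_bom.decode("utf-8-sig") == markup)
```

The `utf-8-sig` codec is Python's name for "UTF-8 with a BOM signature"; other languages and tools handle the BOM differently, which is the interoperability concern raised above.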
Received on Saturday, 10 November 2012 15:47:04 UTC