- From: <bugzilla@jessica.w3.org>
- Date: Thu, 15 Dec 2011 00:28:52 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15192

Summary: section 8.1.4 Character references; section 8.2.2.2 Character encodings In section 8.2.2.2, we say, "User agents must at a minimum support the UTF-8 and Windows-1252 encodings, but may support more." In section 8.1.4, we say, "The numeric character refere
Product: HTML WG
Version: unspecified
Platform: Other
URL: http://www.whatwg.org/specs/web-apps/current-work/#top
OS/Version: other
Status: NEW
Severity: normal
Priority: P3
Component: HTML5 spec (editor: Ian Hickson)
AssignedTo: ian@hixie.ch
ReportedBy: contributor@whatwg.org
QAContact: public-html-bugzilla@w3.org
CC: mike@w3.org, public-html-wg-issue-tracking@w3.org, public-html@w3.org

Specification: http://www.w3.org/TR/2011/WD-html5-20110525/
Multipage: http://www.whatwg.org/C#top
Complete: http://www.whatwg.org/c#top

Comment:
section 8.1.4 Character references; section 8.2.2.2 Character encodings

In section 8.2.2.2, we say, "User agents must at a minimum support the UTF-8 and Windows-1252 encodings, but may support more."

In section 8.1.4, we say, "The numeric character reference forms described above are allowed to reference any Unicode code point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), and control characters other than space characters."

What about the characters in the range 0x80 to 0x9F, which in the Windows-1252 encoding are replaced with printable characters? For example, am I allowed to use a Windows-1252 code point, "&#x80;", to reference the Euro character, "€"? Does the browser have to further interpret strings after replacing character references?

I suggest we add a note to 8.1.4 Character references:

"The numeric character references are to Unicode code points, so instead of using character references in the range of &#x80; to &#x9F; from the Windows-1252 encoding, use the appropriate Unicode character. Instead of using character references in the range of &#xD800; to &#xDFFF; as surrogate pairs from the UTF-16 encoding, use the appropriate Unicode character."

Posted from: 96.53.31.86
User agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)

--
Configure bugmail: https://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
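
For illustration only, the behaviour asked about in the comment above can be probed with Python's standard `html.unescape`, which is documented as following the HTML 5 rules for both valid and invalid character references. This is a sketch of how one such decoder treats these references, not a statement of what the specification text required at the time of this report; the variable names are made up for the example.

```python
import html

# A numeric reference in the 0x80-0x9F range. Under the HTML 5 parsing
# rules these are remapped to the Windows-1252 character at that byte value.
legacy_euro = html.unescape("&#x80;")

# The same character referenced by its actual Unicode code point (U+20AC).
unicode_euro = html.unescape("&#x20AC;")

# The upper end of the range: 0x9F is "Ÿ" (U+0178) in Windows-1252.
legacy_y_diaeresis = html.unescape("&#x9F;")

# A lone UTF-16 surrogate is not a valid target; this decoder substitutes
# U+FFFD REPLACEMENT CHARACTER.
surrogate = html.unescape("&#xD800;")

# For comparison, decoding the raw byte 0x80 as Windows-1252.
cp1252_euro = bytes([0x80]).decode("cp1252")

print(repr(legacy_euro), repr(unicode_euro), repr(legacy_y_diaeresis), repr(surrogate))
```

Both spellings of the Euro sign decode to the same character here, but only the U+20AC form refers to the Unicode code point directly, which is the form the suggested note recommends.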
Received on Thursday, 15 December 2011 00:28:59 UTC