- From: <bugzilla@jessica.w3.org>
- Date: Thu, 15 Dec 2011 00:28:52 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15192
Summary: section 8.1.4 Character references; section 8.2.2.2
Character encodings In section 8.2.2.2, we say, "User
agents must at a minimum support the UTF-8 and
Windows-1252 encodings, but may support more." In
section 8.1.4, we say, "The numeric character refere
Product: HTML WG
Version: unspecified
Platform: Other
URL: http://www.whatwg.org/specs/web-apps/current-work/#top
OS/Version: other
Status: NEW
Severity: normal
Priority: P3
Component: HTML5 spec (editor: Ian Hickson)
AssignedTo: ian@hixie.ch
ReportedBy: contributor@whatwg.org
QAContact: public-html-bugzilla@w3.org
CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
public-html@w3.org
Specification: http://www.w3.org/TR/2011/WD-html5-20110525/
Multipage: http://www.whatwg.org/C#top
Complete: http://www.whatwg.org/c#top
Comment:
section 8.1.4 Character references; section 8.2.2.2 Character encodings
In section 8.2.2.2, we say, "User agents must at a minimum support the UTF-8
and Windows-1252 encodings, but may support more."
In section 8.1.4, we say, "The numeric character reference forms described
above are allowed to reference any Unicode code point other than U+0000,
U+000D, permanently undefined Unicode characters (noncharacters), and control
characters other than space characters."
What about the characters in the range 0x80 to 0x9F, which the Windows-1252
encoding maps to printable characters?
For example, am I allowed to use a Windows-1252 code point, "&#x80;", to
reference the Euro character, "€"? Does the browser have to further
interpret strings after replacing character references?
I suggest we add a note to 8.1.4 Character references:
"The numeric character references are to Unicode code points, so instead of
using character references in the range of &#x80; to &#x9F; from the
Windows-1252 encoding, use the appropriate Unicode character. Instead of using
character references in the range of &#xD800; to &#xDFFF; as surrogate pairs
from the UTF-16 encoding, use the appropriate Unicode character."
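To illustrate the suggested note, here is a minimal Python sketch of the
mapping it describes. It is not part of the spec text. The helper names
cp1252_to_unicode and preferred_reference are hypothetical, and the sketch
assumes Python's "cp1252" codec as a stand-in for the Windows-1252 encoding
(that codec leaves a few bytes, such as 0x81, undefined).

```python
from typing import Optional


def cp1252_to_unicode(byte_value: int) -> Optional[str]:
    """Return the character Python's cp1252 codec assigns to one byte.

    Returns None for bytes the codec leaves undefined (for example 0x81).
    """
    try:
        return bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return None


def preferred_reference(byte_value: int) -> Optional[str]:
    """Build the numeric character reference for the Unicode code point
    that corresponds to a Windows-1252 byte (hypothetical helper)."""
    ch = cp1252_to_unicode(byte_value)
    if ch is None:
        return None
    # Numeric character references name Unicode code points, so emit the
    # code point of the decoded character, not the original byte value.
    return f"&#x{ord(ch):X};"


if __name__ == "__main__":
    # Byte 0x80 is the Euro sign in Windows-1252; its code point is U+20AC.
    print(preferred_reference(0x80))
    # Byte 0x9F is U+0178 (capital Y with diaeresis).
    print(preferred_reference(0x9F))
```

Under these assumptions, an author who wants the Euro sign would write the
reference produced for byte 0x80 rather than a reference to 0x80 itself.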
Posted from: 96.53.31.86
User agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64;
Trident/5.0)
Received on Thursday, 15 December 2011 00:28:59 UTC