- From: <bugzilla@jessica.w3.org>
- Date: Fri, 24 Feb 2012 11:37:51 +0000
- To: public-html@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=16106 Summary: Clarify paragraph about character references in tokenization.html Product: HTML WG Version: unspecified Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: HTML5 spec (editor: Ian Hickson) AssignedTo: ian@hixie.ch ReportedBy: ezio.melotti@gmail.com QAContact: public-html-bugzilla@w3.org CC: mike@w3.org, public-html-wg-issue-tracking@w3.org, public-html@w3.org In the tokenization.html page, in the section "8.2.4.69 Tokenizing character references", after the table, it says: """ Otherwise, return a character token for the Unicode character whose code point is that number. If the number is in the range 0x0001 to 0x0008, 0x000E to 0x001F, 0x007F to 0x009F, 0xFDD0 to 0xFDEF, or is one of 0x000B, 0xFFFE, 0xFFFF, 0x1FFFE, 0x1FFFF, 0x2FFFE, 0x2FFFF, 0x3FFFE, 0x3FFFF, 0x4FFFE, 0x4FFFF, 0x5FFFE, 0x5FFFF, 0x6FFFE, 0x6FFFF, 0x7FFFE, 0x7FFFF, 0x8FFFE, 0x8FFFF, 0x9FFFE, 0x9FFFF, 0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE, 0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or 0x10FFFF, then this is a parse error. """ As far as I understand, the character is still returned even if it's a parse error, but this is not clear. The current wording might suggest that the character is returned, /but/ if the number is in those ranges, then it's a parse error (and it doesn't say what should be returned). I suggest rephrasing it a bit to state explicitly that the character corresponding to that value is returned in both the cases. -- Configure bugmail: https://www.w3.org/Bugs/Public/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
Received on Sunday, 26 February 2012 04:55:05 UTC