- From: James Clark <jjc@jclark.com>
- Date: Sun, 9 Sep 2012 10:50:53 +0700
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: public-microxml@w3.org
- Message-ID: <CANz3_Ea4+y8Z7BUwCGrnAE-z97vX_4Zvv9u9pzb1gkE2S5xHTQ@mail.gmail.com>
Writing the production for char like this would, I think, make the logic behind the definition clearer:

  char ::= s | ([#x0-#x10FFFF] - forbiddenChar)
  forbiddenChar ::= controlCodePoint | surrogateCodePoint | nonCharacterCodePoint
  controlCodePoint ::= [#x0-#x1F] | [#x7F-#x9F]
  # The 66 noncharacters defined by Unicode
  nonCharacterCodePoint ::= [#xFDD0-#xFDEF] | [#xFFFE-#xFFFF]
      | [#x1FFFE-#x1FFFF] | [#x2FFFE-#x2FFFF] | [#x3FFFE-#x3FFFF]
      | [#x4FFFE-#x4FFFF] | [#x5FFFE-#x5FFFF] | [#x6FFFE-#x6FFFF]
      | [#x7FFFE-#x7FFFF] | [#x8FFFE-#x8FFFF] | [#x9FFFE-#x9FFFF]
      | [#xAFFFE-#xAFFFF] | [#xBFFFE-#xBFFFF] | [#xCFFFE-#xCFFFF]
      | [#xDFFFE-#xDFFFF] | [#xEFFFE-#xEFFFF] | [#xFFFFE-#xFFFFF]
      | [#x10FFFE-#x10FFFF]

The definition of nameStartChar also needs to exclude nonCharacterCodePoints, e.g. by changing the last bit to

  ([#xF900-#xEFFFF] - nonCharacterCodePoint)

James

On Sun, Sep 9, 2012 at 3:45 AM, John Cowan <cowan@mercury.ccil.org> wrote:
> James Clark scripsit:
>
> > I would either leave the list out completely or put it in the syntax.
>
> Oh, in the syntax, absolutely.
>
> --
> Barry thirteen gules and argent on a canton azure           John Cowan
> fifty mullets of five points of the second,                 cowan@ccil.org
> six, five, six, five, six, five, six, five, and six.
>     --blazoning the U.S. flag                http://www.ccil.org/~cowan
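As an illustration only, the char production proposed in this message can be sketched as a set of Python predicates on code points. Two things here are assumptions, not taken from the message: `s` is treated as the four XML whitespace characters (#x9, #xA, #xD, #x20), and surrogateCodePoint is taken to be [#xD800-#xDFFF], since neither is defined in the message itself. All function names are hypothetical.

```python
# Sketch of the proposed char production as code point predicates.
# Assumption: `s` is XML whitespace; the message does not define it.
WHITESPACE = frozenset({0x9, 0xA, 0xD, 0x20})


def is_control(cp: int) -> bool:
    # controlCodePoint ::= [#x0-#x1F] | [#x7F-#x9F]
    return 0x0 <= cp <= 0x1F or 0x7F <= cp <= 0x9F


def is_surrogate(cp: int) -> bool:
    # Assumption: surrogateCodePoint ::= [#xD800-#xDFFF]
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    # [#xFDD0-#xFDEF] plus the last two code points of each of the 17 planes.
    if not 0 <= cp <= 0x10FFFF:
        return False
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_forbidden(cp: int) -> bool:
    # forbiddenChar ::= controlCodePoint | surrogateCodePoint | nonCharacterCodePoint
    return is_control(cp) or is_surrogate(cp) or is_noncharacter(cp)


def is_char(cp: int) -> bool:
    # char ::= s | ([#x0-#x10FFFF] - forbiddenChar)
    return cp in WHITESPACE or (0 <= cp <= 0x10FFFF and not is_forbidden(cp))


def count_noncharacters() -> int:
    # Counts the code points matched by is_noncharacter over the full range.
    return sum(is_noncharacter(cp) for cp in range(0x110000))
```

Counting with `count_noncharacters()` is one way to check the "66 noncharacters" comment in the grammar: 32 code points in [#xFDD0-#xFDEF] plus two at the end of each of the 17 planes. Note that tab, line feed, and carriage return fall inside controlCodePoint but are still accepted because the `s` alternative is tested first.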
Received on Sunday, 9 September 2012 03:51:41 UTC