- From: <bugzilla@wiggum.w3.org>
- Date: Sat, 15 Dec 2007 20:29:39 +0000
- To: www-xml-schema-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=5321

Summary: REs are not production nonterminals
Product: XML Schema
Version: 1.0/1.1 both
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Datatypes: XSD Part 2
AssignedTo: cmsmcq@w3.org
ReportedBy: davep@iit.edu
QAContact: www-xml-schema-comments@w3.org

1. It appears that in the productions defining REs in Appendix G, we use REs (usually character classes) as though they were nonterminals. An example is the production for normal characters:

    Char ::= [^.\?*+{}()|#x5B#x5D]

In productions, normally each nonterminal is the LHS of a production, and each terminal is a character string denoting itself. An RE other than a single character string denoting itself is neither. In the appendix, terminals are quoted strings and nonterminals are names linked to their defining production. These neither-fish-nor-fowl REs are displayed as unquoted strings. Perhaps they could be hyperlinked to a paragraph describing this modification to the standard production system. (But can the necessary productions for character classes be made without circularity? That may need some thought.)

2. Similarly, "#-escapes" representing characters via their Unicode code numbers are not normally allowed in our REs--at least I can't find anything that allows them. Nor can I find anything that makes an exception for REs-that-are-nonterminals-in-productions. At least a note, and some kind of special treatment within the RE seems appropriate. (Actually, I wish that the codes were explained in a text note near each use of such codes; I suspect that I'm not the only reader who doesn't have the codes memorized. Perhaps the special treatment could be a hyperlink to such an explanation.)

We do not currently define the production system we currently use. If we really want to have a non-standard production system which allows REs as additional RHS components, we need to define it.
However, I think since we use the production system to define the REs, this could get very circular unless we are both careful, and lucky that the circularity can be avoided. Expressing a small positive character class as an "or" of single characters is easy enough. But I'm not sure how to deal with a large character class, such as the negative character class of the production quoted above.
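
To make the size of the problem concrete, here is a rough Python sketch (not part of the spec or of any proposed fix) that expands the negative character class from the Char production quoted above into the code point ranges it admits. The names `EXCLUDED`, `MAX_CODE_POINT`, and `complement_ranges` are hypothetical, and the sketch assumes the class is taken over all code points 0 through #x10FFFF, ignoring any further restriction to legal XML characters.

```python
# Characters excluded by the Char production: . \ ? * + { } ( ) | [ ]
# (#x5B and #x5D in the production are "[" and "]").
EXCLUDED = ".\\?*+{}()|[]"

# Assumption: the class ranges over every Unicode code point.
MAX_CODE_POINT = 0x10FFFF


def complement_ranges(excluded, max_cp=MAX_CODE_POINT):
    """Return inclusive (low, high) code point ranges NOT in `excluded`."""
    ranges = []
    start = 0
    for cp in sorted({ord(c) for c in excluded}):
        if cp > start:
            ranges.append((start, cp - 1))
        start = cp + 1
    if start <= max_cp:
        ranges.append((start, max_cp))
    return ranges


def format_range(low, high):
    """Render one range using #x-style code point notation."""
    if low == high:
        return "#x%X" % low
    return "[#x%X-#x%X]" % (low, high)


if __name__ == "__main__":
    char_ranges = complement_ranges(EXCLUDED)
    print("Char ::= " + " | ".join(format_range(lo, hi) for lo, hi in char_ranges))
```

Under those assumptions the class collapses to a handful of ranges, but the last one runs from #x7E to #x10FFFF, so writing it as an "or" of single-character terminals is not practical. Some range notation on the right-hand side still seems to be needed, and that notation is exactly what would have to be defined without leaning on the REs themselves.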
Received on Saturday, 15 December 2007 20:29:46 UTC