- From: by way of <veillard@redhat.com>
- Date: Wed, 03 Apr 2002 14:04:08 -0500
- To: W3C XML Schema Comments list <www-xml-schema-comments@w3.org>
Apparently the regular expressions given for matching XPath
strings are incorrect:
XMLSchema.xsd around line 1061 (there is a similar problem for
the "selector" definition).
-------------------
<xs:attribute name="xpath" use="required">
<xs:simpleType>
<xs:annotation>
<xs:documentation>A subset of XPath expressions for use
in selectors</xs:documentation>
<xs:documentation>A utility type, not for public
use</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:token">
<xs:annotation>
<xs:documentation>The following pattern is intended to allow XPath
expressions per the following EBNF:
Selector ::= Path ( '|' Path )*
Path ::= ('.//')? Step ( '/' Step )*
Step ::= '.' | NameTest
NameTest ::= QName | '*' | NCName ':' '*'
child:: is also allowed
</xs:documentation>
</xs:annotation>
<xs:pattern
value="(\.//)?(((child::)?((\i\c*:)?(\i\c*|\*)))|\.)(/(((child::)?((\i\c*:)?(\i\c*|\*)))|\.))*(\|(\.//)?(((child::)?((\i\c*:)?(\i\c*|\*)))|\.)(/(((child::)?((\i\c*:)?(\i\c*|\*)))|\.))*)*">
</xs:pattern>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
-------------------
According to the documentation, and to the XPath path definition, one
would not expect "a:1" to match, would one?
Yet I was surprised by the result of one of my tests:
paphio:~/regexp -> ./testRegexp
'(\.//)?(((child::)?((\i\c*:)?(\i\c*|\*)))|\.)(/(((child::)?((\i\c*:)?(\i\c*|\*)))|\.))*(\|(\.//)?(((child::)?((\i\c*:)?(\i\c*|\*)))|\.)(/(((child::)?((\i\c*:)?(\i\c*|\*)))|\.))*)*'
'a:1'
a:1: Ok
paphio:~/regexp ->
which can be reduced to:
paphio:~/regexp -> ./testRegexp '(\i\c*)' 'a:1'
a:1: Ok
paphio:~/regexp ->
Ah ... Back to the definition:
http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#nt-MultiCharEsc
\c the set of name characters, those matched by NameChar
which points to:
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-NameChar
[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' |
CombiningChar | Extender
and ':' is part of NameChar as defined by the XML specification.
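The effect can be reproduced outside of a Schema regexp engine with a rough sketch in Python. Python's re module has no \i or \c escapes, so the two character classes below are ASCII-only stand-ins (an assumption; the real classes also cover non-ASCII letters, digits, combining characters and extenders), and the helper names are made up for illustration:

```python
import re

# ASCII-only stand-ins for the XML Schema multi-character escapes.
# Assumption: the real \i and \c also include many non-ASCII characters.
NAME_START = r"[A-Za-z_:]"        # approximates \i (Letter | '_' | ':')
NAME_CHAR = r"[A-Za-z0-9._:\-]"   # approximates \c (NameChar), note the ':'


def translate(xsd_pattern: str) -> str:
    """Rewrite \\i and \\c into classes that Python's re understands."""
    return xsd_pattern.replace(r"\i", NAME_START).replace(r"\c", NAME_CHAR)


name_re = re.compile(translate(r"\i\c*"))


def matches(value: str) -> bool:
    # Schema patterns are implicitly anchored, hence fullmatch().
    return name_re.fullmatch(value) is not None


print(matches("a:1"))  # True: ':' and '1' are both name characters
print(matches("1a"))   # False: a digit cannot start a name
```

Since ':' sits inside the name-character class, the single token \i\c* swallows the whole of "a:1" without ever going through the optional prefix branch.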
XPath, when defining NameTest, actually references the Namespaces in XML REC
to avoid this problem:
http://www.w3.org/TR/xpath#NT-NameTest
http://www.w3.org/TR/REC-xml-names/#NT-NCName
[4] NCName ::= (Letter | '_') (NCNameChar)* /* An XML Name, minus the
":" */
[5] NCNameChar ::= Letter | Digit | '.' | '-' | '_' | CombiningChar |
Extender
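For comparison, here is a minimal sketch of the Selector grammar built on NCName instead of \i\c*. It is assembled from the EBNF quoted in the schema documentation, not copied from the published pattern, and the NCName class is again an ASCII-only approximation (an assumption); all names are hypothetical:

```python
import re

# ASCII-only approximation of NCName: an XML Name without ':'.
# Assumption: non-ASCII letters, combining chars and extenders are left out.
NCNAME = r"[A-Za-z_][A-Za-z0-9._\-]*"

# NameTest ::= QName | '*' | NCName ':' '*'
NAME_TEST = rf"(?:(?:{NCNAME}:)?(?:{NCNAME}|\*))"
# Step ::= '.' | NameTest    (child:: is also allowed before a NameTest)
STEP = rf"(?:(?:child::)?{NAME_TEST}|\.)"
# Path ::= ('.//')? Step ( '/' Step )*
PATH = rf"(?:\.//)?{STEP}(?:/{STEP})*"
# Selector ::= Path ( '|' Path )*
SELECTOR = rf"{PATH}(?:\|{PATH})*"

selector_re = re.compile(SELECTOR)


def is_selector(value: str) -> bool:
    # Schema patterns are implicitly anchored, hence fullmatch().
    return selector_re.fullmatch(value) is not None


print(is_selector("a:1"))                 # False: "1" is not an NCName
print(is_selector("a:b"))                 # True: prefixed QName
print(is_selector(".//a:b/child::c|*"))   # True: two paths joined by '|'
```

With the colon removed from the name classes, the only way to consume ':' is the explicit prefix branch, so "a:1" is rejected while "a:b" and "a:*" still pass.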
Conclusion:
- The Schema for Schemas does not define a subset of XPath, damn...
Suggestions:
- Change the definition to base it on either the XPath specification or
the Namespaces in XML specification
- Change the occurrences of "\i\c*" in the regular expression so that
they no longer accept ":"
- For the record I concur with others on the opinion that
XML Schema Part 1: Structures is nearly impossible to understand
and should be rewritten with more prose and without sentences longer than
25 words.
Question:
- Did anyone implement Schemas by actually reading the specification
and transcribing it literally using a programming language?
thanks,
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
Received on Wednesday, 3 April 2002 14:07:05 UTC