- From: <bugzilla@jessica.w3.org>
- Date: Sat, 05 Jan 2013 22:55:36 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=20575

Bug ID: 20575
Summary: [QT3TS] test-case re00216 in test-set fn-matches.re
Classification: Unclassified
Product: XPath / XQuery / XSLT
Version: Last Call drafts
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: XQuery 3 & XPath 3 Test Suite
Assignee: oneil@saxonica.com
Reporter: mike@saxonica.com
QA Contact: public-qt-comments@w3.org

This test does matches('qwerty','\p{IsaA0-a9}') and expects an error on the grounds that the regular expression is invalid, since "IsA0-a9" is not a recognized group name.

The specification states that an error is raised "if the value of $pattern is invalid according to the rules described in 5.6.1 Regular expression syntax", and section 5.6.1 says "The regular expression syntax and semantics are identical to those defined in [XML Schema Part 2: Datatypes Second Edition] with the following [irrelevant] additions..."

The reference is to XSD 1.0, but we state "Implementations of this specification may support either XSD 1.0 or XSD 1.1 or both."

The relevant syntax rule in both XSD 1.0 and XSD 1.1 is:

IsBlock ::= 'Is' [a-zA-Z0-9#x2D]+

Thus this regular expression matches the syntax.

In XSD 1.0, no semantics are given for a regular expression that uses an unknown block name, but it is nowhere stated that this is an error. The situation is clarified in XSD 1.1:

<quote>
If a string "IsX" matches the non-terminal IsBlock but X is not a recognized block name, then the expressions "\p{IsX}" and "\P{IsX}" each denote the set of all characters. Processors may ·at user option· treat both "\p{IsX}" and "\P{IsX}" as denoting the empty set, instead of the set of all characters.... Processors should issue a warning if they encounter a regular expression using a block name they do not recognize. Processors may ·at user option· treat unrecognized block names as ·errors· in the schema.

Note: Treating unrecognized block names as errors increases the likelihood that errors in spelling the block name will be detected and can be helpful in checking the correctness of schema documents. However, it also decreases the portability of schema documents among processors supporting different versions of [Unicode Database]; it is for this reason that processors are allowed to treat unrecognized block names as errors only when the user has explicitly requested this behavior.
</quote>

We clearly have the opportunity to say something different for XPath regular expressions, but currently we do not do so. I think a clarification in the spec would be appropriate.

In the meantime, based on the XSD 1.1 rules which we inherit, I propose to allow the alternative result "false".

--
You are receiving this mail because:
You are the QA Contact for the bug.
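For illustration only, here is a minimal Python sketch of the two points argued above: that the name in the test satisfies the IsBlock production, and what the quoted XSD 1.1 text implies for a pattern consisting solely of an unrecognized block escape. The names IS_BLOCK, KNOWN_BLOCKS, classify and matches_single_unknown_block are hypothetical, and the three-entry block table is a stand-in assumed for the example, not any processor's real Unicode block list.

```python
import re

# Transcription of the production quoted in the report:
#   IsBlock ::= 'Is' [a-zA-Z0-9#x2D]+
# (#x2D is the hyphen-minus character.)
IS_BLOCK = re.compile(r"Is[a-zA-Z0-9\-]+")

# Hypothetical, deliberately tiny stand-in for a processor's block table.
KNOWN_BLOCKS = {"BasicLatin", "Greek", "Cyrillic"}


def classify(name: str) -> str:
    """Classify the text found between the braces of a block escape."""
    if IS_BLOCK.fullmatch(name) is None:
        return "syntax error"
    if name[2:] in KNOWN_BLOCKS:
        return "recognized block"
    return "valid syntax, unrecognized block"


def matches_single_unknown_block(text: str, empty_set: bool = False) -> bool:
    """Model matching text against a pattern that is one unrecognized block escape.

    Per the quoted XSD 1.1 text, the escape denotes the set of all
    characters by default, or the empty set at user option.
    """
    if empty_set:
        return False          # nothing can match the empty set
    return len(text) > 0      # any single character matches


print(classify("IsaA0-a9"))
print(matches_single_unknown_block("qwerty"))
print(matches_single_unknown_block("qwerty", empty_set=True))
```

Under these assumptions the name is syntactically valid but unrecognized, and the outcome for 'qwerty' depends on which of the two XSD 1.1 interpretations the processor applies.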
Received on Saturday, 5 January 2013 22:55:38 UTC