RE: Online validator: '[', ']', and '-' freely accepted in patterns

I can't comment on any problems there may be in XSV, but it's impossible to
write an implementation that conforms to the current spec in this area,
because the spec is internally inconsistent. The bug report has been open
for a long time:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1889

which refers to

http://lists.w3.org/Archives/Public/www-xml-schema-comments/2005JulSep/0030.html

Michael Kay


> -----Original Message-----
> From: xmlschema-dev-request@w3.org 
> [mailto:xmlschema-dev-request@w3.org] On Behalf Of Dave Evans
> Sent: 09 January 2007 17:28
> To: HT@INF.ED.AC.UK
> Cc: XMLSCHEMA-DEV@W3.ORG
> Subject: Online validator: '[', ']', and '-' freely accepted 
> in patterns
> 
> 
> Hi, I have tried to use the online schema validator:
> 
>    http://www.w3.org/2001/03/webdata/xsv
> 
> I am trying to validate an instance document.  In the "Address(es):"
> input area, I have entered two URLs, separated by a space:
> 
>   http://sirius-software.com/daveweb/resources/schtst03.xml
>   http://sirius-software.com/daveweb/resources/schtst03.xsd
> 
> I think that this schema should not be accepted, because it 
> contains a pattern which is not legal according to the recommendation:
> 
>         <xs:pattern value="(&#x5B;w-x-&#x5B;y-z&#x5D;|&#x5D;)+"/>
> 
> (The pattern is equivalent to  ([w-x-[y-z]|])+ but I have 
> used character references in case square brackets don't 
> transmit well from my e-mail to yours.)
> 
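
The equivalence stated above can be checked mechanically. The following is only a small sketch: it uses Python's standard html.unescape to resolve the numeric character references, and the variable names are made up for illustration.

```python
import html

# The pattern value exactly as it appears in the schema attribute.
escaped = "(&#x5B;w-x-&#x5B;y-z&#x5D;|&#x5D;)+"

# html.unescape resolves numeric character references such as &#x5B; and &#x5D;.
decoded = html.unescape(escaped)
print(decoded)  # ([w-x-[y-z]|])+
```
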
> In order to parse this pattern, I believe that the final 
> right square bracket (&#x5D;) must correspond to production 
> [10] for Char, but that production explicitly excludes left 
> and right square brackets.
> 
> Similarly, within the character class, the 2nd hyphen ("-") 
> and the 2nd left square bracket (&#x5B;) each must correspond 
> to production [22] for XmlCharIncDash, but that production 
> excludes left and right square brackets, and furthermore the 
> third bullet item following the Character Range productions 
> stipulates that a hyphen can only occur at the start or end 
> of a positive character group, so the 2nd hyphen should be illegal.
> 
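
The reading described above can be sketched as a naive left-to-right scan. This is not an implementation of the Recommendation's grammar: the function name find_suspect_chars is hypothetical, escapes and character class subtraction are deliberately ignored, and the first "[" is assumed to open a class that the next "]" closes. It only flags the three characters the argument above is about.

```python
def find_suspect_chars(pattern):
    """Flag brackets and hyphens that the naive reading cannot place.

    Assumptions (illustrative only): no backslash escapes, no class
    subtraction, and a class runs from '[' to the next ']'.
    """
    issues = []
    in_class = False
    class_start = -1
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if not in_class:
            if ch == "[":
                in_class, class_start = True, i
            elif ch == "]":
                issues.append((i, "']' outside a character class"))
            i += 1
        elif ch == "]":
            in_class = False
            i += 1
        elif ch == "[":
            issues.append((i, "'[' inside a character class"))
            i += 1
        elif ch == "-":
            at_start = i == class_start + 1
            at_end = i + 1 < len(pattern) and pattern[i + 1] == "]"
            if not (at_start or at_end):
                issues.append((i, "'-' not at start or end of the class"))
            i += 1
        elif i + 2 < len(pattern) and pattern[i + 1] == "-" and pattern[i + 2] not in "[]-":
            i += 3  # consume a simple range such as w-x
        else:
            i += 1
    return issues


for position, reason in find_suspect_chars("([w-x-[y-z]|])+"):
    print(position, reason)
```

Under these assumptions the scan reports offsets 5, 6 and 12 (counting from zero): the second hyphen, the second left square bracket, and the final right square bracket.
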
> I have asked before about pattern support in the online 
> validator, and I appreciated your quick feedback, telling me 
> that sometimes an instance document which is invalid due to a 
> pattern restriction is erroneously accepted as valid.
> 
> In this case, it seems to me that a schema which contains a 
> syntactically invalid pattern is accepted.
> 
> My guess is that you are using a regex engine which has 
> neither the same syntax nor the same semantics as pattern 
> restrictions in the XML Schema Recommendation, resulting in 
> these "falsely valid" schemas.
> 
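
To illustrate how a general-purpose engine can differ here, the sketch below feeds the same pattern to Python's re module. Python is used purely as an example; nothing in this thread says which engine the online validator actually uses. Python's re treats a "[" inside a class and a "]" outside one as literal characters, so the pattern compiles without error.

```python
import re

# The decoded pattern from the schema under discussion.
pattern = "([w-x-[y-z]|])+"

# Python's re compiles this without complaint: '[' and '-' inside the
# class, and ']' outside it, are all taken as literal characters.
compiled = re.compile(pattern)

accepted = compiled.fullmatch("wx-[yz]") is not None
rejected = compiled.fullmatch("a") is None
print(accepted, rejected)  # True True
```
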
> I just wanted to check if that is the case, and would 
> appreciate any comments you may have.
> 
> Thank you again.
> 
>         Dave Evans.
>         Sirius Software, Inc.
>         875 Massachusetts Ave., Suite 21
>         Cambridge, MA 02139
>         617-876-6677 x204,  fax 617-234-1204
>         e-mail:  dme@sirius-software.com
> 

Received on Tuesday, 9 January 2007 20:57:29 UTC