RE: ACTION A-645-07: schema for serialization parameters

> -----Original Message-----
> From: C. M. Sperberg-McQueen [mailto:cmsmcq@blackmesatech.com]
> 
> I think the whitespace rules of XPath say that this should be legal,
> since the grammar given in XPath is
> 
> [117] URIQualifiedName ::= BracedURILiteral NCName
> [118] BracedURILiteral ::= "Q" "{" [^{}]* "}"
> 
> and neither rule carries the /* ws:explicit */ annotation.
> 
> So the new schema makes these values legal.
> 

I'm curious: where did you see the productions *without* the ws:explicit annotation? The inline copies of these rules in the body of the text do not carry the ws:explicit comments, which is perhaps a bit unfortunate. But under section A.2.1 Terminal Symbols in XPath 3.0, the 3.1 public CR and the 3.1 internal CR, I see the following production rules:

[117]    URIQualifiedName    ::=    BracedURILiteral NCName  /* ws: explicit */
[118]    BracedURILiteral    ::=    "Q" "{" [^{}]* "}" /* ws: explicit */

In other words, whitespace is prohibited within these productions.

> 2 The literal "Q{ }bar" is accepted by the status-quo schema and
> rejected by the new one.  The old one uses the pattern
> 
>   (.*\{.+\}.*)
> 
> to force the namespace name to be non-empty; the new schema uses
> 
>   Q\{(.*\S.*)\}.*

I think that both the old and the new patterns are too lenient: both still accept whitespace and braces inside the braces. I propose the following regular expression instead:

Q\{[^{} \t\r\n]+\}.*
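
To make the difference concrete, here is a minimal Python sketch that runs the old, the new and the proposed pattern over a few sample literals. It assumes that `re.fullmatch` is a fair stand-in for the implicit anchoring of XSD patterns; Python's `.` and `\S` differ slightly from their XSD counterparts, which should not matter for these ASCII samples. The sample URIs and the helper name `accepted_by` are made up for illustration.

```python
import re

# XSD patterns are implicitly anchored, so fullmatch is used throughout.
PATTERNS = {
    "old": r"(.*\{.+\}.*)",
    "new": r"Q\{(.*\S.*)\}.*",
    "proposed": r"Q\{[^{} \t\r\n]+\}.*",
}


def accepted_by(literal):
    """Return the names of the patterns that accept the whole literal."""
    return [
        name
        for name, pattern in PATTERNS.items()
        if re.fullmatch(pattern, literal)
    ]


SAMPLES = [
    "Q{http://example.com}bar",    # no whitespace anywhere
    "Q{ }bar",                     # whitespace-only namespace name
    "Q{ http://example.com }bar",  # padded namespace name
    "Q{}bar",                      # empty namespace name
]

for sample in SAMPLES:
    print(repr(sample), accepted_by(sample))
```

In this sketch only the proposed pattern rejects the padded namespace name, while all three agree on the first and the last sample.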

A further improvement would be to disallow a bare "Q{...}" (i.e., without a following NCName) and to require non-whitespace characters throughout:

Q\{[^{} \t\r\n]+\}\S+
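
A quick way to try the stricter pattern is the Python sketch below. It again assumes that `re.fullmatch` approximates XSD's implicit anchoring; the function name `is_strict_match` and the sample literals are hypothetical.

```python
import re

# The stricter pattern: non-empty, whitespace-free namespace name,
# followed by at least one non-whitespace character.
STRICT = re.compile(r"Q\{[^{} \t\r\n]+\}\S+")


def is_strict_match(literal):
    """True if the whole literal matches the stricter pattern."""
    return STRICT.fullmatch(literal) is not None


SAMPLES = [
    "Q{http://example.com}bar",   # namespace name plus local part
    "Q{http://example.com}",      # no local part
    "Q{http://example.com} bar",  # whitespace before the local part
    "Q{http://example.com}1bar",  # local part that is not an NCName
]

for sample in SAMPLES:
    print(repr(sample), is_strict_match(sample))
```

Note that `\S+` is still broader than NCName: the last sample is accepted by the pattern even though an NCName cannot start with a digit.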

PS: I will test the new XSD against the MS validator as well today.

Cheers,
Abel

Received on Tuesday, 21 June 2016 11:25:10 UTC