Re: ACTION A-645-07: schema for serialization parameters

On Jun 25, 2016, at 6:55 AM, Abel Braaksma wrote:

> On the whitespace issue, whether it is allowed or disallowed, I stand corrected. I thought from last meeting that the WG agreed that it was disallowed unless specified. Here it is implicitly specified. So I looked up what we say in the XPath text on EQName and there we write:
> 
> "The namespace URI value is whitespace normalized according to the rules for the xs:anyURI type in Section 3.2.17 anyURI" 
> 
> While I couldn't find anything on whitespace normalization in the XSD spec,

The type xsd:anyURI assigns the value 'collapse' to the whitespace facet,
which means leading and trailing whitespace is stripped (I ought to have
done tests for that, too, perhaps) and each non-empty sequence of internal 
whitespace is collapsed to a single blank.

http://www.w3.org/TR/xmlschema11-2/#anyURI
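
For concreteness, here is a minimal Python sketch of what 'collapse'
does to a lexical form, as I read the whiteSpace facet. The function
name xsd_collapse is my own invention for illustration, not anything
defined by XSD or by any library.

```python
import re

def xsd_collapse(lexical: str) -> str:
    """Sketch of whiteSpace='collapse' (hypothetical helper, not a library API)."""
    # 'replace' step: each tab, line feed, and carriage return becomes a space.
    replaced = re.sub(r"[\t\n\r]", " ", lexical)
    # 'collapse' step: runs of spaces shrink to one space,
    # then leading and trailing spaces are removed.
    return re.sub(r" +", " ", replaced).strip(" ")

print(xsd_collapse("  http://example.com/a \t b  "))
```

Note that the single internal blank survives: collapsing normalizes
whitespace, it does not remove it from the URI.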


> it does say, in the XSD spec:
> 
> "Spaces are, in principle, allowed in the ·lexical space· of anyURI, however, their use is highly discouraged (unless they are encoded by %20)."

This is a legacy issue: the IRI draft relied on by XSD 1.0 allowed blanks;
the final draft did not. Some members of the XSD 1.1 working group refused
to align the 1.1 definition with the IRI spec, because they thought it 
better for the XSD anyURI type to differ from the definition of IRIs in 
RFC 3987 and its successors than for an anyURI value valid according to 
XSD 1.0 to become invalid against XSD 1.1.  There was no consensus, and 
the blank was retained.  (This is why in my note I said that the URI and 
IRI specs don't allow blanks, not that the XSD spec for anyURI doesn't.  
As unhelpful as the relevant RFCs are for deciding what should and what 
should not count as a URI or IRI, XSD is in some ways worse.)

> 
> So, I would assume that normalization does *not* take place (I will raise an XPath bug for this, as the specs seem to disagree with each other on this).

No, I don't think so.  The XPath spec is consistent with XSD here, I think.

> 
> In short: I agree with your assessment that whitespace SHOULD be allowed

I don't think I said that, at least not in the sense that I think it's a good idea
to allow whitespace in URIs or to allow whitespace within the brackets in 
an EQName.  I think I said that the XPath grammar currently seems 
to allow them, and that I believe the schema for serialization parameters 
should align with the XPath grammar.  

> (that it is discouraged does not mean it is disallowed). But it may introduce a new issue, as currently we specify whitespace="collapse", which may not be the right thing to do.
> 
> I will respond on the other issue in-line in a follow-up mail.
> 
> (musings: if whitespace is allowed in a namespace URI, how can that namespace ever be used in an xsi:schemaLocation attribute?)

It cannot be; that is one of the reasons some members of the XSD 1.1 WG would
have preferred to update the definition of anyURI to align with what the
IRI spec said in the end.  Since XSD 1.1 did not make that change, the
consequence is merely that there are legal values of xs:anyURI which cannot
usefully be used in the xsi:schemaLocation attribute.  (And values in the
value space of anyURI which can never be accepted from any string that
undergoes whitespace normalization, e.g. those with adjacent blanks.)
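
To illustrate the xsi:schemaLocation point: the attribute value is a
whitespace-separated list of (namespace name, schema location) pairs,
so a blank inside a namespace name is indistinguishable from a
separator. The following is only a rough Python sketch under that
assumption; the function name schema_location_pairs and the example
URIs are hypothetical.

```python
import re

def schema_location_pairs(value: str):
    """Split an xsi:schemaLocation value into (namespace, location) pairs.

    Hypothetical helper for illustration only.
    """
    # Tokens are separated by XML whitespace: space, tab, LF, CR.
    tokens = [t for t in re.split(r"[ \t\n\r]+", value) if t]
    if len(tokens) % 2 != 0:
        raise ValueError("odd number of tokens: %r" % tokens)
    # Pair up tokens: even positions are namespaces, odd are locations.
    return list(zip(tokens[0::2], tokens[1::2]))

# A namespace name without blanks pairs up as intended.
print(schema_location_pairs("http://example.com/ns ns.xsd"))

# A namespace name containing a blank is split into two tokens,
# so the pairing no longer works.
try:
    schema_location_pairs("http://example.com/my ns my.xsd")
except ValueError as err:
    print("cannot pair:", err)
```

With an even number of tokens the value would parse without error,
but the pairs would be the wrong ones, which is arguably worse.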

Michael

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************

Received on Saturday, 25 June 2016 20:10:37 UTC