RE: ACTION A-645-07: schema for serialization parameters

Responding to the other remaining issues below. In light of the follow-up discussion and the discussion in Bug 29708, I think we can be conclusive now. Bottom line:

1) I think we (now) both agree that the current production rules in XP31 allow whitespace in BracedURILiteral
2) The whitespace is collapsed (on the URI, not the BracedURILiteral as a whole)
3) no-namespace EQNames should be allowed for the predefined values html, xml etc (see below)
4) MSXML4 compatibility (it does not support character class subtraction in regexes) can be fixed by using the base-type approach or similar

Cheers,
Abel

> -----Original Message-----
> From: C. M. Sperberg-McQueen [mailto:cmsmcq@blackmesatech.com]
> Sent: Saturday, June 25, 2016 4:11 AM
> >
> > 0) xml:id
> > MSXML.NET: chokes on xml:id, so I removed that (should we include
> > that? Some WG's do in their schemas, I believe)
> 
> I think you are saying that MSXML.NET chokes on the xml:id attributes
> included in the test document I prepared.  Is that correct?

Yes, that is what I meant.

> If it does, then I think it's in error.  I'll spare everyone the details, because on
> closer examination I suspect you mean that MSXML.NET chokes on some but
> not all of the xml:id attributes; a copy/paste error led to the ID "json-i-v-EQ1"
> occurring on three elements.
> 
> Can you clarify? Thanks.

Yes, MSXML.NET chokes on xml:id (or anything in that namespace) when it is not defined in an XSD. For instance, the XT3 tests link to xml.xsd, which seems to have been created by the XML WG. It says:

"This schema document describes the XML namespace, in a form for import by other schema documents."

That text seems to come from https://www.w3.org/2001/xml.xsd; the current version is from 2009: http://www.w3.org/2009/01/xml.xsd. I'm not sure what the policy is here, but if we use such attributes we may want to include that schema.

> 
> Of course, we might wish not just to allow xml:id but to include the schema
> for the XML namespace; I'm agnostic on that.

Yes, me too, not sure what the policy is here.

> 
> >
> > 1) spaces in EQName URI
> > value="Q{ http://example.com/nss/foo }bar"
> 
> Why do you believe this is (or should be) invalid?  I read the grammar of
> XPath 3.1 as saying it should be valid.  The relevant productions are[1]

I stand corrected: spaces are allowed by the productions and the text.

> 
> I think allowing whitespace within the angle brackets is probably a mistake,
> since it's not allowed in URIs or IRIs, but I was trying to match the grammar,
> not improve it.

I raised Bug 29708, but the first responders seem to agree to stick to the status quo.

> 
> > 3) spaces within URI
> > value="Q{http://e xample.com/nss/foo}bar"
> 
> I believe (but have not checked within the last several years) that the current
> specs for URIs and IRIs do not allow whitespace within either.  So I agree that
> in principle this should probably be disallowed.
> 
> But it's currently allowed by the XPath spec, unless i have missed something,
> so I did not try to make the schema disallow it.

As above: spaces are allowed by the production, so yes, they should be allowed.
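
To make the two whitespace cases above concrete, here is a minimal Python sketch. It is not taken from the XPath spec or from the schema: the function name parse_uri_qualified_name and the regex are illustrative assumptions, the local name is only loosely checked (not against the full NCName production), and whitespace collapsing is approximated as "trim both ends, reduce inner runs of space, tab, CR or LF to one space".

```python
import re

# Illustrative pattern: "Q{", anything except curly braces, "}", then a local name.
# The local-name check is deliberately loose (no whitespace, colon or curlies).
_URI_QNAME = re.compile(r"Q\{([^{}]*)\}([^\s:{}]+)")

_XML_WS = " \t\r\n"


def parse_uri_qualified_name(value):
    """Split 'Q{uri}local' into (uri, local), collapsing whitespace in the URI only."""
    match = _URI_QNAME.fullmatch(value)
    if match is None:
        raise ValueError("not of the form Q{uri}local: %r" % value)
    raw_uri, local = match.groups()
    # Collapse: strip leading/trailing whitespace, reduce inner runs to one space.
    uri = " ".join(re.split(r"[ \t\r\n]+", raw_uri.strip(_XML_WS)))
    return uri, local
```

Under these assumptions, "Q{ http://example.com/nss/foo }bar" yields the URI without the surrounding spaces, while the inner space in "Q{http://e xample.com/nss/foo}bar" is kept as a single space, since the sketch only collapses whitespace and does not validate the URI.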

> >
> > 8) no-namespace EQName (in "method", this should only be variants of
> > "Q{}html", i.e. the allowed defaults) value="Q{}html"
> > LibXML: invalid*
> > MSXML4: invalid*
> > MSXML.NET: invalid*
> > Saxon-EE: invalid*
> 
> I do not believe the spec intends for this to be valid.  Perhaps I'm wrong; I will
> have to reread the text.
> 
> Perhaps it should.
> 
> If it should be valid, is this a small enough change to make at this point?  Or is
> it too late?

I think the spec intends this to be valid. We say, in Serialization 3.1:

"An expanded QName with a non-null namespace URI, 
or with a null namespace URI and a local name equal to 
one of xml, xhtml, html, text, json, or adaptive"

The value Q{}html is a "null namespace URI" with a local name of "html".

The section further links to section 2 Basics in XP31, which introduces the EQName production. Since we allow an EQName there, and we allow the fixed values in no namespace, I think that by extension we allow no-namespace EQNames, meaning that Q{}html ought to be valid.

Note that in <xsl:output> and the like in XSLT we also allow this for the method parameter, so I think it makes sense to allow it in the Serialization XSD. I don't think it is a change to the spec; I think it is only a fix for something that never made it into the XSD.
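
As a rough illustration of this reading of the quoted rule, here is a small Python sketch. The function name is_allowed_method is hypothetical, and treating the empty string as the "null namespace URI" is an assumption of the sketch, not something the spec text states in these terms.

```python
# Local names permitted with a null namespace URI, copied from the
# Serialization 3.1 sentence quoted above.
PREDEFINED_METHODS = {"xml", "xhtml", "html", "text", "json", "adaptive"}


def is_allowed_method(namespace_uri, local_name):
    """Apply the quoted rule to an already expanded QName."""
    if namespace_uri:
        # Non-null namespace URI: the quoted rule places no limit on the local name.
        return True
    # Null namespace URI (as in Q{}html, or a plain html): only the predefined names.
    return local_name in PREDEFINED_METHODS
```

With this reading, is_allowed_method("", "html") holds for the expanded form of Q{}html, while a no-namespace name outside the predefined list is rejected.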

> 
> >
> > Findings:
> > - MSXML4 chokes on subtracting regexes, i.e. "[\c-[:]]", fixing that
> > by adding a hierarchy resolve it for MSXML4
> 
> Others will know better than I; is MSXML 4 currently in wide use?

Yes, it is. Even many .NET applications still invoke MSXML4, and many JavaScript applications rely on the existence of MSXML4.

> 
> Actually, I suppose my instinct is to try to make the schema work with it,
> even if it's not known to be currently in wide use.  Your sketches show a
> reasonably simple way.

Yes, I agree.

> 
> > - The current expression "Q\{(.*)\}" can be made stricter to disallow
> > whitespace and curlies, or remove allowed whitespace by deriving by
> > restriction from a base type
> 
> Agreed as to the curly braces.  Unless we change the grammar of XPath, I am
> not persuaded as to the whitespace.

Me neither, not anymore. Sorry for the confusion on whitespace; I misread and misunderstood the spec.
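
For the curly-brace part, the difference between the current expression and a stricter one can be sketched in Python as follows. The STRICT pattern is only one possible tightening, not the one proposed for the schema, and re.fullmatch is used here to imitate the implicit anchoring of XSD patterns; both patterns cover just the braced part, not the local name.

```python
import re

LAX = re.compile(r"Q\{(.*)\}")         # the current expression quoted above
STRICT = re.compile(r"Q\{([^{}]*)\}")  # hypothetical variant: no curlies inside, whitespace still allowed


def braced_part_matches(pattern, value):
    """Return True if the whole value matches, as an XSD pattern facet would require."""
    return pattern.fullmatch(value) is not None
```

In this sketch LAX accepts "Q{a}b}" because ".*" can swallow the inner "}", whereas STRICT rejects it and still accepts "Q{ http://example.com/nss/foo }".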

> 
> > - the no-namespace EQNames that are allowed in method-type and
> > json-node-output-method-type should be added
> 
> I'll have to think about this; I see a certain logic to it, but I don't see that logic
> in the spec.  Unless I am mistaken, this would require textual changes to the
> serialization spec as well as to the schema.

See above. I don't think it is a change to the Serialization spec.

> <snip />
> 
> They look equivalent to me, but I haven't tried to prove it.

This relates to an incorrect regex I made, which was fixed in a later follow-up mail.

Received on Tuesday, 28 June 2016 11:48:00 UTC