RE: Comments on XPath Streamable Subset

Michael,

It would definitely be nice to have a common subset across many specifications. We did consider the XPath subset in XML Schema, but it was too limited and didn't cover all the use cases that we had. For example, it doesn't allow predicates, which are very commonly used to identify portions of the document, e.g. /book/chapter[3]. And the XSLT match patterns, as you mentioned, are not really streamable because they allow arbitrary predicates that can use backward axes.
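To illustrate why a positional predicate such as /book/chapter[3] is compatible with streaming, here is a rough sketch in Python using the standard library's event-based parser. The function name and the sample document are made up for illustration, and this is not how the profile specifies evaluation; it only shows that a depth counter and a running count are enough state to pick out the node in one forward pass.

```python
import io
import xml.etree.ElementTree as ET

def select_third_chapter(xml_bytes):
    """Evaluate /book/chapter[3] in a single forward pass (illustrative only)."""
    depth = 0            # nesting depth of the element currently open
    chapters_seen = 0    # number of <chapter> children of the root seen so far
    root_is_book = False
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            depth += 1
            if depth == 1:
                root_is_book = (elem.tag == "book")
            elif depth == 2 and root_is_book and elem.tag == "chapter":
                chapters_seen += 1
        else:
            # On "end" the element is complete; chapters at depth 2 cannot
            # overlap, so the running count is this chapter's position.
            if (depth == 2 and root_is_book and elem.tag == "chapter"
                    and chapters_seen == 3):
                return elem
            depth -= 1
    return None

doc = b"<book><chapter>a</chapter><chapter>b</chapter><chapter>c</chapter></book>"
match = select_third_chapter(doc)
print(match.text if match is not None else None)
```

Nothing in the loop ever needs to look back at an earlier part of the document, which is the property the profile is after.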

What we really need is an XPath profile that covers all the XPath expressions that one would normally learn in a 30-minute XPath tutorial, with the constraint that this profile should be streamable. There is nothing like that currently, so we had to invent one. But we could consider making this more generic and reusable by other specifications. One idea we briefly considered was having multiple concentric subsets, but that was getting too confusing. At one point we also investigated a common profile shared between XML Signature and WS-Transfer, but that didn't work out either.


Regarding your specific questions about the profile: the inclusion of ".." was an oversight. We intended to remove all "backward" axes like parent, ancestor, preceding and preceding-sibling, but keep all the forward axes: child, descendant, following and following-sibling. This is because streaming makes a single pass over the document in the forward direction only.
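As a rough illustration of the forward/backward split, the following Python sketch flags the backward axes in an expression. It is a naive textual scan, not an XPath parser (it would be fooled by "::" or ".." inside a string literal). The function name is hypothetical, and the exact membership of the two sets beyond the axes named above is my assumption, not something taken from the profile.

```python
import re

# XPath 1.0 axes grouped by direction relative to document order.
# Assumption: the "-or-self" variants, self, attribute and namespace are
# classified here for completeness; the profile may treat them differently.
FORWARD_AXES = {
    "child", "descendant", "descendant-or-self",
    "following", "following-sibling",
    "self", "attribute", "namespace",
}
BACKWARD_AXES = {
    "parent", "ancestor", "ancestor-or-self",
    "preceding", "preceding-sibling",
}

# Matches an explicit axis specifier such as "preceding-sibling::".
AXIS_RE = re.compile(r"([a-z-]+)\s*::")

def backward_axes_used(xpath):
    """Return the set of backward axes that appear in the expression."""
    used = {m.group(1) for m in AXIS_RE.finditer(xpath)} & BACKWARD_AXES
    # The abbreviated step ".." is shorthand for parent::node().
    if ".." in xpath:
        used.add("parent")
    return used

for expr in ("/book/chapter[3]",
             "/book/chapter/following-sibling::appendix",
             "/book/chapter/../title",
             "//item/preceding-sibling::item"):
    print(expr, sorted(backward_axes_used(expr)))
```

An expression for which the function returns an empty set uses only forward steps, at least as far as this simple scan can tell.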

Pratik

-----Original Message-----
From: Michael Kay [mailto:mike@saxonica.com] 
Sent: Tuesday, October 05, 2010 7:46 AM
To: public-xmlsec@w3.org; w3c-xsl-query@w3.org
Subject: Comments on XPath Streamable Subset

  These are comments on

http://www.w3.org/TR/2010/WD-xmldsig-xpath-20100831/

XML Signature Streaming Profile of XPath 1.0

As I'm sure you are aware, you are not the only people interested in 
streamed processing of XPath expressions. For example, XML Schema 1.0 
also defined a subset of XPath 1.0 for use in XSD integrity constraints, 
and the major factor influencing the design of this subset was 
streamability. (The other factor was perhaps perceived cost of 
implementation.)

In a sense, the XSLT "match pattern" syntax, another simple subset of 
XPath, was also designed with similar factors in mind, though it 
actually allows the use of unconstrained XPath expressions in predicates 
and is therefore not fully streamable.

The XSL WG has been working for the last couple of years on streaming 
facilities for XSLT, which of course includes streamable XPath 
expressions and streamable XSLT match patterns within the language.

Clearly there is a great deal of commonality in all these different 
efforts, but also a large number of arbitrary differences in the solutions.

I haven't fully understood the rationale behind some of the choices you 
have made, such as allowing the following and following-sibling axes 
while disallowing preceding and preceding-sibling; or why you allow the 
abbreviated step ".." but not the explicit use of the parent axis. 
Perhaps this indicates that you have a processing model in mind that 
hasn't been clearly articulated.

But the main substance of my comment (endorsed by the XQuery and XSL 
WGs) is not on the technical detail, but on the procedural question: W3C 
is producing far too many specifications that contain a variant, subset, 
or profile of XPath, and this cannot be in the interests of the user or 
the implementor, both of whom are typically working with many products 
and many specifications at the same time. So this is a plea for 
coordination.

Michael Kay

Received on Wednesday, 6 October 2010 01:19:18 UTC