RE: Comments on XPath 2.0 specification from Kay, Michael on 2002-01-22 (www-xpath-comments@w3.org from January to March 2002)

From: Kay, Michael <Michael.Kay@softwareag.com>
Date: Tue, 22 Jan 2002 14:56:49 +0100
To: "'Mike Schilling'" <mschilling@edgility.com>, www-xpath-comments@w3.org
Message-ID: <DFF2AC9E3583D511A21F0008C7E6210622B927@daemsg02.software-ag.de>

Thanks for making these comments. They will be considered by the working
groups, but before that happens, let me make some personal observations.

>
> While appendix D listing incompatibilities is extensive, it leaves out
> two of the most important:
> 
> 1. Requiring path elements which match keywords to be escaped.
> This is unacceptable, since it makes an unbounded set of 
> existing XPath expressions invalid.

Perhaps the spec here isn't clear enough. We actually went to great lengths
to ensure that XPath does not require any reserved words. In this respect
XPath differs from XQuery, which is a much richer language and does have
reserved keywords. The escaping convention (of preceding a QName with a ":")
is available in both languages so that you can always play safe by escaping
everything if you want to be sure that the expression will be valid in both
languages (I would only expect this to be done when the expressions are
software-generated).
> 
> 2. The introduction of the for statement
> This changes XPath from an expression-matching language to a 
> pseudo-procedural one.  It's quite unclear why "for" and "return" are 
> included, but not "if" and "while".

As you can imagine, there was a lot of debate about this (and there might
well be more, since there have been a number of well-argued comments on the
subject). You could regard the current position as a compromise: it includes
the subset of XQuery FLWR expressions that was felt to be essential to
make proper use of sequences in the data model (remember that elements and
attributes in XML Schema can be sequence-valued, and constructs are needed
for manipulating these sequences), without the full complexity of FLWR
expressions. Conditional expressions (if) are included, but
"while" isn't, because you can always rewrite a "while" as an "if".

I don't think that "for" and "if" expressions are pseudo-procedural, though
perhaps the choice of syntax makes them look that way. If we had used a more
mathematical notation, for example

  EXP1 as $e => EXP2
instead of
  for $e in EXP1 return EXP2
and
  (a ? b ! c)
instead of
  if (a) then b else c
then perhaps the impression of procedural semantics would have been avoided;
but it's a false impression either way. XPath remains an expression
language; it has simply been extended to handle expressions over sequences
as well as over booleans, numbers, and strings. 
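
To make the "expression, not statement" point concrete, here is a minimal
Python sketch of the two constructs as pure functions of their operands. The
names for_expr and if_expr are hypothetical and exist only for illustration,
and the flattening of results is an assumption about how a sequence-valued
"for" combines its per-item values into one flat sequence. It is an analogy,
not a model of any XPath processor.

```python
def for_expr(seq, body):
    """Rough analogue of: for $e in EXP1 return EXP2.

    `body` maps one item to a list of items. The per-item results are
    concatenated into a single flat list (assumed here to mirror flat
    sequences). Nothing is assigned or mutated; the whole construct
    simply evaluates to a value.
    """
    return [item for e in seq for item in body(e)]


def if_expr(cond, then_value, else_value):
    """Rough analogue of: if (a) then b else c, yielding a value."""
    return then_value if cond else else_value


# Hypothetical sample data: label each price using both constructs together.
prices = [148.95, 39.98]
labels = for_expr(prices, lambda p: [if_expr(p > 100, "high", "low")])
```

Because both helpers return values and have no side effects, they can be
nested inside one another freely, which is what distinguishes an expression
from a procedural statement.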

> It's also unclear how XPath 
> benefits from implementing half of the XQuery FLWR statement.  The 
> examples given for "for" in the spec are quite unconvincing, since they 
> describe the sort of transformation which is the province of XSLT and 
> XQuery, not XPath.

We could have implemented the equivalent of FLWR expressions in XSLT (in
fact there have been comments on the XSLT 2.0 draft suggesting we should
have done that). There are arguments both ways, and we are reviewing the
decision. The main advantage of doing it the way we have done is that it
maximizes the commonality between XSLT and XQuery, which we feel is in the
long-term interests of both implementors and users.
> 
> Note that every incompatibility introduced increases the likelihood 
> either that XPath will split into dialects or that XPath 2.0 
> will simply be rejected.

We are acutely aware of this risk. You should be aware that some of the
incompatibilities listed in the document are deliberate choices where we
felt that the gain exceeded the pain; others are accidents of the
specification that we only discovered at a fairly late stage and intend to
review.
> 
> II. Missing functionality.
> 
> Member-wise operations on sequences are both natural and extremely 
> useful.  Take the requirement
> 
> 	1. Given an XML document containing a purchase order and its
> 	line <item> elements, calculate the total amount of the
> 	purchase order by summing the price times the quantity of each
> 	item. The nodeset is identified by item, and the expression to
> 	sum would be price * quantity.
> 
> 
> taken from
> 
> 	section 2.5:  Should Support Aggregation Functions Over
> 	Collection-Valued Expressions
> 
> in
> 
> 	http://www.w3.org/TR/xpath20req#section-Requirements
> 
> A sample document fragment might be
> 
> <items>
>    <item partNum="872-AA">
>      <productName>Lawnmower</productName>
>      <quantity>1</quantity>
>      <USPrice>148.95</USPrice>
>      <comment>Confirm this is electric</comment>
>    </item>
>    <item partNum="926-AA">
>      <productName>Baby Monitor</productName>
>      <quantity>1</quantity>
>      <USPrice>39.98</USPrice>
>      <shipDate>1999-05-21</shipDate>
>    </item>
> </items>
> 	
> A natural way to express this, which does not require the for 
> statement, is
> 
> 	sum(items/quantity * items/USPrice)

There are many good reasons, I think, for not adopting the "dot product"
semantics you suggest. One of them is simply that it's incompatible with
XPath 1.0.
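
The incompatibility can be sketched in Python using the sample fragment
quoted above. In XPath 1.0, an arithmetic operand that is a node-set is
converted to a number from the string-value of its first node in document
order, so a product of two node-sets is a single number rather than a
member-wise sequence. The function names below are hypothetical, and the
per-item version is only an approximation of what a "for" expression such as
sum(for $i in items/item return $i/quantity * $i/USPrice) would compute.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Sample fragment from the quoted message.
DOC = """<items>
  <item partNum="872-AA">
    <productName>Lawnmower</productName>
    <quantity>1</quantity>
    <USPrice>148.95</USPrice>
    <comment>Confirm this is electric</comment>
  </item>
  <item partNum="926-AA">
    <productName>Baby Monitor</productName>
    <quantity>1</quantity>
    <USPrice>39.98</USPrice>
    <shipDate>1999-05-21</shipDate>
  </item>
</items>"""


def order_total(xml_text):
    """Per-item product, then sum: the intent of the requirement."""
    root = ET.fromstring(xml_text)
    return sum(
        Decimal(item.findtext("quantity")) * Decimal(item.findtext("USPrice"))
        for item in root.findall("item")
    )


def first_node_product(xml_text):
    """XPath 1.0 style: each node-set operand collapses to its first node."""
    root = ET.fromstring(xml_text)
    quantity = root.find("item/quantity")  # first match in document order
    price = root.find("item/USPrice")      # first match in document order
    return Decimal(quantity.text) * Decimal(price.text)


total = order_total(DOC)
first_only = first_node_product(DOC)
```

The two results differ as soon as there is more than one item, which is why
giving "*" member-wise semantics would change the meaning of expressions that
are already valid in XPath 1.0.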

Mike Kay
Received on Tuesday, 22 January 2002 08:56:55 UTC