- From: Mike Schilling <mschilling@edgility.com>
- Date: Mon, 21 Jan 2002 17:10:46 -0500 (EST)
- To: www-xpath-comments@w3.org, www-xquery-comments@w3.org
The following message is a courtesy copy of an article that has been posted to edgility.eng.xml as well.

I am writing as someone whose software uses XPath 1.0 as a syntax for recognizing and extracting information in XML documents.

I. Incompatibilities

Incompatibilities introduced in XPath 2.0 will cause difficulties for us, in all of the obvious areas:

    Existing XPath strings no longer working correctly.
    Need to retrain customers in XPath 2.0 concepts where they differ from XPath 1.0 concepts.
    Fragility of XPath implementations where they attempt to retain 1.0 compatibility in problematic ways.

While Appendix D, which lists incompatibilities, is extensive, it leaves out two of the most important:

1. Requiring path elements which match keywords to be escaped.

This is unacceptable, since it makes an unbounded set of existing XPath expressions invalid. It is also unacceptable since it leaves open the possibility that, as future XPath versions introduce new keywords, yet more XPath expressions will become invalid. In fact, it amounts to a requirement that, for safety, every name in every path expression be escaped. This is a fundamental change to XPath syntax.

If keywords are required, a much better choice is to make them distinguishable from path elements, i.e. forbid them from matching the XML Name production. Requiring them to begin with a colon (like the proposed, unacceptable syntax for path element escaping) is one possibility.

2. The introduction of the for statement

This changes XPath from an expression-matching language to a pseudo-procedural one. It's quite unclear why "for" and "return" are included, but not "if" and "while". It's also unclear how XPath benefits from implementing half of the XQuery FLWR statement. The examples given for "for" in the spec are quite unconvincing, since they describe the sort of transformation which is the province of XSLT and XQuery, not XPath.
This will make user training in XPath far more difficult, since it breaks the existing user view of XPath as a pure pattern-matching language.

Note that every incompatibility introduced increases the likelihood either that XPath will split into dialects or that XPath 2.0 will simply be rejected. The history of the SQL 3.0 standard is a lesson about the limited ability of a standards effort to make fundamental changes to an existing language.

II. Missing functionality

Member-wise operations on sequences are both natural and extremely useful. Take the requirement

    1. Given an XML document containing a purchase order and its line <item> elements, calculate the total amount of the purchase order by summing the price times the quantity of each item. The nodeset is identified by item, and the expression to sum would be price * quantity.

taken from section 2.5, "Should Support Aggregation Functions Over Collection-Valued Expressions", in http://www.w3.org/TR/xpath20req#section-Requirements

A sample document fragment might be

    <items>
      <item partNum="872-AA">
        <productName>Lawnmower</productName>
        <quantity>1</quantity>
        <USPrice>148.95</USPrice>
        <comment>Confirm this is electric</comment>
      </item>
      <item partNum="926-AA">
        <productName>Baby Monitor</productName>
        <quantity>1</quantity>
        <USPrice>39.98</USPrice>
        <shipDate>1999-05-21</shipDate>
      </item>
    </items>

A natural way to express this, which does not require the for statement, is

    sum(items/item/quantity * items/item/USPrice)

In fact, when my co-workers and I were first learning XPath, we had to read the spec carefully to convince ourselves that this wasn't correct. (The restriction is understandable in XPath 1.0, where nodesets cannot be freely constructed. It is not understandable in 2.0, where sequence construction is explicitly supported.)

It's clear that the current XPath definition of

    items/item/quantity * items/item/USPrice

namely the product of the first node of each nodeset, is useless.
A good XPath expression construction tool will warn the user that it almost certainly is not what is desired. But why redefine it as an error (as XPath 2.0 does) when it has an obvious and useful meaning?

The semantics are simple: an n-ary operation on n sequences is allowed if all sequences have the same length m, and is interpreted as being done memberwise, resulting in a sequence of m members. Type exceptions are generated as if the operations were performed in order, from the first member to the m'th member.
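
As a rough illustration of the member-wise semantics proposed above, here is a minimal sketch in Python rather than XPath. It is not part of any XPath specification or implementation: the helper name memberwise and the choice of ValueError for a length mismatch are assumptions made for this example, and the quantity and price values are copied by hand from the sample fragment.

```python
from operator import mul


def memberwise(op, *sequences):
    """Apply an n-ary operation member by member to n equal-length sequences.

    Hypothetical helper: the name and the error type are illustrative only.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        # The proposal only allows the operation when every sequence has length m.
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    # zip walks from the first member to the m'th, so a type error is raised
    # at the earliest position where the operation fails.
    return [op(*members) for members in zip(*sequences)]


# Values copied from the two <item> elements in the sample fragment.
quantities = [1, 1]
us_prices = [148.95, 39.98]

line_totals = memberwise(mul, quantities, us_prices)  # one product per item
order_total = sum(line_totals)                        # purchase order total
```

Under these assumptions, order_total plays the role of the sum(...) expression given earlier: the multiplication yields a sequence of per-item products, and the aggregate is then taken over that sequence. Sequences of unequal length are rejected outright rather than truncated or padded.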
Received on Monday, 21 January 2002 19:58:52 UTC