- From: Don Chamberlin <chamberl@almaden.ibm.com>
- Date: Mon, 19 Jan 2004 18:27:23 -0800
- To: public-qt-comments@w3.org
- Cc: Andrew Eisenberg <andrew.eisenberg@us.ibm.com>
- Message-ID: <OFB41358D9.75D438BE-ON88256E21.000D0869-88256E21.000D8617@us.ibm.com>
This Last Call comment proposes a new XPath/XQuery function, possibly named fn:atom(). The result of fn:atom() is the same as that of fn:data(), except in those cases where fn:data() would construct a typed value by concatenating the string values of descendant elements. In these cases, fn:atom() returns an empty sequence. The definition of fn:atom() is as follows: fn:atom($arg as item()*) as xdt:anyAtomicType* The result of fn:atom is the sequence of atomic values produced by applying the following rules to each item in $arg: 1. If the item is an atomic value, it is returned. 2. If the item is a document node, the result is (). 3. If the item is an attribute, text, comment, processing instruction, or namespace node, the typed value of the node is returned (same as fn:data). 4. If the item is an element node (regardless of type annotation) that has no child element node, the typed value of the node is returned. 5. If the item is an element node (regardless of type annotation) that has a child element node, an empty sequence is returned. The advantages of the proposed function are as follows: (a) Suppose that $node1 and $node2 are element nodes with a type annotation that permits mixed content. Suppose that $node1 is bound to <node><a>1</a><b>2</b></node>, and $node2 is bound to <node><b>1</b><a>2</a></node>. Then data($node1) = data($node2) is true, because the typed value returned by the data function ignores nested markup. This is a undesirable result in certain applications. The atom() function would enable a user to avoid this anomaly by writing atom($node1) = atom($node2). In this case both atom() functions would return () and the result of the predicate would be false. (b) Suppose $sal is bound to the following untyped node: <salary><base>17</base><bonus>25</bonus></salary>. Consider the predicate $sal > 300. In processing this query, the data function will be implicitly invoked and will return the untyped atomic value "1725". Comparing this value with 300 will implicitly cast it to the integer 1725. The value of the predicate is true. This anomaly could be avoided by using the atom function, as follows: atom($sal) > 300. This revised predicate is false. (c) The fn:atom function can be used to search for elements that have a given atomic value without incurring the cost of concatenating potentially huge volumes of data. For example, consider the query: /book[title = "War and Peace"]//*[atom(.) = "Leo Tolstoy"] This query can quickly find leaf nodes with the atomic content "Leo Tolstoy" without computing the string-values of all the intermediate nodes in the tree, some of which are very large (the topmost element contains the whole book!) We believe that this function is sufficiently important to justify including it in the standard function library in order to make applications portable. Cheers, --Don Chamberlin
Received on Monday, 19 January 2004 21:27:30 UTC