- From: Paul Duffin <pduffin@volantis.com>
- Date: Fri, 24 Oct 2003 10:43:27 +0100
- To: "Kay, Michael" <Michael.Kay@softwareag.com>
- Cc: public-qt-comments@w3.org
Thank you very much for the time you are taking to explain this to me, I do appreciate it but am finding it quite hard to get my head around this subject. Kay, Michael wrote: > > Is there an > > overview document that explains what all the different specifications > > are for and is a good place to start learning about it. At > > the moment I > > am blindly jumping between them looking for the bits and > > pieces that I need. > > The XPath language spec is the best place to start. Read that until you > find something you don't understand, then look in the other documents to > see if they help. > That is what I started to do, however in many places, such as fn:string function it is not clear where the missing (how it handles sequences) information is. I know that it was not written to be dipped into, however it would be nice if the functions paper had a few more pointers to the other papers. I tried to track it down by looking at the Notations section to see what item? meant, however while that section does have a reference to the Formal Semantics paper that is to do with return types, not how function calls are handled so I missed it. > > > > So let me get this straight (ignoring XPath 1.0 compatability mode). > > > > o An empty sequence results in an empty string. > > o A sequence with one item is the same as passing in a single item. > > o A sequence with more than one item is an error. > > > > That seems very inconsistent to me. Why does fn:string not take a > > sequence ? It could return a string consisting of the > > concatenation of > > the results of applying fn:string to each item in the sequence with > > possibly a space separator between them, although the > > separator could be > > supplied as an optional second parameter. > > That specification might feel like a good one to you, but it would be > completely incompatible with XPath 1.0. The fact that compatibility mode > is off doesn't mean we can throw backwards compatibility out of the > window: we have to make it feasible for people to move forwards. > Why is my suggestion any worse than the existing behaviour in non compatability mode ? All that I am doing is changing an operation that would throw an exception into one that does something reasonably useful. > > > > How do I get the string representation of a sequence ? > > Is there another function to do this ? > > Yes, the string-join function is there for this purpose. That is exactly what I want. > > > > On a similar note, the specification for fn:boolean says ".... If > > $srcval is an atomic value ....". However, fn:boolean takes a > > sequence > > so I presume that $srcval is an 'atomic value' if the > > sequence contains > > a single item. However, this behaviour does not appear to > > have any real > > meaning when applied to a sequence. > > I'm not sure what you mean by "real meaning". > Useful. Take its use within XSLT, I can do the following to check that the node set referred to by $var is empty. <xsl:if test="$var">....</xsl:if> If however I wanted to write a template that could handle sequences and nodes then I could not use the above pattern as if I am given a sequence of plain types then the above will either return true if it has more than one item, return true or false if it is one item that can be converted to a boolean, throw an exception if it has one item that can not be converted to a boolean, or return false if it is empty. That behaviour will of course not break any existing XSLTs (assuming that other functions or operations have not been changed to return sequences of atomic values as opposed to sequences of nodes). However, the fact that the above pattern which is a very commonly used one has this strange behaviour will cause problems. Of course, as you explained there is an empty() function that should be used instead but people will still use the above pattern because it is easier and works in most cases. As they move to take advantage of sequences in XPath 2.0 they will get bitten by this problem. Compatability is more than just the way that functions work, it also involves the patterns and idioms as well. The fact that empty() does not exist in XPath 1.0 forces me to use the above pattern > > e.g. I often want to be able to tell whether a sequence is empty. In > > XPath 1.0 if I gave it a node set then it would return false > > if it was > > empty and true otherwise. This is also how it behaves in XPath 2.0. > > However, if I give it a sequence of atomic types (as opposed to a > > sequence of nodes) then it does not tell me whether the sequence is > > empty as a non empty sequence containing a single atomic value could > > evaluate to false. > > Correct. In XPath 2.0 the values that return false when supplied as > arguments to the boolean function are exactly the same as the values > that returned false in XPath 1.0: an empty node-set, the boolean false, > a numeric zero, and the string "". > > If you want to test whether a sequence is empty, don't use the boolean() > function, use the empty() function. But a node-set is just a sequence in XPath 2.0 and fn:boolean works for that. > > > > I understand that this is a consequence of treating a sequence > > containing a single item and that item the same. However, is > > that really > > a good idea given the strange behaviour that it causes in relatively > > simple functions such as these two. > > I think the strangeness in these two functions is because they were both > carried forward from XPath 1.0 which had a different data model. This Actually I think that any function that can only take a single item will have similar strange behaviour. This would not be a problem if I was using that function explicitly but I do not have any real control over whether it is used or not. > doesn't make the data model wrong. The reason that a single item is the > same as a sequence of length one in the XPath model is because that's > the way it is in XML Schema. > I don't understand what you mean in the last sentence about XML Schema, could you explain that a bit more as I would like to understand the benefits of doing this as at the moment I can only see disadvantages. To me the sequence in XML Schema and sequence in XPath are not the same although they are similar. I could use XPath to retrieve the child nodes of an element as a sequence of nodes irrespective of whether the XML Schema for that document defines the content of that element as a sequence, choice, or combination of both. Just because a node could be treated as a sequence of one node does not mean that it is. The definition of a sequence says that it cannot contain other sequences and yet as it contains individual items and those items are identical to a sequence containing just that one item then in effect a sequence can contain sequences. In fact to paraphrase it is sequences all the way down. I think that it would be much more logical if sequences were distinct from singleton items. I don't know what all the consequences would be of doing this are. You would of course need a way (function) to extract items from sequences and create a sequence from a single item but I am sure that there would be more. This could also allow better static type checking as it would not be possible to pass a sequence to a function that can only take an item. At the moment that check has to be done at runtime. Operators and functions that work on sequences and want to work on items could be easily overloaded to work on a single item (convert that item into a sequence and then call the sequence version.) The effect on fn:boolean would be it could distinguish between a sequence and a single item and return true if the sequence was not empty irrespective of what was inside it and false if it was not. If it was an item then it could cast it to an xs:boolean. As far as I can see this model will not introduce any backwards compatability issues of itself as long as the appropriate functions and operators were updated
Received on Friday, 24 October 2003 05:44:33 UTC