RE: Behaviour of fn:string on sequences

> That is what I started to do, however in many places, such as the
> fn:string function, it is not clear where the missing information (how
> it handles sequences) is. I know that it was not written to be dipped
> into, however it would be nice if the functions paper had a few more
> pointers to the other papers. I tried to track it down by looking at the
> Notations section to see what item? meant, however while that section
> does have a reference to the Formal Semantics paper, that is to do with
> return types, not how function calls are handled, so I missed it.

I sympathize with your difficulties. When I started implementing Saxon, I
had to work from the raw specs, and it was difficult then although the specs
were much smaller.
> >
> > That specification might feel like a good one to you, but it would be
> > completely incompatible with XPath 1.0. The fact that compatibility mode
> > is off doesn't mean we can throw backwards compatibility out of the
> > window: we have to make it feasible for people to move forwards.
> >
> > 
> 
> Why is my suggestion any worse than the existing behaviour in
> non-compatibility mode? All that I am doing is changing an operation that
> would throw an exception into one that does something reasonably useful.

From a compatibility point of view, throwing an error for something that
delivered a result in XPath 1.0 is much more acceptable than returning a
result that is completely different from the XPath 1.0 result. If we throw
an error, the user can see that he has to make a change and where the change
is needed. If we deliver different results, then the data published on the
user's web site will contain the wrong numbers, and no-one knows why.
> 
> Take its use within XSLT: I can do the following to check whether the
> node set referred to by $var is empty.
>      <xsl:if test="$var">....</xsl:if>
>
> If however I wanted to write a template that could handle sequences and
> nodes, then I could not use the above pattern: if I am given a sequence
> of plain types, then the above will either return true if it has more
> than one item, return true or false if it is one item that can be
> converted to a boolean, throw an exception if it has one item that
> cannot be converted to a boolean, or return false if it is empty.
>
> That behaviour will of course not break any existing XSLTs (assuming
> that other functions or operations have not been changed to return
> sequences of atomic values as opposed to sequences of nodes). However,
> the fact that the above pattern, which is a very commonly used one, has
> this strange behaviour will cause problems.

Yes, people may fall into this trap. But I can't see an alternative design
that would have worked better. They will just have to learn to use empty().
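
As a rough sketch of what I mean (assuming an XSLT 2.0 processor, and with a
stylesheet parameter named var standing in for the $var of your example), the
test can be written so that it asks only about the length of the sequence:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter standing in for $var; it may be bound to
       nodes, to atomic values, or to nothing at all. -->
  <xsl:param name="var" select="()"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- empty() is true only for a sequence of length zero, so the test
           does not depend on how the items would convert to a boolean. -->
      <xsl:when test="empty($var)">no items supplied</xsl:when>
      <xsl:otherwise>one or more items supplied</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Writing not(empty($var)) gives the converse test, again without relying on
the conversion of the items to boolean.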

> 
> Actually I think that any function that can only take a single item will
> have similar strange behaviour.

If the function signature says that the function takes a single item, and
you give it a sequence, then you get an error, except in compatibility mode,
when the sequence is truncated. The issue with the string() and boolean()
functions is that they are heavily overloaded and that we had to define them
in a way that was backwards compatible.
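
To illustrate with a function whose signature takes a single item, here is a
sketch using string-length(). The parameter name is invented for the example,
and I am assuming that the version attribute on the stylesheet is what
switches compatibility mode on or off:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter holding a sequence of two strings. -->
  <xsl:param name="names" select="('alpha', 'beta')"/>

  <xsl:template match="/">
    <!-- Selecting the first item explicitly is safe in either mode:
         the function receives exactly one string. -->
    <xsl:value-of select="string-length($names[1])"/>

    <!-- By contrast, string-length($names) passes two items to a
         function that accepts at most one. With version="2.0" that is
         an error; with version="1.0" (compatibility mode) the sequence
         is truncated to its first item, as in XPath 1.0. -->
  </xsl:template>

</xsl:stylesheet>
```

Writing the [1] yourself makes the truncation visible in the stylesheet
rather than leaving it to compatibility mode.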

> > The reason that a single item is the same as a sequence of length one
> > in the XPath model is because that's the way it is in XML Schema.
> >
>
> I don't understand what you mean in the last sentence about XML Schema.
> Could you explain that a bit more? I would like to understand the
> benefits of doing this, as at the moment I can only see disadvantages.

In schema, if you define a simple type to contain a single integer, this is
exactly the same as saying it takes a sequence of integers with minOccurs=1
and maxOccurs=1. We don't want to treat an attribute that can only ever hold
a single integer differently from an attribute that happens to contain a
single integer but could also contain zero integers or two integers. So we
have reflected this in the XPath model: an integer is the same thing as a
sequence of one integer. Similarly, the expression
child::BOOK returns a sequence of nodes; we don't want to treat the case
where it returns a single book specially.
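
A small sketch of what this means in practice, assuming an XSLT 2.0 processor
and a source document whose outermost element may or may not be a BOOK (the
stylesheet itself is purely illustrative):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- A bare integer is already a sequence of length one,
         so this evaluates to 1. -->
    <xsl:value-of select="count(5)"/>

    <!-- Parentheses add no extra level of structure:
         (5) and 5 are the same value, so this is true. -->
    <xsl:value-of select="(5) eq 5"/>

    <!-- The same expression is used whether child::BOOK selects
         zero, one, or many nodes; only the count differs. -->
    <xsl:value-of select="count(child::BOOK)"/>
  </xsl:template>

</xsl:stylesheet>
```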

> To me the sequence in XML Schema and sequence in XPath are not the same 
> although they are similar.

That's true.
> 
> I think that it would be much more logical if sequences were distinct
> from singleton items. I don't know what all the consequences of doing
> this would be.

It would be a different language, I'm afraid. Until someone designs that
language, I won't judge how it compares. But we've spent three years
designing this language and I'm not going to start designing a different one
in five minutes; I will leave that pleasure to someone else.

Michael Kay

Received on Friday, 24 October 2003 12:39:10 UTC