Re: Behaviour of fn:string on sequences

Kay, Michael wrote:
>  From a compatibility point of view, throwing an error for something 
> that delivered a result in XPath 1.0 is much more acceptable than 
> returning a result that is completely different from the XPath 1.0 
> result. If we throw an error, the user can see that he has to make a 
> change and where the change is needed. If we deliver different results, 
> then the data published on the user's web site will contain the wrong 
> numbers, and no-one knows why.
> 

Good point.

>  >
[snip]
> 
> Yes, people may fall into this trap. But I can't see an alternative 
> design that would have worked better. They will just have to learn to 
> use empty().
> 

Distinguishing between a single item and a sequence containing a single 
item would address this weird behaviour. However, it would almost 
certainly cause problems in other areas.

>  >
[snip]
> 
> In schema, if you define a simple type to contain a single integer, this 
> is exactly the same as saying it takes a sequence of integers with 
> minOccurs=1 and maxOccurs=1.

I agree that the set of values they can take is the same, but that does 
not necessarily mean that the types *have* to be equivalent. There are 
many cases where I can constrain one type so that it supports the same 
set of values as another, but that does not mean that the types are 
equivalent.

I accept that for convenience's sake it might be better to make your 
example types equivalent, but that equivalence should not be built into 
sequences.

The following is just me working through how accessing an attribute works.

Assume that we have two attributes, foo and bar, where foo is of type 
xs:integer and bar is a sequence of integers constrained to contain only 
one item. You want the following two expressions to be equivalent.

     @foo
     @bar

As I understand it the above are equivalent to
     attribute::foo
     attribute::bar

which both return a node set containing the attribute node.

Applying fn:data to either node set would give a sequence containing a 
single item of type xs:integer. As sequences cannot be nested, it is 
impossible to distinguish between the values of foo and bar.

So, in order to distinguish between sequences and items as I would like, 
it would also be necessary to allow sequences to be nested.
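
To make the point concrete, here is a rough sketch in Python of how I 
understand the flat sequence model. It is only a toy: tuples stand in for 
XPath sequences, and the names seq and same_value are my own inventions, 
not anything from the specifications.

```python
# Toy model of a flat XPath 2.0 sequence. This is a sketch under my own
# assumptions, not an XPath implementation: Python tuples stand in for
# sequences, and seq/same_value are hypothetical helper names.

def seq(*items):
    """Build a sequence, splicing in any nested sequences."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(seq(*item))   # a nested sequence loses its boundary
        else:
            flat.append(item)
    return tuple(flat)

def same_value(a, b):
    """Compare two values, treating an item as a sequence of one item."""
    return seq(a) == seq(b)

foo = 42          # typed value of @foo: a single xs:integer
bar = seq(42)     # typed value of @bar: a sequence constrained to one item

print(seq(1, seq(2, 3)))      # -> (1, 2, 3): the nesting is gone
print(same_value(foo, bar))   # -> True: foo and bar cannot be told apart
```

In this model the only way to tell foo from bar would be to stop seq from 
splicing nested sequences, which is exactly the nesting I am asking about.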

Question: Why is it not possible to nest sequences?

> We don't want to treat differently an
> attribute containing a single integer that can only ever hold a single 
> integer, from an attribute containing a single integer that could also 
> contain zero integers or two integers.

Why not? When I am processing an attribute that could contain a sequence 
of multiple values I have to be prepared to cope with there being more 
than one item in that sequence. In the current situation I could easily 
write XPath expressions that do what I want when given a sequence with 
one item but would fail when given a sequence containing two. This will 
have a detrimental effect on the robustness of applications containing 
XPath expressions.
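
As a rough illustration of the failure mode (again in Python, with 
hypothetical function names standing in for an expression such as 
"@bar + 1", and lists standing in for the atomized attribute value):

```python
# Hypothetical stand-ins for an XPath expression applied to an attribute
# value. The function names are illustrative only and are not part of any
# XPath API; lists model the sequence of integers in the attribute.

def plus_one(values):
    """Add 1 to a sequence that is assumed to hold exactly one integer."""
    if len(values) != 1:
        raise TypeError(f"expected exactly one item, got {len(values)}")
    return values[0] + 1

def plus_one_each(values):
    """Cardinality-safe variant: copes with zero, one or many items."""
    return [v + 1 for v in values]

print(plus_one([5]))            # works while the attribute holds one integer
print(plus_one_each([5, 6]))    # still works when a second integer appears

try:
    plus_one([5, 6])            # the same expression, given two integers
except TypeError as err:
    print("failed:", err)
```

The first form is the one that is easy to write by accident, because it 
passes every test in which the attribute happens to hold one value.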

It is similar to shell script quoting. Because it is only necessary to 
quote arguments if they contain spaces, and most arguments to shell 
scripts do not contain spaces, there are thousands if not millions of 
shell scripts that will fail when given arguments containing spaces.
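
The quoting trap can be sketched with Python's shlex module, which 
approximates POSIX shell word splitting. The two build_command functions 
and the file names are made up for the example.

```python
import shlex

# Hypothetical command builders; "cat" and the file names are examples only.

def build_command_unquoted(filename):
    # Naive: silently relies on the argument never containing a space.
    return "cat " + filename

def build_command_quoted(filename):
    # shlex.quote wraps the argument so that a POSIX shell sees one word.
    return "cat " + shlex.quote(filename)

for name in ("report.txt", "my report.txt"):
    # shlex.split approximates how a POSIX shell would split the line.
    naive = shlex.split(build_command_unquoted(name))
    safe = shlex.split(build_command_quoted(name))
    print(repr(name), "naive:", naive, "quoted:", safe)
```

Both versions agree for "report.txt"; only the second argument exposes 
the difference, which is why the unquoted version survives so long.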

I understand that an XPath 2.0 application will behave differently if 
the DOM it is being applied against has been validated against a schema 
and hence has type information (PSVI) compared to when it has not been 
validated and hence does not have PSVI.

> So we have reflected this in the 
> XPath model: an integer is the same thing as a sequence of one integer. 
> Similarly, the expression child::BOOK returns a sequence of nodes; we 
> don't want to treat the case where it returns a single book specially.
> 

As you say, child::BOOK returns a sequence of nodes; if only one node 
matches then the sequence contains one node. Predicates create a 
sequence of nodes from another sequence of nodes, so even [1] will 
return a sequence.

The treatment of sequences of nodes, at least by fn:boolean, is 
consistent; the problem arises with sequences of atomic items.

>  > To me the sequence in XML Schema and sequence in XPath are not the same
>  > although they are similar.
> 
> That's true.

So why are you forcing XPath sequences to have the same semantics as XML 
Schema sequences?

>  >
>  > I think that it would be much more logical if sequences were distinct
>  > from singleton items. I don't know what all the consequences
>  > would be of
>  > doing this are.
> 
> It would be be a different language, I'm afraid. Until someone designs 
> that language, I won't judge how it compares. But we've spent three 
> years designing this language and I'm not going to start designing a 
> different one in five minutes, I will leave that pleasure to someone else.
> 

I am trying to use XPath 2.0 as a standalone expression language that 
has nothing to do with either XQuery or XSLT. The reason I am doing 
this is that it is a standard and is being used in other W3C standards, 
such as XForms. The fact that sequences are treated in this way does 
make it much harder to use.

Question: The XPath 2.0 specification seems to be driven by XQuery and 
possibly XSLT; what other W3C WGs are involved in it? XForms? DIWG?

Received on Friday, 31 October 2003 07:54:41 UTC