- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Tue, 22 May 2007 22:00:33 +0100
- To: public-xml-processing-model-wg@w3.org
Norman Walsh wrote:
> Position is always 1 in for-each and viewport because they're odd cases.
> Consider this example instead:
>
> <p:matching-documents>
>   <p:input port="source">
>     <p:inline>
>       <odd/>
>     </p:inline>
>     <p:inline>
>       <even/>
>     </p:inline>
>     <p:inline>
>       <odd/>
>     </p:inline>
>     <p:inline>
>       <even/>
>     </p:inline>
>     <p:inline>
>       <odd/>
>     </p:inline>
>   </p:input>
>   <p:option name="test" value="$p:position mod 2 = 1"/>
> </p:matching-documents>
>
> It returns all the "<odd/>" documents.
>
> I think this example also illustrates why it would be a mistake to
> overload XPath's built-in position() function. A complex expression
> might use position() in a more natural context and it would be confusing
> (though I grant not explicitly illegal) to have position() used in two
> different ways in the same expression.
I really don't understand this example. You seem to be suggesting that
there's some implicit iteration happening, that the component is passed:
* a single document on the 'source' port (with the document element
<odd>)
* the option 'test' with the value '$p:position mod 2 = 1'
* a set of variable bindings that includes the binding of $p:position
to 1
then,
* a single document on the 'source' port (with the document element
<even>)
* the option 'test' with the value '$p:position mod 2 = 1'
* a set of variable bindings that includes the binding of $p:position
to 2
and so on.
In this explanation, the pipeline processor is in charge of the
iteration, supplies a different value of $p:position each time, and
therefore you only get the <odd> elements.
I thought that we'd decided not to allow implicit iteration: (a) because
we run into huge problems if a step has more than one port that accepts
sequences and (b) because it means you can't have steps that do useful
things like count the number of documents passed on a port.
So assuming that there *isn't* implicit iteration, this is what I think
happens: the p:matching-documents component gets passed:
* a sequence of documents on the 'source' port (with the document
elements <odd> and <even>)
* the option 'test' with the value '$p:position mod 2 = 1'
* a set of variable bindings, which presumably includes the binding
of $p:position to some number, let's say 1.
The component iterates through the documents on the 'source' port, and
for each one evaluates the XPath "$p:position mod 2 = 1". Since the
value of $p:position is bound to 1, this is always true. Therefore the
component returns all the documents on the 'source' port.
I think that p:matching-documents should say:
The XPath expression supplied as the value of the 'test' option is
evaluated for each document supplied on the 'source' port, with the
following context:
* the context node is the document itself
* the context position is the position of the document in the sequence
supplied to the 'source' port
* the context size is the number of documents in the sequence supplied
to the 'source' port
* the variable bindings are those supplied to the step
* the function library is the core XPath function library
* the namespace declarations are those supplied to the step
and that using
<p:option name="test" value="position() mod 2 = 1" />
will work just fine.
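As a minimal sketch of that wording (in Python, not XProc, and not a real XPath evaluator), the step itself supplies the context position and size for each document. The names Context and matching_documents are hypothetical, and the test is passed as a Python callable standing in for "position() mod 2 = 1".

```python
from typing import Callable, List, NamedTuple


class Context(NamedTuple):
    """Hypothetical model of the XPath evaluation context for one document."""
    node: str      # the document itself (modelled as a string)
    position: int  # position of the document in the 'source' sequence
    size: int      # number of documents in the 'source' sequence


def matching_documents(docs: List[str],
                       test: Callable[[Context], bool]) -> List[str]:
    """Evaluate the test once per document, with the step (not the
    pipeline processor) setting the context position and size."""
    size = len(docs)
    return [doc
            for position, doc in enumerate(docs, start=1)
            if test(Context(doc, position, size))]


DOCS = ["odd", "even", "odd", "even", "odd"]

# Stand-in for value="position() mod 2 = 1".
print(matching_documents(DOCS, lambda ctx: ctx.position % 2 == 1))
```

Note that computing size up front requires the whole sequence to be available, which is exactly the concern about last() raised next.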
The only issue that I can see is:
> I still think it would be wrong. Not only for the reason I gave just
> above but also because it might encourage people to believe that they
> could use last() and the streaming folks explicitly don't want to have
> to support that.
To get around this, the p:matching-documents step *could* say:
* the context size is the same as the context position
This would mean that users could still use last() but not in any
meaningful way.
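A small sketch of what that workaround would look like, again in Python with made-up names: the step consumes documents one at a time and simply reports the context size as equal to the current position, so it never needs to know how many documents are still to come.

```python
from typing import Callable, Iterable, Iterator


def matching_documents_streaming(
        docs: Iterable[str],
        test: Callable[[int, int], bool]) -> Iterator[str]:
    """Yield matching documents as they arrive. The test receives
    (position, size); size is defined to equal position, as in the
    workaround above, so no look-ahead is needed."""
    for position, doc in enumerate(docs, start=1):
        size = position  # assumption: context size == context position
        if test(position, size):
            yield doc


docs = iter(["odd", "even", "odd", "even", "odd"])

# Stand-in for "position() = last()": true for every document,
# which is why last() is usable but not meaningful here.
print(list(matching_documents_streaming(docs, lambda pos, size: pos == size)))
```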
But the streaming folks are going to have to do some analysis of XPath
expressions anyway, since not all of them are going to be streamable,
and one of the tests could be whether the expression contains the last()
function or not. If it doesn't, as in the above example, streaming is
still a possibility; if it does, then the step can't be streamed.
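For what it's worth, a very naive version of that check might look like the sketch below. It is only a textual approximation under stated assumptions: it handles XPath 1.0 string literals (which have no escapes) and looks for an unprefixed call to last(); a real implementation would work on the parsed expression, and the function name uses_last is hypothetical.

```python
import re

# XPath 1.0 string literals: "..." or '...', with no escape sequences.
_STRING_LITERAL = re.compile(r"\"[^\"]*\"|'[^']*'")

# An unprefixed call to last(), allowing whitespace before the parenthesis
# and rejecting names such as my:last( or x-last(.
_LAST_CALL = re.compile(r"(?<![\w:.-])last\s*\(")


def uses_last(xpath: str) -> bool:
    """Return True if the expression appears to call last()."""
    without_strings = _STRING_LITERAL.sub("''", xpath)
    return _LAST_CALL.search(without_strings) is not None


for expr in ("position() mod 2 = 1", "position() = last()"):
    verdict = "cannot stream" if uses_last(expr) else "may still stream"
    print(f"{expr!r}: {verdict}")
```

Absence of last() only keeps streaming possible; other parts of an expression could still rule it out.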
Cheers,
Jeni
--
Jeni Tennison
http://www.jenitennison.com
Received on Tuesday, 22 May 2007 21:00:37 UTC