Re: New static error: options in the XProc namespace

Norman Walsh wrote:
> Position is always 1 in for-each and viewport because they're odd cases.
> Consider this example instead:
> 
> <p:matching-documents>
>   <p:input port="source">
>     <p:inline>
>       <odd/>
>     </p:inline>
>     <p:inline>
>       <even/>
>     </p:inline>
>     <p:inline>
>       <odd/>
>     </p:inline>
>     <p:inline>
>       <even/>
>     </p:inline>
>     <p:inline>
>       <odd/>
>     </p:inline>
>   </p:input>
>   <p:option name="test" value="$p:position mod 2 = 1"/>
> </p:matching-documents>
> 
> It returns all the "<odd/>" documents.
> 
> I think this example also illustrates why it would be a mistake to
> overload XPath's built-in position() function. A complex expression
> might use position() in a more natural context and it would be confusing
> (though I grant not explicitly illegal) to have position() used in two
> different ways in the same expression.

I really don't understand this example. You seem to be suggesting that 
there's some implicit iteration happening, that the component is passed:

   * a single document on the 'source' port (with the document element
     <odd>)
   * the option 'test' with the value '$p:position mod 2 = 1'
   * a set of variable bindings that includes the binding of $p:position
     to 1

then,

   * a single document on the 'source' port (with the document element
     <even>)
   * the option 'test' with the value '$p:position mod 2 = 1'
   * a set of variable bindings that includes the binding of $p:position
     to 2

and so on.

In this explanation, the pipeline processor is in charge of the 
iteration, supplies a different value of $p:position each time, and 
therefore you only get the <odd> elements.

I thought that we'd decided not to allow implicit iteration: (a) because
we run into huge problems if we have steps that have more than one port
that accepts sequences, and (b) because it means you can't have steps
that do useful things like count the number of documents passed on a port.

So assuming that there *isn't* implicit iteration, this is what I think 
happens: the p:matching-documents component gets passed:

   * a sequence of documents on the 'source' port (with the document
     elements <odd> and <even>)
   * the option 'test' with the value '$p:position mod 2 = 1'
   * a set of variable bindings, which presumably includes the binding
     of $p:position to some number, let's say 1.

The component iterates through the documents on the 'source' port, and 
for each one evaluates the XPath "$p:position mod 2 = 1". Since the 
value of $p:position is bound to 1, this is always true. Therefore the 
component returns all the documents on the 'source' port.

I think that p:matching-documents should say:

   The XPath expression supplied as the value of the 'test' option is
   evaluated for each document supplied on the 'source' port, with the
   following context:

   * the context node is the document itself
   * the context position is the position of the document in the sequence
     supplied to the 'source' port
   * the context size is the number of documents in the sequence supplied
     to the 'source' port
   * the variable bindings are those supplied to the step
   * the function library is the core XPath function library
   * the namespace declarations are those supplied to the step

and that using

   <p:option name="test" value="position() mod 2 = 1" />

will work just fine.
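
As a sketch of what that wording asks of an implementation, again in toy
Python rather than real XPath: the Context tuple and the odd_position
function are hypothetical stand-ins for the XPath evaluation context and
for the expression "position() mod 2 = 1", and documents are modelled as
strings. Variable bindings, the function library and namespaces are left
out to keep it short.

```python
from typing import Callable, List, NamedTuple


class Context(NamedTuple):
    node: str       # the document itself (modelled as its element name)
    position: int   # 1-based position in the 'source' sequence
    size: int       # number of documents in the 'source' sequence


def matching_documents(source: List[str],
                       test: Callable[[Context], bool]) -> List[str]:
    # The step itself iterates and sets up a fresh context per document.
    size = len(source)
    return [doc
            for position, doc in enumerate(source, start=1)
            if test(Context(doc, position, size))]


# Stand-in for the XPath "position() mod 2 = 1".
def odd_position(ctx: Context) -> bool:
    return ctx.position % 2 == 1


selected = matching_documents(["odd", "even", "odd", "even", "odd"],
                              odd_position)
print(selected)
```

The step is invoked once with the whole sequence, so no implicit
iteration by the processor is needed to get just the <odd/> documents.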

The only issue that I can see is:
> I still think it would be wrong. Not only for the reason I gave just
> above but also because it might encourage people to believe that they
> could use last() and the streaming folks explicitly don't want to have
> to support that.

To get around this, the p:matching-documents step *could* say:

   * the context size is the same as the context position

This would mean that users could still use last() but not in any 
meaningful way.
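
A toy Python sketch of that rule, assuming documents arrive one at a
time from an iterator (the function names are invented for
illustration). Because the context size is defined to equal the context
position, the step never needs to know how many documents are still to
come:

```python
from typing import Callable, Iterable, Iterator


def matching_documents_streaming(
        source: Iterable[str],
        test: Callable[[str, int, int], bool]) -> Iterator[str]:
    # Documents are consumed one at a time; the total count is never
    # needed because the context size is set to the context position.
    for position, doc in enumerate(source, start=1):
        if test(doc, position, position):
            yield doc


# Stand-in for "position() = last()": true for every document under
# this rule, so it cannot be used to pick out the final document.
def is_last(doc: str, position: int, size: int) -> bool:
    return position == size


docs = iter(["odd", "even", "odd"])
kept = list(matching_documents_streaming(docs, is_last))
print(kept)
```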

But the streaming folks are going to have to do some analysis of XPath 
expressions anyway, since not all of them are going to be streamable, 
and one of the tests could be whether the expression contains the last() 
function or not. If it doesn't, as in the above example, streaming is
still a possibility; if it does, then the step can't stream.
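
A very naive version of that one test might look like the Python below.
It is only a sketch under assumptions: it works on the expression text
with regular expressions rather than a real XPath parser, the names
uses_last and can_stream are made up, and a real streamability analysis
would have to check much more than this. It errs on the side of caution,
so an odd name such as is-last() would also be treated as a use of
last().

```python
import re

# Matches XPath string literals, either "..." or '...'.
_STRING_LITERAL = re.compile(r'"[^"]*"|\'[^\']*\'')
# Matches a call to last() that is not the tail of a longer or
# prefixed name (for example mylast() or my:last()).
_LAST_CALL = re.compile(r'(?<![\w.:])last\s*\(\s*\)')


def uses_last(xpath: str) -> bool:
    # Blank out string literals first so that 'last()' inside a string
    # is not mistaken for a function call.
    without_strings = _STRING_LITERAL.sub('""', xpath)
    return _LAST_CALL.search(without_strings) is not None


def can_stream(xpath: str) -> bool:
    # Only this one condition is checked; passing it does not prove
    # that the expression is streamable in general.
    return not uses_last(xpath)


print(can_stream("position() mod 2 = 1"))
print(can_stream("position() = last()"))
```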

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Tuesday, 22 May 2007 21:00:37 UTC