Re: New static error: options in the XProc namespace

/ Jeni Tennison <jeni@jenitennison.com> was heard to say:
| Norman Walsh wrote:
|> Position is always 1 in for-each and viewport because they're odd cases.
|> Consider this example instead:
|>
|> <p:matching-documents>
|>   <p:input port="source">
|>     <p:inline>
|>       <odd/>
|>     </p:inline>
|>     <p:inline>
|>       <even/>
|>     </p:inline>
|>     <p:inline>
|>       <odd/>
|>     </p:inline>
|>     <p:inline>
|>       <even/>
|>     </p:inline>
|>     <p:inline>
|>       <odd/>
|>     </p:inline>
|>   </p:input>
|>   <p:option name="test" value="$p:position mod 2 = 1"/>
|> </p:matching-documents>
|>
|> It returns all the "<odd/>" documents.
|>
|> I think this example also illustrates why it would be a mistake to
|> overload XPath's built-in position() function. A complex expression
|> might use position() in a more natural context and it would be confusing
|> (though I grant not explicitly illegal) to have position() used in two
|> different ways in the same expression.
|
| I really don't understand this example. You seem to be suggesting that there's
| some implicit iteration happening, that the component is passed:

All of the steps that accept a sequence do iterate implicitly. We simply
don't pass sequences to very many steps.

|   * a single document on the 'source' port (with the document element <odd>)
|   * the option 'test' with the value '$p:position mod 2 = 1'
|   * a set of variable bindings that includes the binding of $p:position to 1
|
| then,
|
|   * a single document on the 'source' port (with the document element <even>)
|   * the option 'test' with the value '$p:position mod 2 = 1'
|   * a set of variable bindings that includes the binding of $p:position to 2
|
| and so on.

Yes, whether we call it $p:position or p:position(), I expect
p:matching-documents and other steps that evaluate an expression in the
context of a sequence to keep track of the number of documents that have
passed by and return the position correctly.
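
A minimal sketch of that behaviour, in Python rather than XProc. It
assumes documents are plain strings and the 'test' option is a callable
taking the document and its position; the names matching_documents and
test are illustrative only and come from no implementation.

```python
from typing import Callable, Iterable, Iterator


def matching_documents(
    source: Iterable[str],
    test: Callable[[str, int], bool],
) -> Iterator[str]:
    """Yield each document for which test(document, position) is true.

    The step itself counts documents as they pass by, so the position
    changes from one evaluation of the test to the next.
    """
    for position, doc in enumerate(source, start=1):
        if test(doc, position):
            yield doc


# The five inline documents from the example pipeline.
docs = ["<odd/>", "<even/>", "<odd/>", "<even/>", "<odd/>"]

# Stand-in for the expression "$p:position mod 2 = 1".
result = list(matching_documents(docs, lambda doc, position: position % 2 == 1))
print(result)
```

Under these assumptions only the first, third and fifth documents are
selected, which is the "all the <odd/> documents" result described above.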

| In this explanation, the pipeline processor is in charge of the iteration,
| supplies a different value of $p:position each time, and therefore you only get
| the <odd> elements.

The pipeline processor doesn't do any iteration; it just passes the
sequence to the step.

| I thought that we'd decided not to allow implicit iteration, (a) because we run
| into huge problems if we have steps that have more than one port that accepts
| sequences and (b) because it means you can't have steps that do useful things
| like count the number of documents passed on a port.

Right. The pipeline processor is not doing the iteration; the step is,
and yes, I had thought that meant the step had to keep track of some
evolution in its XPath context.

| So assuming that there *isn't* implicit iteration, this is what I think happens:
| the p:matching-documents component gets passed:
|
|   * a sequence of documents on the 'source' port (with the document elements
| <odd> and <even>)
|   * the option 'test' with the value '$p:position mod 2 = 1'
|   * a set of variable bindings, which presumably includes the binding of
| $p:position to some number, let's say 1.
|
| The component iterates through the documents on the 'source' port, and for each
| one evaluates the XPath "$p:position mod 2 = 1". Since the value of $p:position
| is bound to 1, this is always true. Therefore the component returns all the
| documents on the 'source' port.

I can see the appeal of that interpretation, but in that case I don't
think p:position() is useful enough to bother implementing.
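
For contrast, a rough sketch of that interpretation, again in Python with
hypothetical names. Here the variable bindings are assumed to be computed
once, before the step runs, and never updated while it iterates.

```python
from typing import Callable, Dict, List


def matching_documents_fixed(
    source: List[str],
    test: Callable[[str, Dict[str, int]], bool],
    bindings: Dict[str, int],
) -> List[str]:
    """Filter documents using bindings that never change during iteration."""
    return [doc for doc in source if test(doc, bindings)]


docs = ["<odd/>", "<even/>", "<odd/>", "<even/>", "<odd/>"]

# $p:position is bound to 1 once, by whatever invoked the step.
bindings = {"p:position": 1}

# Stand-in for "$p:position mod 2 = 1"; it reads only the fixed binding.
selected = matching_documents_fixed(
    docs, lambda doc, b: b["p:position"] % 2 == 1, bindings
)
print(selected)
```

Because the binding stays at 1, the test is true for every document and
the whole sequence comes back unchanged.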

| I think that p:matching-documents should say:
|
|   The XPath expression supplied as the value of the 'test' option is
|   evaluated for each document supplied on the 'source' port, with the
|   following context:
|
|   * the context node is the document itself
|   * the context position is the position of the document in the sequence
|     supplied to the 'source' port
|   * the context size is the number of documents in the sequence supplied
|     to the 'source' port
|   * the variable bindings are those supplied to the step
|   * the function library is the core XPath function library
|   * the namespace declarations are those supplied to the step
|
| and that using
|
|   <p:option name="test" value="position() mod 2 = 1" />
|
| will work just fine.

Two things:

First, it seems to me that the step has to update the context position
as it iterates through the sequence and I don't see how that's any
different from updating the value that p:position() returns.

Second, while all of the off-the-shelf XPath processors that I've looked
at offer some mechanism for calling extension functions (or setting
variables, though we've opted not to go that way, I think), I'm much
less confident that they provide a mechanism for changing the internal
context position.

While I concede that a document's position in the sequence can be seen
as the context position in one sense, I remain of the opinion that that's
not very XPath-1-like and would prefer to use an extension function.

| The only issue that I can see is:
|> I still think it would be wrong. Not only for the reason I gave just
|> above but also because it might encourage people to believe that they
|> could use last() and the streaming folks explicitly don't want to have
|> to support that.
|
| To get around this, the p:matching-documents step *could* say:
|
|   * the context size is the same as the context position
|
| This would mean that users could still use last() but not in any meaningful way.
|
| But the streaming folks are going to have to do some analysis of XPath
| expressions anyway, since not all of them are going to be streamable, and one of
| the tests could be whether the expression contains the last() function or not.
| If it doesn't, as in the above example, streaming is still a possibility; if it
| does, then it can't stream.

I think there was strong resistance to supporting last() or p:last()
when we discussed this before.
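
To illustrate why last() is the sticking point for streaming, here is a
small Python sketch. It assumes documents arrive from a lazy generator;
every name in it (document_stream, filter_by_position,
filter_by_position_and_last) is made up for the illustration.

```python
from typing import Callable, Iterable, Iterator, List


def document_stream(log: List[str]) -> Iterator[str]:
    """Produce five documents lazily, recording each one as it is produced."""
    for name in ["doc1", "doc2", "doc3", "doc4", "doc5"]:
        log.append(name)
        yield name


def filter_by_position(
    source: Iterable[str], test: Callable[[int], bool]
) -> Iterator[str]:
    """Position-only test: each document can be decided as soon as it arrives."""
    for position, doc in enumerate(source, start=1):
        if test(position):
            yield doc


def filter_by_position_and_last(
    source: Iterable[str], test: Callable[[int, int], bool]
) -> List[str]:
    """A test that needs the sequence size forces the whole input to be buffered."""
    buffered = list(source)  # nothing can be emitted until the input ends
    last = len(buffered)
    return [d for pos, d in enumerate(buffered, start=1) if test(pos, last)]


# Streaming case: only one document has been read when the first result appears.
consumed: List[str] = []
stream = filter_by_position(document_stream(consumed), lambda pos: pos % 2 == 1)
first = next(stream)
consumed_after_first = list(consumed)

# last() case: all five documents are read before any result is available.
consumed_all: List[str] = []
last_only = filter_by_position_and_last(
    document_stream(consumed_all), lambda pos, last: pos == last
)
```

In the first case the step has read a single document when it emits its
first result; in the second it has to read all five before it can answer.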

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | He that will not apply new remedies
http://nwalsh.com/            | must expect new evils; for time is the
                              | great innovator.--Sir Francis Bacon

Received on Wednesday, 23 May 2007 11:37:29 UTC