Re: can we have last() having a consistent value ?

From: Innovimax SARL <innovimax@gmail.com>
Date: Thu, 24 May 2007 20:22:10 +0200
Message-ID: <546c6c1c0705241122v19584520v448d1ecd7c4f607@mail.gmail.com>
To: "Norman Walsh" <ndw@nwalsh.com>
Cc: public-xml-processing-model-wg@w3.org

On 5/24/07, Norman Walsh <ndw@nwalsh.com> wrote:
> / Innovimax SARL <innovimax@gmail.com> was heard to say:
> | I think that is the last thing I want to have a consistent story
> | around before joining the Dark Side
> |
> | I hear Norm, and he is right that WE MUST DEFINE A CONTEXT
> |
> | My point is, CAN WE MAKE IT AS CONSISTENT AS POSSIBLE
> |
> | XPath says :
> |
> | [[
> | Expression evaluation occurs with respect to a context. XSLT and
> | XPointer specify how the context is determined for XPath expressions
> | used in XSLT and XPointer respectively. The context consists of:
> |
> |    * a node (the context node)
> |    * a pair of non-zero positive integers (the context position and
> | the context size)
> |    * a set of variable bindings
> |    * a function library
> |    * the set of namespace declarations in scope for the expression
> |
> | The context position is always less than or equal to the context size.
> | ]]
> |
> | Some thoughts
>
> Here are the options as I see them:
>
> 1. We use context position and context size and we make them be
> correct. The context size is the number of documents in the sequence
> and the context position is the number of the document in that
> sequence.
>
> 2. We use context position and context size and we accept the fact
> that we can't reliably set the context size, so we always make it
> equal to the context position. (Or we always make it MAXINT; I don't
> much care.)
>
> 3. We don't use context position and context size and we always
> set them both to "1". We use a separate extension function to return
> the number of the document in the sequence.
>
> Only in case 1 do we get complete consistency. But that totally
> prevents a streaming implementation and requires (possibly massive
> amounts of) buffering. I don't think it'll be difficult to persuade
> our users that this is an unattractive option. (What's more, if they
> have a step that actually really needs to know how many documents
> are in the sequence, they can compute it with p:count.)

This is exactly right.
In a p:for-each, you cannot have an equivalent of last(), because the
previous step can never know in advance how many matches there will be.

Just take <p:for-each select="//*[@xml:id]"> over a sequence of
enormous documents as an example.
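
To make the trade-off between options 1 and 2 concrete, here is a
minimal sketch in Python. It is not XProc and not taken from any draft;
the function names context_buffered and context_streaming are
hypothetical, and each "document" is just an arbitrary item. It only
shows that a correct context size forces the whole sequence to be read
first, while a streaming step can report the position immediately.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def context_buffered(docs: Iterable[T]) -> Iterator[Tuple[int, int, T]]:
    """Option 1 (assumed shape): correct position and size, full buffering."""
    buffered = list(docs)  # the whole sequence is consumed before any output
    size = len(buffered)
    for position, doc in enumerate(buffered, start=1):
        yield position, size, doc


def context_streaming(docs: Iterable[T]) -> Iterator[Tuple[int, int, T]]:
    """Option 2 (assumed shape): size is unknown, so size == position."""
    for position, doc in enumerate(docs, start=1):
        yield position, position, doc
```

With the buffered variant, nothing is produced until the last document
has arrived; with the streaming variant, the reported size is only
correct for the final document.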


>
> In case 2, we get consistency in the sense that, by analogy with the
> xsl:for-each case, the context position (and the position() function)
> will count the documents, just as a user would expect. We get some
> inconsistency because the last() function returns a value that's
> possibly smaller than the actual case will ultimately turn out to be.
>
> In case 3, we get literal consistency in the sense that always
> returning "1" is consistent. However, whether or not this is more or
> less consistent with the user's view is an open question. A user who
> imagines that processing a sequence of documents with a step is
> analogous to performing an xsl:for-each operation over the sequence
> may be quite surprised to learn that position() returns "1" even after
> more than one document has been processed. And with respect to the
> context size, our answer is "consistent" but no more-or-less correct
> than in case 2 since last() will always return "1" even when there
> were 100 documents in the sequence.
>
> Given that in no case except 1 is the last() function useful, and
> given that the definition of p:position() has to exactly mirror what
> the user might have expected position() to be in case 2, I think it
> makes sense to use the context position.
>
> This also means that steps don't have to understand anything about
> extension functions.
>

Thanks for this fair summary.

For me, being consistent means that

either we have p:count (as a component), p:position (as a variable
or a function) AND p:is-last-in-sequence (as a variable or a function),

or we have last() and position() (and we don't do streaming, so ...).

Having only half of that just makes it harder for users to accept
the limitation of streaming as consistent.
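
As a rough illustration of why something like p:is-last-in-sequence
could remain compatible with streaming, here is a sketch in Python (not
XProc). The function name with_position_and_is_last is hypothetical,
and the one-document lookahead is my assumption about how an
implementation might do it, not something the draft specifies. Unlike
last(), an "is last" flag needs only the next item to be peeked at, not
the whole sequence.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def with_position_and_is_last(docs: Iterable[T]) -> Iterator[Tuple[int, bool, T]]:
    """Yield (position, is_last, doc), holding back at most one document."""
    iterator = iter(docs)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty sequence: nothing to yield
    position = 1
    for upcoming in iterator:
        # Another document exists, so the held-back one is not the last.
        yield position, False, current
        current = upcoming
        position += 1
    # The input is exhausted, so the held-back document is the last one.
    yield position, True, current
```

The cost is a delay of one document before each result, which is far
smaller than buffering everything in order to compute the total count.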

One other point no one has made is that I will never be able to use
position() in a pattern:
<p:viewport match="p[p:position() &lt; $max-to-process]">
is not easily done with your proposal.

Mohamed

-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 8 72 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 
Received on Thursday, 24 May 2007 18:22:13 GMT
