Re: can we have last() having a consistent value ?

/ "Innovimax SARL" <innovimax@gmail.com> was heard to say:
| On 5/24/07, Norman Walsh <ndw@nwalsh.com> wrote:
|> Here are the options as I see them:
|>
|> 1. We use context position and context size and we make them be
|> correct. The context size is the number of documents in the sequence
|> and the context position is the number of the document in that
|> sequence.
|>
|> 2. We use context position and context size and we accept the fact
|> that we can't reliably set the context size, so we always make it
|> equal to the context position. (Or we always make it MAXINT; I don't
|> much care.)
|>
|> 3. We don't use context position and context size and we always
|> set them both to "1". We use a separate extension function to return
|> the number of the document in the sequence.
|>
|> Only in case 1 do we get complete consistency. But that totally
|> prevents a streaming implementation and requires (possibly massive
|> amounts of) buffering. I don't think it'll be difficult to persuade
|> our users that this is an unattractive option. (What's more, if they
|> have a step that actually really needs to know how many documents
|> are in the sequence, they can compute it with p:count.)
|
| This is exactly true.
| In a p:for-each, you cannot have an equivalent of last() because you
| will never know, at the previous step, how many matches there will be.
|
| Just take <p:for-each select="//*[@xml:id]"> over a sequence of
| enormous documents as an example.

As Jeni just pointed out, we could insist that you count them if
someone asked for last().
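
A rough sketch of that idea, in Python rather than XProc. The names
count_on_demand and needs_last are hypothetical illustrations, not
anything in the draft. It streams the documents when nobody asks for
last(), reporting the context size as equal to the context position
(option 2), and buffers the whole sequence to count it only when
somebody does (option 1).

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def count_on_demand(docs: Iterable[T], needs_last: bool) -> Iterator[Tuple[int, int, T]]:
    """Yield (context position, context size, document) for each document.

    Hypothetical helper: needs_last stands in for "the step's expression
    mentions last()", which a real processor would have to detect itself.
    """
    if needs_last:
        # Option 1: buffer everything so the true size is known up front.
        buffered = list(docs)
        size = len(buffered)
        for position, doc in enumerate(buffered, start=1):
            yield position, size, doc
    else:
        # Option 2: pure streaming; the size is only "how many so far".
        for position, doc in enumerate(docs, start=1):
            yield position, position, doc
```

The cost is visible in the first branch: nothing is yielded until the
entire input has been read.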

|> In case 2, we get consistency in the sense that, by analogy with the
|> xsl:for-each case, the context position (and the position() function)
|> will count the documents, just as a user would expect. We get some
|> inconsistency because the last() function returns a value that's
|> possibly smaller than the actual size will ultimately turn out to be.
|>
|> In case 3, we get literal consistency in the sense that always
|> returning "1" is consistent. However, whether or not this is more or
|> less consistent with the user's view is an open question. A user who
|> imagines that processing a sequence of documents with a step is
|> analogous to performing an xsl:for-each operation over the sequence
|> may be quite surprised to learn that position() returns "1" even after
|> more than one document has been processed. And with respect to the
|> context size, our answer is "consistent" but no more-or-less correct
|> than in case 2 since last() will always return "1" even when there
|> were 100 documents in the sequence.
|>
|> Given that in no case except 1 is the last() function useful, and
|> given that the definition of p:position() has to exactly mirror what
|> the user might have expected position() to be in case 2, I think it
|> makes sense to use the context position.
|>
|> This also means that steps don't have to understand anything about
|> extension functions.
|
| Thanks for this fair summary.
|
| Being consistent, for me, means that
|
| either we have p:count (as a component)

That's orthogonal, surely.

| and p:position (as a variable
| or a function)

Or just use position()

| AND p:is-last-in-sequence (as a variable or a function)

We don't have p:is-last-in-sequence() so that's not an issue today.

| or we have last() and position() (and we don't do streaming so ....)
|
| Having only half of that is just a way to make things difficult for
| the user, and to pass off the limitations of streaming as consistency.

I don't think using position() makes things difficult for the user
(irrespective of what last() returns). I think having position()
return 1 and asking the user to remember to use p:position() makes
things difficult for the user.

| One other point no one has made is that I will never be able to use
| position() in a pattern:
| <p:viewport match="p[p:position() &lt; $max-to-process]">
| is not easily done with your proposal.

Match only takes a single document, so p:position() will only ever be 1.

If instead, you try

 <p:for-each select="//p[p:position() &lt; $max-to-process]">

then you get, uhm, all the "p" elements in each document that comes
before the $max-to-process'th document and none of the "p" elements in
any other document, I guess.

That seems like another argument in favor of using position() since
position() in that expression would have the natural XPath meaning
and the user wouldn't be encouraged to attempt such things.
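
To make the difference between the two readings concrete, here is a
small Python model. It is only a sketch: both function names are made
up for illustration, documents are ElementTree elements, and the
document root is assumed not to be a "p" itself. The first function
models the p:position() reading, where the predicate depends only on
where the document sits in the sequence. The second models the
ordinary XPath meaning of //p[position() < $max-to-process], where
position is counted among the "p" children of each parent.

```python
import xml.etree.ElementTree as ET


def select_by_document_position(docs, max_to_process):
    """Model of //p[p:position() < $max]: all-or-nothing per document."""
    selected = []
    for doc_position, doc in enumerate(docs, start=1):
        if doc_position < max_to_process:
            selected.extend(doc.iter("p"))
    return selected


def select_by_xpath_position(doc, max_to_process):
    """Model of //p[position() < $max] within one document.

    For every parent element, keep the "p" children whose position among
    their "p" siblings is less than max_to_process.
    """
    selected = []
    for parent in doc.iter():
        p_children = [child for child in parent if child.tag == "p"]
        selected.extend(p_children[: max(max_to_process - 1, 0)])
    return selected
```

With $max-to-process set to 2, the first function returns every "p" in
the first document and nothing from the rest, while the second returns
the first "p" under each parent in whichever document it is given.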

I don't think there's any useful way to use p:position() in a match or
select on viewport or for-each. You can use p:head and p:tail to do
anything you might want to do in a much more natural fashion.
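
As an illustration of why trimming the sequence first is the more
natural route, here is a Python sketch. It assumes that p:head keeps
the first N documents of a sequence; the head function below is a
stand-in for that assumption, not the step's definition. The point is
that the trimming happens on the sequence itself and never needs to
read past the documents it keeps.

```python
from itertools import islice


def head(docs, count):
    """Lazily keep only the first `count` documents of a sequence.

    Stand-in for the assumed behaviour of p:head; nothing beyond the
    first `count` documents is pulled from the input.
    """
    return islice(docs, count)


def numbered_docs(pulled):
    """Hypothetical source of five documents that records each one it produces."""
    for n in range(1, 6):
        pulled.append(n)
        yield f"doc{n}"
```

Feeding numbered_docs through head with a count of 2 leaves documents
3 to 5 unread, which is exactly what a streaming implementation wants.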

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Why shouldn't things be largely absurd,
http://nwalsh.com/            | futile, and transitory? They are so,
                              | and we are so, and they and we go very
                              | well together.-- Santayana

Received on Thursday, 24 May 2007 18:50:59 UTC