- From: Innovimax SARL <innovimax@gmail.com>
- Date: Sat, 26 May 2007 10:35:06 +0200
- To: "Norman Walsh" <ndw@nwalsh.com>
- Cc: public-xml-processing-model-wg@w3.org
On 5/24/07, Norman Walsh <ndw@nwalsh.com> wrote:
> / "Innovimax SARL" <innovimax@gmail.com> was heard to say:
> | On 5/24/07, Norman Walsh <ndw@nwalsh.com> wrote:
> |> Here are the options as I see them:
> |>
> |> 1. We use context position and context size and we make them be
> |> correct. The context size is the number of documents in the sequence
> |> and the context position is the number of the document in that
> |> sequence.
> |>
> |> 2. We use context position and context size and we accept the fact
> |> that we can't reliably set the context size, so we always make it
> |> equal to the context position. (Or we always make it MAXINT; I don't
> |> much care.)
> |>
> |> 3. We don't use context position and context size and we always
> |> set them both to "1". We use a separate extension function to return
> |> the number of the document in the sequence.
> |>
> |> Only in case 1 do we get complete consistency. But that totally
> |> prevents a streaming implementation and requires (possibly massive
> |> amounts of) buffering. I don't think it'll be difficult to persuade
> |> our users that this is an unattractive option. (What's more, if they
> |> have a step that actually really needs to know how many documents
> |> are in the sequence, they can compute it with p:count.)
> |
> | This is just exactly true
> | In a p:for-each, you cannot have an equivalent of last() because you
> | will never at the previous step know how many matches there will be.
> |
> | Just take <p:for-each select="//*[@xml:id]"> in a sequence of
> | enormous documents as an example
>
> As Jeni just pointed out, we could insist that you count them if
> someone asked for last().
>
> |> In case 2, we get consistency in the sense that, by analogy with the
> |> xsl:for-each case, the context position (and the position() function)
> |> will count the documents, just as a user would expect. We get some
> |> inconsistency because the last() function returns a value that's
> |> possibly smaller than the actual case will ultimately turn out to be.
> |>
> |> In case 3, we get literal consistency in the sense that always
> |> returning "1" is consistent. However, whether or not this is more or
> |> less consistent with the user's view is an open question. A user who
> |> imagines that processing a sequence of documents with a step is
> |> analogous to performing an xsl:for-each operation over the sequence
> |> may be quite surprised to learn that position() returns "1" even after
> |> more than one document has been processed. And with respect to the
> |> context size, our answer is "consistent" but no more-or-less correct
> |> than in case 2 since last() will always return "1" even when there
> |> were 100 documents in the sequence.
> |>
> |> Given that in no case except 1 is the last() function useful, and
> |> given that the definition of p:position() has to exactly mirror what
> |> the user might have expected position() to be in case 2, I think it
> |> makes sense to use the context position.
> |>
> |> This also means that steps don't have to understand anything about
> |> extension functions.
> |
> | Thanks for this fair summary
> |
> | Being consistent for me is that
> |
> | whether we have p:count (as a component)
>
> That's orthogonal, surely.
>
> | and p:position (as a variable
> | or a function)
>
> Or just use position()
>
> | AND p:is-last-in-sequence (as a variable or a function)
>
> We don't have p:is-last-in-sequence() so that's not an issue today.
>
> | or we have last() and position() (and we don't do streaming so ....)
> |
> | having half of that is just a way to make things difficult for the
> | user to admit the limitation of streaming as consistent
>
> I don't think using position() makes things difficult for the user
> (irrespective of what last() returns). I think having position()
> return 1 and asking the user to remember to use p:position() makes
> things difficult for the user.
>
> | one other point no one has made is that I will never be able to use
> | position() in Pattern
> | <p:viewport match="p[p:position() < $max-to-process]">
> | is not easily done with your proposal
>
> Match only takes a single document, so p:position() will only ever be 1.

Oh, damn! I see my mistake.

So I think that p:blabla_index was doing what I said!

>
> If instead, you try
>
> <p:for-each select="//p[p:position() < $max-to-process]">
>
> then you get, uhm, all the "p" elements in each document that's less
> than the $max-to-process'th document and none of the "p" elements in
> any other document, I guess.
>
> That seems like another argument in favor of using position() since
> position() in that expression would have the natural XPath meaning
> and the user wouldn't be encouraged to attempt such things.
>
> I don't think there's any useful way to use p:position() in a match or
> select on viewport or for-each. You can use p:head and p:tail to do
> anything you might want to do in a much more natural fashion.
>

I think I'm almost done!

I now see the difference: what I wanted was a mix of position() and
p:blabla_index.

But I still think that some of us are still getting this wrong:

  <!-- the default input is a document -->
  <p:for-each select="//p">
    <p:option name="my-pos" select="position()"/>
  </p:for-each>

What are the successive values of $my-pos in that case?

Mohamed

-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 8 72 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
Received on Saturday, 26 May 2007 08:35:08 UTC
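
The three options quoted in the message above can be modelled outside XProc. The following is only a rough Python sketch, not XProc and not anything proposed in the thread: the function names are hypothetical, a "document" is just any value, and the MAXINT variant of option 2 is not shown. It shows why only option 1 reports a correct context size, and why it is also the only one that has to buffer the whole sequence before it can yield the first document.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def option1_buffered(docs: Iterable[T]) -> Iterator[Tuple[int, int, T]]:
    """Option 1: correct position and size (hypothetical model).

    The whole sequence must be read before the first result is yielded,
    which is what rules out streaming.
    """
    buffered = list(docs)  # full buffering happens here
    size = len(buffered)
    for position, doc in enumerate(buffered, start=1):
        yield position, size, doc


def option2_streaming(docs: Iterable[T]) -> Iterator[Tuple[int, int, T]]:
    """Option 2: position counts documents; size is reported as the position.

    Nothing is buffered, so the reported size is only a lower bound.
    """
    for position, doc in enumerate(docs, start=1):
        yield position, position, doc


def option3_constant(docs: Iterable[T]) -> Iterator[Tuple[int, int, int, T]]:
    """Option 3: position and size are always 1.

    The third tuple item stands in for the separate extension function
    that returns the number of the document in the sequence.
    """
    for index, doc in enumerate(docs, start=1):
        yield 1, 1, index, doc


if __name__ == "__main__":
    sample = ["doc-a", "doc-b", "doc-c"]
    for name, fn in [("1", option1_buffered),
                     ("2", option2_streaming),
                     ("3", option3_constant)]:
        print("option", name, list(fn(iter(sample))))
```

Each tuple starts with (position, size); under option 3 the extra third item is the document index that a separate function would have to supply.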