Re: Making vanilla implementation of position() DTRT

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Fri, 08 Jun 2007 12:06:47 +0100
To: Jeni Tennison <jeni@jenitennison.com>
Cc: public-xml-processing-model-wg@w3.org
Message-ID: <f5b645y903c.fsf@hildegard.inf.ed.ac.uk>

Jeni Tennison writes:

> Norman Walsh wrote:
>> / ht@inf.ed.ac.uk (Henry S. Thompson) was heard to say:
>> [...Henry's cogent arguments elided...]
>> Looking at the example in Henry's mail, it suddenly became clear to me
>> why we can't use position() inside a for-each or viewport as we've
>> been considering. The observation is so simple, I can't believe no one
>> else made it before, so let me know if I'm once again overlooking the
>> obvious.
>> Consider his example:
>>  <p:for-each name="myloop">
>>    <p:group>
>>      <p:option name="index" select="position()"/>
>>       . . .
>> What is the context for that p:option? Depending on what we say about
>> the default context for p:option, it's equivalent to one of the
>> following two rewrites:
>>    <p:option name="index" select="position()">
>>      <p:empty/>
>>    </p:option>
>> or
>>    <p:option name="index" select="position()">
>>      <p:pipe step="myloop" port="current"/>
>>    </p:option>
>> In the former case, the expression is a dynamic error. In the latter
>> case, position() = 1 because current doesn't produce a sequence.
> This is how I suggest we define it, so that it does work in the way
> we'd expect it to:
> 1. We add "current document" and "current document sequence" to the
> environment. These are set as follows:
>  (a) for <p:viewport> and <p:for-each>, the current document sequence
> is the sequence of documents being processed by the viewport or
> for-each, and the current document is the current document being
> processed by the viewport or for-each (which is also bound to the
> 'current' port).

Does this mean that viewport and for-each can't stream?  That seems a
very high price to pay!  If they _can_ stream, how do they know what
the document sequence is?
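
To make the streaming concern concrete, here is a minimal Python sketch
(not XProc; streamed_documents, for_each_streaming and for_each_with_last
are hypothetical names invented for illustration).  It assumes documents
arrive one at a time: the iteration number is available without buffering,
but anything equivalent to last() forces the whole sequence to be read
first.

```python
from typing import Iterable, Iterator, List, Tuple


def streamed_documents(names: Iterable[str]) -> Iterator[str]:
    """Hypothetical source that yields documents one at a time."""
    for name in names:
        yield f"<doc name='{name}'/>"


def for_each_streaming(docs: Iterator[str]) -> Iterator[Tuple[int, str]]:
    """Pair each document with its iteration number without buffering."""
    # enumerate() only needs the documents seen so far, so this streams.
    for index, doc in enumerate(docs, start=1):
        yield index, doc


def for_each_with_last(docs: Iterator[str]) -> Iterator[Tuple[int, int, str]]:
    """Supplying a last()-style size needs the whole sequence up front."""
    buffered: List[str] = list(docs)  # streaming is lost at this point
    size = len(buffered)
    for index, doc in enumerate(buffered, start=1):
        yield index, size, doc


streaming_result = list(for_each_streaming(streamed_documents(["a", "b", "c"])))
buffered_result = list(for_each_with_last(streamed_documents(["a", "b", "c"])))
```

The first loop could hand each document on as soon as it arrives; the
second cannot emit anything until the source is exhausted.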

>  (b) for <p:pipeline>, the current document and current document
> sequence are undefined.
>  (c) for all other steps, the current document and current document
> sequence are the same as those of its container step. (For example, a
> group or choose doesn't change the current document or current
> document sequence.)
> NOTE that neither the current document sequence nor the current
> document is the same as the default readable port: within the scope of
> a for-each, the default readable port changes between steps, but the
> current document sequence does not.
> 2. XPath expressions in the context of a pipeline (i.e. those that
> aren't passed as options to steps) are evaluated differently depending
> on what source they use for their context (as set by
> <p:xpath-context>, the <p:pipe> within an option or parameter and so
> on):
>   if it's set to the current port of a for-each or viewport (either
>   explicitly or implicitly [when the default readable port is the
>   'current' port]), then:
>      * context node = the current document
>      * context position = the position of the current document in the
>        current document sequence

This is virtually identical to my proposal -- note that as I said,
this means in practice it will only work for the first step in the
subpipelines, and then only if it reads the DRP.
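
A rough way to see why only the first step benefits: the Python sketch
below is purely illustrative (context_position and run_subpipeline are
hypothetical names, and the fallback value of 1 for a single-document
context is an assumption based on Norm's observation quoted above).  It
tracks the default readable port as it moves from the loop's 'current'
port to the output of each step in turn.

```python
from typing import List


def context_position(drp: str, iteration: int) -> int:
    """Position seen by an XPath evaluated against the default readable port."""
    if drp == "current":
        # Special case: context is the loop's 'current' port.
        return iteration
    # Assumption: any other port supplies one document, so position is 1.
    return 1


def run_subpipeline(step_names: List[str], iteration: int) -> List[int]:
    """Record the position each step would see if it reads the DRP."""
    drp = "current"  # the first step reads the loop's 'current' port
    seen: List[int] = []
    for name in step_names:
        seen.append(context_position(drp, iteration))
        drp = f"{name}/result"  # the DRP moves on to this step's output
    return seen


positions = run_subpipeline(["first", "second", "third"], iteration=3)
```

In this model only the first entry of positions carries the iteration
number; every later step sees 1.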

Net-net -- your proposal only uses the
context:'current-document-sequence' on the 'get it right' option wrt
last(), _when_ last() is evaluated _by the engine_ on the first step
when that evaluation is tied to the DRP.  It's not worth it.

Once you give up on context:'current-document-sequence' your analysis
is identical to mine, leaving only the question of whether position()
should have a special meaning in a very limited and hard-to-delimit set
of cases.  Again, in my view it's just not worth it, and will require
people to use a p:group and an option to bind to position() right at
the beginning almost all the time.  I _much_ prefer to just say that
for-each and viewport bind p:index to the iteration number, end of story.

Norm, note that I don't think p:index('stepname') is needed -- you
yourself pointed out months ago that if you need to reference an index
that's not the one immediately over your head, you just do e.g.

  <p:option name="super-index" select="$p:index"/>
  . . .
  <p:option name="href" select="concat('save_',$super-index,'_',$p:index)"/>

And you'll get stored results in save_1_1, save_1_2, save_2_1, etc.
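
The same pattern can be mimicked in ordinary code.  The Python sketch
below is only an analogy, assuming that each loop rebinds a single
innermost index (playing the role of $p:index) and that the outer value
is copied into a separately named variable before the inner loop starts;
nested_hrefs and its parameters are hypothetical names.

```python
from typing import List


def nested_hrefs(outer_count: int, inner_count: int) -> List[str]:
    """Build names the way the two option bindings above would."""
    hrefs: List[str] = []
    for p_index in range(1, outer_count + 1):
        # Capture the outer index before the inner loop rebinds p_index.
        super_index = p_index
        for p_index in range(1, inner_count + 1):
            hrefs.append(f"save_{super_index}_{p_index}")
    return hrefs


hrefs = nested_hrefs(2, 2)
```

Only the innermost index needs a built-in name; anything further out is
reachable through whatever name it was saved under.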

-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Friday, 8 June 2007 11:06:57 GMT
