Re: p:for-each from Jeni Tennison on 2006-07-25 (public-xml-processing-model-wg@w3.org from July 2006)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Tue, 25 Jul 2006 22:09:32 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <44C6888C.5090107@jenitennison.com>

Norm,

Norm Walsh wrote:
> / Jeni Tennison <jeni@jenitennison.com> was heard to say:
> | 2. Rather than using #loop/#matched (or something similar) to
> | reference the individual input documents, as in the above, I think we
> | should let the user provide names for them. A design like:
> |
> |   <p:for-each name="loop">
> |     <p:declare-input port="chapter"
> |                      ref-each="#pipe/document"
> |                      select="//chapter" />
> |     <p:declare-output port="validated" />
> |     <p:declare-output port="errors" />
> |
> |     <p:step kind="validate" name="validate">
> |       <p:input port="document" ref="#loop/chapter" />
> |       <p:output port="validated" ref="#loop/validated" />
> |       <p:output port="errors" ref="#loop/errors" />
> |     </p:step>
> |   </p:for-each>
> |
> | would enable this. (The 'ref-each' attribute indicates that the input
> | is one that should be iterated over, rather than a normal input, to
> | enable other kinds of input to be declared too.)
> 
> Now you seem to be suggesting that users declare their inputs, but when I
> suggested this previously, you thought it was unnecessary. I'm confused.

You do have to declare the input that you're iterating over; otherwise
the processor won't know what to iterate over. What I've been saying is
that you shouldn't *have* to declare all the other inputs that you use
within the loop.

> Do you imagine that either or both of the following constraints apply?
> 
> * In a p:for-each, exactly one p:declare-input must have an @select
>   attribute.

Assuming we don't support joins (discussed below): in a p:for-each,
there must be exactly one p:declare-input that has a @ref-each attribute
(it may also have a @select attribute to indicate iteration over
elements). This p:declare-input tells you what you're iterating over.

(We could design it so that there were three options:

   - a @ref-each attribute (to iterate over documents)
   - a @ref attribute and a @select attribute (to iterate over elements)
   - a @ref-each attribute and a @select attribute (to iterate over
     elements from multiple documents)

I don't have a particular opinion either way at the moment.)
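
As a rough illustration of how those three options differ (not part of
the proposal), here is a minimal Python sketch. The function names are
hypothetical, lists of parsed documents stand in for ports, and
ElementTree's limited path syntax stands in for @select:

```python
import xml.etree.ElementTree as ET

def iterate_documents(docs):
    """@ref-each only: one iteration per document on the port."""
    return list(docs)

def iterate_elements(doc, select):
    """@ref plus @select: one iteration per selected element of one document."""
    return doc.findall(select)

def iterate_elements_across(docs, select):
    """@ref-each plus @select: one iteration per selected element, over all documents."""
    return [el for doc in docs for el in doc.findall(select)]

# Two hypothetical input documents.
book1 = ET.fromstring("<book><chapter id='1'/><chapter id='2'/></book>")
book2 = ET.fromstring("<book><chapter id='3'/></book>")

docs_count = len(iterate_documents([book1, book2]))
one_doc_ids = [c.get("id") for c in iterate_elements(book1, ".//chapter")]
all_ids = [c.get("id") for c in iterate_elements_across([book1, book2], ".//chapter")]
```

The only difference between the last two cases is whether the selection
is applied to one document or flattened across several.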

> * In the steps of a p:for-each, any p:input reference to a "global"
>   input (i.e. one not renamed with @ref-each) is consumed on the first
>   iteration and is "empty" for subsequent iterations.

No, I don't imagine that. I imagine that, behind the scenes, the 
processor makes enough copies of any p:input reference to a 'global' 
input to have one per iteration.

I imagine the same thing if the 'global' input is renamed using a 
p:declare-input with a @ref attribute (and no @select attribute).
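
To make the "one copy per iteration" idea concrete, here is a minimal
Python sketch. It is only an assumption about how a processor might
behave; for_each and consuming_step are hypothetical names, and a plain
list stands in for the 'global' input:

```python
import copy

def for_each(items, global_input, step):
    """Run `step` once per item, handing it a fresh copy of the global input."""
    results = []
    for item in items:
        # Copy per iteration so that a step which consumes its input
        # cannot leave it "empty" for later iterations.
        results.append(step(item, copy.deepcopy(global_input)))
    return results

def consuming_step(chapter, stylesheet):
    # Hypothetical step that drains its stylesheet input as it reads it.
    rules = []
    while stylesheet:
        rules.append(stylesheet.pop(0))
    return (chapter, rules)

stylesheet = ["rule-1", "rule-2"]
results = for_each(["ch1", "ch2", "ch3"], stylesheet, consuming_step)
```

Every iteration sees the full stylesheet, and the original input is left
untouched after the loop.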

> | 3. If we adopted the above design, we *could* support joins. For
> | example, the following would transform each of the chapters in the
> | pipe's document input with each of the stylesheets in the pipe's
> | stylesheets input:
> |
> | <p:pipeline name="pipe">
> |   <p:declare-input port="document" />
> |   <p:declare-input port="stylesheets" />
> |   <p:declare-output port="results" />
> |
> |   <p:for-each name="loop">
> |     <p:declare-input port="chapter"
> |                      ref-each="#pipe/document"
> |                      select="//chapter" />
> |     <p:declare-input port="stylesheet"
> |                      ref-each="#pipe/stylesheets" />
> |     <p:declare-output port="results" ref="#pipe/results" />
> |
> |     <p:step kind="xslt" name="transform">
> |       <p:input port="document" ref="#loop/chapter" />
> |       <p:input port="stylesheet" ref="#loop/stylesheet" />
> |       <p:output port="result" ref="#loop/results" />
> |     </p:step>
> |   </p:for-each>
> |
> | </p:pipeline>
> 
> Yikes! You're suggesting that if there are three documents on #pipe/document
> (a, b, and c) and two stylesheets on #pipe/stylesheets (x and y), that
> the step gets evaluated (a, x), (a, y), (b, x), (b, y), (c, x), (c, y) times?
> I think that's *way* too confusing.

It's a familiar thing to XPath 2.0 users ;)

   for $i in (a,b,c), $j in (x,y) return ($i,$j)
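
For readers less used to XPath 2.0, the same nested iteration can be
sketched in Python with itertools.product. The strings are placeholders
for the documents and stylesheets in Norm's example, and the pairs are
kept as tuples for readability, whereas the XPath expression returns one
flat sequence:

```python
from itertools import product

documents = ["a", "b", "c"]
stylesheets = ["x", "y"]

# Nested iteration: the first sequence varies slowest, as in the XPath `for`.
pairs = list(product(documents, stylesheets))
```

That gives six evaluations of the step, in the order Norm listed them.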

But I stressed the *could*. I think for v1.0 we shouldn't support joins.
If we use p:declare-input to indicate what's iterated over (rather than
attributes on p:for-each), then we can always add joins later if there's
demand.

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Wednesday, 26 July 2006 08:04:51 UTC