Re: p:for-each from Norman Walsh on 2006-07-24 (public-xml-processing-model-wg@w3.org from July 2006)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Mon, 24 Jul 2006 16:06:59 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87irlmydgc.fsf@nwalsh.com>
/ Jeni Tennison <jeni@jenitennison.com> was heard to say:
| Some thoughts on for-each:
|
| 1. I think we need more than one with-output (or declare-output as we
| have it now). The contained steps can produce multiple outputs; it
| seems weird to lose those outputs. For example, say that the
| 'validate' step produces a copy of the original document with
| defaulted & fixed attributes/elements added, plus a set of
| errors/warnings from the document. To validate each chapter and
| capture all the validated documents and errors, you'd need:
|
|   <p:for-each select="//chapter" ref="#pipe/document" name="loop">
|     <p:declare-output port="validated" />
|     <p:declare-output port="errors" />
|
|     <p:step kind="validate" name="validate">
|       <p:input port="document" ref="#loop/#matched" />
|       <p:output port="validated" ref="#loop/validated" />
|       <p:output port="errors" ref="#loop/errors" />
|     </p:step>
|   </p:for-each>

Yes, I imagined that a for-each could declare as many outputs as it
wanted. When the for-each finishes, each output will have "in it" the
sequence of documents written to it by the steps that were evaluated.
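
A minimal sketch of that accumulation, in Python rather than pipeline syntax. The names `for_each` and `validate` and the string "documents" are illustrative assumptions, not part of any proposal in this thread; the point is only that each declared output ends up holding one entry per iteration, in iteration order.

```python
from collections import defaultdict


def for_each(documents, step):
    """Run `step` once per document; collect what it writes to each port."""
    outputs = defaultdict(list)
    for doc in documents:
        # `step` returns a mapping of port name -> document written to it.
        for port, written in step(doc).items():
            outputs[port].append(written)
    return dict(outputs)


def validate(chapter):
    # Hypothetical stand-in for the 'validate' step: one validated copy
    # and one error report per input chapter.
    return {"validated": chapter.upper(), "errors": f"no errors in {chapter}"}


result = for_each(["ch1", "ch2"], validate)
```

After the loop, `result["validated"]` and `result["errors"]` each hold a two-item sequence, one item per chapter.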

| 2. Rather than using #loop/#matched (or something similar) to
| reference the individual input documents, as in the above, I think we
| should let the user provide names for them. A design like:
|
|   <p:for-each name="loop">
|     <p:declare-input port="chapter"
|                      ref-each="#pipe/document"
|                      select="//chapter" />
|     <p:declare-output port="validated" />
|     <p:declare-output port="errors" />
|
|     <p:step kind="validate" name="validate">
|       <p:input port="document" ref="#loop/chapter" />
|       <p:output port="validated" ref="#loop/validated" />
|       <p:output port="errors" ref="#loop/errors" />
|     </p:step>
|   </p:for-each>
|
| would enable this. (The 'ref-each' attribute indicates that the input
| is one that should be iterated over, rather than a normal input, to
| enable other kinds of input to be declared too.)

Now you seem to be suggesting that users declare their inputs, but when I
suggested this previously, you thought it was unnecessary. I'm confused.

Do you imagine that either or both of the following constraints apply?

* In a p:for-each, exactly one p:declare-input must have an @select
  attribute.

* In the steps of a p:for-each, any p:input reference to a "global"
  input (i.e. one not renamed with @ref-each) is consumed on the first
  iteration and is "empty" for subsequent iterations.
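
To make the second constraint concrete, here is a rough Python sketch that assumes a non-iterated input behaves like a one-shot stream. `for_each`, `selected`, and `global_input` are hypothetical names chosen for illustration; nothing in the thread settles that this is the intended behaviour.

```python
def for_each(selected, global_input, step):
    """Iterate over `selected`; `global_input` is a one-shot stream."""
    stream = iter(global_input)  # can only be read once
    results = []
    for doc in selected:
        # Draining the stream empties it, so only the first iteration
        # sees any documents; later iterations get an empty list.
        results.append(step(doc, list(stream)))
    return results


seen = for_each(["a", "b", "c"], ["x", "y"], lambda doc, extra: (doc, extra))
```

Under this assumption the first iteration receives both "global" documents and the remaining iterations receive none.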

| 3. If we adopted the above design, we *could* support joins. For
| example, the following would transform each of the chapters in the
| pipe's document input with each of the stylesheets in the pipe's
| stylesheets input:
|
| <p:pipeline name="pipe">
|   <p:declare-input port="document" />
|   <p:declare-input port="stylesheets" />
|   <p:declare-output port="results" />
|
|   <p:for-each name="loop">
|     <p:declare-input port="chapter"
|                      ref-each="#pipe/document"
|                      select="//chapter" />
|     <p:declare-input port="stylesheet"
|                      ref-each="#pipe/stylesheets" />
|     <p:declare-output port="results" ref="#pipe/results" />
|
|     <p:step kind="xslt" name="transform">
|       <p:input port="document" ref="#loop/chapter" />
|       <p:input port="stylesheet" ref="#loop/stylesheet" />
|       <p:output port="result" ref="#loop/results" />
|     </p:step>
|   </p:for-each>
|
| </p:pipeline>

Yikes! You're suggesting that if there are three documents on #pipe/document
(a, b, and c) and two stylesheets on #pipe/stylesheets (x and y), the step
gets evaluated six times, once for each of (a, x), (a, y), (b, x), (b, y),
(c, x), and (c, y)? I think that's *way* too confusing.
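
The join reading amounts to a Cartesian product of the two iterated inputs. A short Python sketch, using the placeholder names a, b, c and x, y from the paragraph above (the variable names are assumptions made for illustration):

```python
from itertools import product

documents = ["a", "b", "c"]   # documents on #pipe/document
stylesheets = ["x", "y"]      # stylesheets on #pipe/stylesheets

# One evaluation of the step per (document, stylesheet) pair.
evaluations = list(product(documents, stylesheets))
```

With three documents and two stylesheets that is 3 * 2 = 6 evaluations, and the count grows multiplicatively with every further iterated input.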

| You'd invoke it with something like:
|
|   <p:step kind="pipe">
|     <p:input port="document" href="book.xml" />
|     <p:input port="stylesheets"
|              href="docbook2html.xsl docbook2fo.xsl" />
|     <p:output port="results" />
|   </p:step>
|
| The alternative is nested for-eaches, of course:
|
| <p:pipeline name="pipe">
|   <p:declare-input port="document" />
|   <p:declare-input port="stylesheets" />
|   <p:declare-output port="results" />
|
|   <p:for-each name="loop1">
|     <p:declare-input port="chapter"
|                      ref-each="#pipe/document"
|                      select="//chapter" />
|     <p:declare-output port="results" ref="#pipe/results" />
|
|     <p:for-each name="loop2">
|       <p:declare-input port="stylesheet"
|                        ref-each="#pipe/stylesheets" />
|       <p:declare-output port="results" ref="#loop1/results" />
|
|       <p:step kind="xslt" name="transform">
|         <p:input port="document" ref="#loop1/chapter" />
|         <p:input port="stylesheet" ref="#loop2/stylesheet" />
|         <p:output port="result" ref="#loop2/results" />
|       </p:step>
|     </p:for-each>
|   </p:for-each>
|
| </p:pipeline>
|
| which isn't too bad.

s/isn't too bad/is much better/ :-)
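
As a sketch of why the nested form reads better: in Python terms, the nested loops spell out the iteration order explicitly and yield the same pairs as an implicit join. The names `loop1_docs`, `loop2_sheets`, and `nested` are hypothetical and merely echo the loop names in the quoted pipeline.

```python
from itertools import product

loop1_docs = ["a", "b", "c"]   # iterated by the outer for-each (loop1)
loop2_sheets = ["x", "y"]      # iterated by the inner for-each (loop2)

nested = []
for chapter in loop1_docs:             # outer loop: one chapter at a time
    for stylesheet in loop2_sheets:    # inner loop: every stylesheet
        nested.append((chapter, stylesheet))

# The implicit join produces the same pairs in the same order.
joined = list(product(loop1_docs, loop2_sheets))
```

Both forms evaluate the step the same number of times; the nested one simply makes it visible which input varies fastest.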

                                        Be seeing you,
                                          norm

-- 
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
Received on Monday, 24 July 2006 20:07:06 UTC