Re: Naming ports vs. naming documents from Norman Walsh on 2006-05-02 (public-xml-processing-model-wg@w3.org from May 2006)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Tue, 02 May 2006 10:22:32 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87r73cwlp3.fsf@nwalsh.com>
/ Alessandro Vernet <avernet@orbeon.com> was heard to say:
| On 4/28/06, Norman Walsh <Norman.Walsh@sun.com> wrote:
|> I have to say, on the whole, I prefer the labelled ports approach for
|> several reasons:
|> [...]
|
| It looks to me like we are getting closer to a consensus on the issue
| of how steps are connected. Is it fair to say that this group thinks
| that the primary way to connect the output A of a step with the input
| B of another step, is to assign a label to A and make a reference to
| that label in B?

That's what I think, yes.

|> I think we do want a mechanism for indicating that a component
|> produces an output with a URI so that some other component can access
|> it, but I don't think I want to expose all of the plumbing that way.
|
| In a stylesheet, we would like to be able to write: <xsl:import
| href="someURI"/>, where someURI is a reference to output of a step in
| the pipeline. Instead of assigning the URI to the output, as in
| <p:output name="..." produces="someURI"/>, I think it is better to
| make the connection explicit between the step that produces the
| document and the one that consumes the document (see [1] for a
| discussion on this topic), as in:
|
|    <p:step name="...">
|        <p:output label="step-1-output"/>
|    </p:step>
|
|    <p:step name="xslt">
|        <p:define-uri uri="someURI" ref="step-1-output"/>
|    </p:step>
|
| The URI defined here would be valid just during the execution of the
| step where it is defined. Then in the stylesheet one can write
| <xsl:import href="someURI"/> which will go read the output labeled
| "step-1-output".

I appreciate that this approach makes the connection more explicit,
but by making the URI/resource mapping explicit at the step-level, I
think it introduces the possibility of some strange behavior. What's
more, I don't believe that it aids implementation in any way.

Assuming that we all want it to be possible to build a pipeline engine
with stock components (Saxon 8, Xalan, xsv, etc.), we have to assume
that the underlying component, some XSLT engine in this case, is going
to attempt to open "someURI" via the mechanisms that are normal for
it.

That means that the pipeline engine must be able to provide the data
through those mechanisms, or it must make sure that the component
which produces that URI has run to completion and its outputs have
been serialized before it runs the XSLT component.

I think that means that the only thing that the pipeline document must
expose is the URI dependency between the components.
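
As a rough illustration of why the URI dependency alone is enough, here
is a minimal Python sketch of how an engine might order steps from it.
The dict layout, the step names, and the run_order function are all
hypothetical and are not part of any proposed pipeline syntax.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_order(steps):
    """Order step names so the producer of a URI runs before its consumers.

    `steps` maps a step name to {"produces": [...], "consumes": [...]}.
    This layout is an assumption made for the sketch only.
    """
    producer = {}
    for name, io in steps.items():
        for uri in io.get("produces", []):
            producer[uri] = name

    sorter = TopologicalSorter()
    for name, io in steps.items():
        # A step depends on whichever step produces each URI it consumes.
        # URIs with no known producer are treated as external resources.
        deps = [producer[uri] for uri in io.get("consumes", []) if uri in producer]
        sorter.add(name, *deps)
    return list(sorter.static_order())


steps = {
    "xslt": {"consumes": ["someURI"]},
    "make-doc": {"produces": ["someURI"]},
}
print(run_order(steps))
```

With the producer ordered first, the engine can finish serializing
"someURI" before the stock XSLT component tries to open it.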

Allowing the URI/resource mapping to appear on a step-by-step basis
introduces the possibility of odd aliasing. Consider:

   <p:step name="...">
       <p:output label="step-1-output"/>
   </p:step>

   <p:step name="...">
       <p:output label="step-2-output"/>
   </p:step>

   <p:step name="xslt">
       <p:define-uri uri="someURI" ref="step-1-output"/>
   </p:step>

   <p:step name="xslt">
       <p:define-uri uri="someURI" ref="step-2-output"/>
   </p:step>

Even if we imagine that the pipeline engine could detect this and
either report it as an error or do the right thing, the problem can
still arise. Consider:

   <p:step name="...">
       <p:output label="step-1-output"/>
   </p:step>

   <p:step name="xslt">
       <p:input name="stylesheet" href="/path/to/style.xsl"/>
       <p:define-uri uri="someURI" ref="step-1-output"/>
   </p:step>

   <p:step name="xslt">
       <p:input name="stylesheet" href="/path/to/style.xsl"/>
   </p:step>

Whatever 'someURI' is, both stylesheet invocations are going to
attempt to retrieve it. In the first case, 'step-1-output' will be
provided. In the second case, something else will or might be
provided.
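
The difference between the two invocations can be sketched in Python.
This is only a toy model: the resolve function, the dict-based
mappings, and the placeholder fallback string are assumptions made for
illustration, not how any real XSLT engine resolves URIs.

```python
def resolve(uri, step_mappings):
    """Resolve a URI as a step would under per-step p:define-uri mappings."""
    if uri in step_mappings:
        return step_mappings[uri]
    # No mapping on this step: the component falls back to its usual
    # retrieval mechanism, modelled here as a placeholder string.
    return "<whatever the normal resolver finds>"


step_1_output = "<doc from step 1/>"

first_xslt = {"someURI": step_1_output}  # step with p:define-uri
second_xslt = {}                         # same stylesheet, no p:define-uri

print(resolve("someURI", first_xslt))
print(resolve("someURI", second_xslt))
```

The same stylesheet and the same URI yield two different resources,
depending only on which step happens to run it.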

I'd much rather consider the universe of resources addressable by URI
as a single global pool. That simplifies the pipeline document:

   <p:step name="...">
       ...
       <p:produces href="someURI"/>
   </p:step>

   <p:step name="xslt">
       ...
       <p:consumes href="someURI"/>
   </p:step>

Two steps with a p:produces for the same URI would be an error. It would
still be possible for the second class of error to arise:

   <p:step name="...">
       ...
       <p:produces href="someURI"/>
   </p:step>

   <p:step name="xslt">
       ...
       <p:consumes href="someURI"/>
   </p:step>

   <p:step name="xslt">
       ...
   </p:step>

but fixing it would be a simple matter of adding the p:consumes element.
(And it wouldn't require a p:tee component to split the output so that
there were two labelled streams for p:define-uri.)
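
The first class of error is easy to check mechanically. Here is a
minimal Python sketch, assuming the steps sit inside a p:pipeline
wrapper and that "p" is bound to a placeholder namespace; both are
assumptions made so the fragment parses, as is the function name.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; an assumption for this sketch only.
NS = {"p": "http://example.org/pipeline"}


def duplicate_producers(pipeline_xml):
    """Return {uri: [step names]} for every URI produced by more than one step."""
    root = ET.fromstring(pipeline_xml)
    producers = {}
    for step in root.findall("p:step", NS):
        for prod in step.findall("p:produces", NS):
            producers.setdefault(prod.get("href"), []).append(step.get("name"))
    return {uri: names for uri, names in producers.items() if len(names) > 1}


doc = """
<p:pipeline xmlns:p="http://example.org/pipeline">
  <p:step name="a"><p:produces href="someURI"/></p:step>
  <p:step name="b"><p:produces href="someURI"/></p:step>
  <p:step name="xslt"><p:consumes href="someURI"/></p:step>
</p:pipeline>
"""
print(duplicate_producers(doc))
```

The second class of error, a step that reads the URI without declaring
p:consumes, cannot be caught this way, since nothing in the pipeline
document records the dependency.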

Note also that the case where the step that generates the document
doesn't assign a URI to it can still be handled with this mechanism
using stock components:

   <p:step name="...">
       ...
       <p:output name="result" label="foo"/>
   </p:step>

   <p:step name="rename">
       <p:parameter name="href" value="someURI"/>
       <p:input name="input" ref="foo"/>
       ...
   </p:step>

   <p:step name="xslt">
       ...
       <p:consumes href="someURI"/>
   </p:step>

                                        Be seeing you,
                                          norm

-- 
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
Received on Tuesday, 2 May 2006 14:24:01 UTC