Re: Annotations for side effects and stability from Norman Walsh on 2006-04-27 (public-xml-processing-model-wg@w3.org from April 2006)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Thu, 27 Apr 2006 06:14:24 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87wtdb8gvz.fsf@nwalsh.com>
/ Alessandro Vernet <avernet@orbeon.com> was heard to say:
| 1) Is p:foo executed twice if this pipeline is run twice?
|
|     <p:pipeline>
|         <p:output ref="foo"/>
|         <p:step name="p:foo">
|             <p:input href="foo.xml"/>
|             <p:output label="foo"/>
|         </p:step>
|     </p:pipeline>

I think our specification has to describe what happens during any
single pipeline execution, just like the XSLT specification says what
happens during any single transformation. If your application
environment is smart enough to determine that the result of applying a
pipeline twice (or a stylesheet twice, or any other application twice)
will be indistinguishable from just handing back a cached value, I
think it's free to do so.

My point is, if the user can't tell, then who's to say which occurred?

I don't see how anything we put in the pipeline spec can help your
larger application environment "know" if the pipeline generates the
same output. I think this is especially true given that we seem to be
comfortable with *individual components* (like the WS component or
timestamp component) that can't even be guaranteed to generate the
same output from the same input *within a single pipeline*.

| 2) Is p:foo executed twice in this pipeline?
|
|     <p:pipeline>
|         <p:output ref="foo1"/>
|         <p:output ref="foo2"/>
|         <p:step name="p:foo">
|             <p:input href="foo.xml"/>
|             <p:output label="foo1"/>
|         </p:step>
|         <p:step name="p:foo">
|             <p:input href="foo.xml"/>
|             <p:output label="foo2"/>
|         </p:step>
|     </p:pipeline>
|
| I think those two questions are closely related, and I would like to
| give the same answer in both cases: a pipeline engine is free to skip
| the second execution of p:foo if it knows that the second execution
| will generate the same result and that p:foo doesn't have any
| (significant) side effect.

I think they're completely different questions. Within the context of
a single pipeline execution, we could adopt a view like XSLT that says
that documents have to be stable. In that case, the pipeline engine
would know that foo.xml in the second step is identical to foo.xml in
the first. And if it knew that the p:foo step was functional, it could
skip execution of the second step.
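
To make the single-execution case concrete, here is a minimal sketch in
Python (not pipeline syntax, and not anything the WG has agreed on). It
assumes a hypothetical PipelineRun object that pins each document the
first time it is read and remembers the results of steps that have been
flagged as functional. The names PipelineRun, run_step, and the
"functional" flag are all invented for illustration.

```python
from typing import Callable, Dict, Tuple


class PipelineRun:
    """One pipeline execution (hypothetical model, for illustration only)."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load                               # how documents are fetched
        self._docs: Dict[str, str] = {}                 # href -> pinned content
        self._results: Dict[Tuple[str, str], str] = {}  # (step, href) -> result
        self.executions = 0                             # steps actually executed

    def document(self, href: str) -> str:
        # Stability: the first read of an href fixes its content for this run.
        if href not in self._docs:
            self._docs[href] = self._load(href)
        return self._docs[href]

    def run_step(self, name: str, step: Callable[[str], str],
                 href: str, functional: bool) -> str:
        key = (name, href)
        # Only a step known to be functional may be skipped.
        if functional and key in self._results:
            return self._results[key]
        self.executions += 1
        result = step(self.document(href))
        if functional:
            self._results[key] = result
        return result


# Pipeline 2: the same functional step applied twice to foo.xml.
run = PipelineRun(load=lambda href: "<doc/>")
foo1 = run.run_step("p:foo", str.upper, "foo.xml", functional=True)
foo2 = run.run_step("p:foo", str.upper, "foo.xml", functional=True)
```

In this sketch the second call never invokes the step: the key is
already present, so the stored result is handed back. A step that is
not flagged as functional is executed every time it appears.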

In the case of running the entire pipeline twice, we don't have any
mechanism for determining if foo.xml has changed. (Again, your larger
application environment might be able to determine this and in that case
I suppose it can safely return the cached value if it knows that every
component in the pipeline is functional.)
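
As a rough illustration of what such an environment might do, the
Python sketch below fingerprints the input and returns a cached result
only when the fingerprint is unchanged and the caller asserts that
every component is functional. All of it is an assumption on my part:
CachingEnvironment, execute, and the all_functional flag are
hypothetical names, and nothing like this is proposed for the spec.

```python
import hashlib
from typing import Callable, Dict, Tuple


class CachingEnvironment:
    """Hypothetical application environment that caches whole pipeline runs."""

    def __init__(self) -> None:
        # pipeline id -> (digest of the input it last saw, output it produced)
        self._cache: Dict[str, Tuple[str, str]] = {}
        self.runs = 0  # number of times a pipeline was really executed

    def execute(self, pipeline_id: str, pipeline: Callable[[bytes], str],
                input_bytes: bytes, all_functional: bool) -> str:
        digest = hashlib.sha256(input_bytes).hexdigest()
        cached = self._cache.get(pipeline_id)
        # Safe to reuse only if the input is unchanged AND every component
        # in the pipeline is known to be functional.
        if all_functional and cached is not None and cached[0] == digest:
            return cached[1]
        self.runs += 1
        output = pipeline(input_bytes)
        if all_functional:
            self._cache[pipeline_id] = (digest, output)
        return output


def upper(data: bytes) -> str:
    # Stand-in for a pipeline whose components are all functional.
    return data.decode("utf-8").upper()


env = CachingEnvironment()
first = env.execute("pipeline-1", upper, b"<doc/>", all_functional=True)
second = env.execute("pipeline-1", upper, b"<doc/>", all_functional=True)
third = env.execute("pipeline-1", upper, b"<doc2/>", all_functional=True)
```

Here the second call is answered from the cache, while the third
re-runs the pipeline because the bytes standing in for foo.xml changed.
The knowledge that the input is unchanged lives entirely in the
environment, not in the pipeline.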

| [I am diverging from the original question, but in my mind, updating a
| database or calling a web service is a significant side effect, but
| generating an additional line in a log file or adding a
| "If-Modified-Since" header to an HTTP request is not a significant
| side effect. I am not sure if we will be able to define precisely what
| a "significant side effect" is. Most likely the decision will need to
| be taken on a component-by-component basis.]

I feel that my attempts to persuade the WG that there's value in
having components identified as being functional have so far failed and
I'm inclined to abandon it. Although it seems reasonable to me, I'd
like not to delay WG progress for it any further, if we can get
consensus to abandon it. I don't actually think that pipelines like 2
above occur very often. And if they do, and if the user really wants
to make sure that p:foo is only executed once, it can be rewritten:

     <p:pipeline>
         <p:output ref="foo1"/>
         <p:output ref="foo2"/>
         <p:step name="p:foo">
             <p:input href="foo.xml"/>
             <p:output label="tee"/>
         </p:step>
         <p:step name="p:tee">
             <p:input ref="tee"/>
             <p:output label="foo1"/>
             <p:output label="foo2"/>
         </p:step>
     </p:pipeline>

                                        Be seeing you,
                                          norm

-- 
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
Received on Thursday, 27 April 2006 10:14:36 UTC