Re: How to run unconnected steps in sequence? from David on 2010-11-01 (xproc-dev@w3.org from November 2010)

From: David <dlee@calldei.com>
Date: Mon, 01 Nov 2010 09:41:34 -0400
To: vojtech.toman@emc.com
CC: xproc-dev@w3.org
Message-ID: <4CCEC38E.5090306@calldei.com>
I don't believe this is strictly true.
A pipe creates a dependency between the start of A's output and the 
beginning of B's input.
To my knowledge it does not create a dependency requiring that A *completes 
execution* before B begins.
Nor, in fact, does it dictate the order in which they start; B simply 
waits when it starts to read data from the pipe.
At least that was my understanding the last time I read the spec (I could 
well be wrong).
Also, I know that in Calabash (when I last looked at the code) the 
processing is single-threaded, so data dependency == completion dependency, 
but I do not believe it's guaranteed to behave that way.
I believe a conforming processor could run the steps in a pipe 
asynchronously, as long as the data flowing between them is synchronized.
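
To make the distinction concrete, here is a minimal Python sketch (not 
XProc, and not how Calabash or any other processor is actually implemented). 
Two threads stand in for steps A and B and a queue stands in for the pipe. 
All names (run_overlapped, step_a, step_b, first_doc_seen) are made up for 
illustration, and the event is there only to make the interleaving 
reproducible. It shows B consuming a document while A is still running, 
which is what I mean by a data dependency without a completion dependency.

```python
import queue
import threading


def run_overlapped():
    """Run producer "a" and consumer "b" concurrently, joined by a queue."""
    pipe = queue.Queue()                 # stands in for the pipe between steps
    first_doc_seen = threading.Event()   # illustration only, see step_a
    log = []                             # list.append is atomic in CPython
    done = object()                      # end-of-stream marker

    def step_a():
        pipe.put("doc1")
        # Assumption for the demo: hold "a" open until "b" has consumed
        # something, so the overlap shows up on every run.
        first_doc_seen.wait()
        pipe.put("doc2")
        pipe.put(done)
        log.append("a finished")

    def step_b():
        log.append("b started")
        while True:
            doc = pipe.get()             # blocks until "a" has produced data
            if doc is done:
                break
            log.append("b read " + doc)
            first_doc_seen.set()
        log.append("b finished")

    b = threading.Thread(target=step_b)
    a = threading.Thread(target=step_a)
    b.start()                            # the consumer is started first
    a.start()
    a.join()
    b.join()
    return log


if __name__ == "__main__":
    for entry in run_overlapped():
        print(entry)
```

In every run "b read doc1" is logged before "a finished"; the relative order 
of the last few entries is left to the scheduler.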


As an analogy, in a Unix pipeline (and also in xmlsh)
     a | b

The process (or, in xmlsh's case, thread) "b" is (or may be, depending 
on the implementation) started first.
If b does not consume any data from its input, it may actually run to 
completion before "a" even starts.
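
A rough Python sketch of that last case, assuming a Python interpreter is 
available as sys.executable. It does by hand what a shell is allowed to do 
for "a | b": create the pipe, start b, and only start a afterwards. The 
function name and the two one-line child programs are invented for the 
example; real shells usually start both sides at about the same time.

```python
import os
import subprocess
import sys


def run_consumer_first():
    """Emulate `a | b`, letting b run to completion before a is started."""
    read_end, write_end = os.pipe()
    events = []

    # "b" never reads its stdin, so nothing makes it wait for "a".
    b = subprocess.Popen([sys.executable, "-c", "pass"], stdin=read_end)
    b.wait()
    events.append("b exited")

    # Only now start "a"; its stdout is the write end of the same pipe.
    a = subprocess.Popen(
        [sys.executable, "-c", "print('output of a')"], stdout=write_end
    )
    a.wait()
    events.append("a exited")

    # The parent kept the read end open, so a's output is still sitting
    # in the pipe, unread by b.
    os.close(write_end)
    unread = os.read(read_end, 1024)
    os.close(read_end)
    return events, b.returncode, a.returncode, unread


if __name__ == "__main__":
    print(run_consumer_first())
```

Both processes exit normally, and whatever a wrote is simply never read by b.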


But then again ... maybe I've misread the specs.



David A. Lee
dlee@calldei.com
http://www.xmlsh.org


On 11/1/2010 9:04 AM, vojtech.toman@emc.com wrote:
>
> Ad 1. The p:pipe element creates a connection – and a dependency – 
> between steps. If the step A contains a p:pipe that points to the step 
> B, it means that A must be executed *after* B. It does not matter if 
> the p:pipe is in p:input, p:with-option, p:with-param, or p:variable. 
> All p:pipe elements contained in the step contribute edges to the 
> dependency graph.
>
> Ad 2: There is no guarantee; but you don’t care in this case. The 
> p:load does not depend on p:sink being executed first. I used p:sink 
> just to introduce a side effect-free dependency on p:store. (You can 
> also use other steps than p:sink, provided they don’t introduce side 
> effects that would change the result of your pipeline.)
>
> Regards,
>
> Vojtech
>
> --
>
> Vojtech Toman
>
> Consultant Software Engineer
>
> EMC | Information Intelligence Group
>
> vojtech.toman@emc.com
>
> http://developer.emc.com/xmltech
>
> *From:*Jostein Austvik Jacobsen [mailto:josteinaj@gmail.com]
> *Sent:* Monday, November 01, 2010 1:46 PM
> *To:* Toman, Vojtech
> *Cc:* xproc-dev@w3.org
> *Subject:* Re: How to run unconnected steps in sequence?
>
> 1. Thanks. I didn't know dependencies could be introduced that way. Is 
> this similar to what happens when you have multiple p:pipes in a p:input?
>
> 2. and 3.: How am I guaranteed that p:load won't run before the p:sink?
>
> Regards
>
> Jostein
>
> 2010/11/1 <vojtech.toman@emc.com>
>
> The solution is to introduce the dependency explicitly. Here are some 
> examples (all are variations on the same theme, but some may be more 
> applicable to your use case):
>
> 1.
>
> <p:store href="file.xml" name="store"/>
> <p:load>
>   <p:with-option name="href" select="'file.xml'">
>     <p:pipe step="store" port="result"/>
>   </p:with-option>
> </p:load>
>
> 2.
>
> <p:store href="file.xml" name="store"/>
> <p:group>
>   <p:sink>
>     <p:input port="source">
>       <p:pipe step="store" port="result"/>
>     </p:input>
>   </p:sink>
>   <p:load href="file.xml"/>
> </p:group>
>
> 3.
>
> <p:group>
>   <p:store href="file.xml"/>
>   <p:identity>
>     <p:input port="source">
>       <p:empty/>
>     </p:input>
>   </p:identity>
> </p:group>
> <p:group>
>   <p:sink/>
>   <p:load href="file.xml"/>
> </p:group>
>
> Some processors also support extension attributes to control 
> dependencies between steps, but I would recommend avoiding this unless 
> absolutely necessary.
>
> Regards,
>
> Vojtech
>
> *From:* xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] 
> *On Behalf Of* Jostein Austvik Jacobsen
> *Sent:* Monday, November 01, 2010 1:01 PM
> *To:* xproc-dev@w3.org
> *Subject:* How to run unconnected steps in sequence?
>
> I remember seeing a note on this problem somewhere, but I can't find 
> it. Say I want to run these two steps in sequence:
>
> <p:store href="file.xml"/>
>
> <p:load href="file.xml"/>
>
> p:load would have to run after p:store, or the file wouldn't be there 
> yet. Since p:store has no primary output and p:load has no primary 
> input, the processor may choose the order they are run in.
>
> Is there a standard pattern for solving such issues? Something 
> general, not just for the store/load use-case?
>
> Regards
>
> Jostein Austvik Jacobsen
>
Received on Monday, 1 November 2010 13:42:11 UTC