Tuning XProc pipelines

One thing the spec doesn't currently say much about is deployment
options for XProc pipelines, and the "tuning" thereof.  There is the text

----
Unless otherwise indicated, implementations must not assume that steps are  
functional (that is, that their outputs depend only on their explicit  
inputs, options, and parameters) or side-effect free.
----

but it seems to me (based on what some implementers have written publicly)  
that some implementations will be designed to run pipeline steps in an  
overlapping, semi-parallel fashion so that they can stream information  
from one step to the next and reduce the end-to-end processing time for
the pipeline.

However, there are cases where a particular step can't be overlapped with
other steps.  One reason is side effects that affect preceding or
following steps (as noted in the spec).  Another reason is that a
particular step may require so much memory on a particular system (given
its particular input documents/data) that it cannot be executed in
parallel with other steps.  In that circumstance, you would also
typically want to write the intermediate XML from each step to disk and
read it back again for the next step.
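
To make the two execution modes concrete, here is a rough Python sketch.
It is not XProc and not taken from any implementation: the names
run_streamed, run_isolated and make_step are hypothetical, steps are
modelled as plain generator functions, and short strings stand in for
XML documents.  The trace list only records the order in which steps
touch items.

```python
import os
import tempfile
from typing import Callable, Iterable, Iterator, List

# A "step" here is just a function from a stream of items to a stream of items.
Step = Callable[[Iterable[str]], Iterator[str]]


def make_step(name: str, trace: List[str]) -> Step:
    """Build a pass-through step that records each item it handles."""
    def step(items: Iterable[str]) -> Iterator[str]:
        for item in items:
            trace.append(f"{name}:{item}")
            yield item
    return step


def run_streamed(steps: List[Step], source: Iterable[str]) -> List[str]:
    """Overlapped mode: chain the steps lazily, so each item passes
    through every step before the next item is read."""
    stream: Iterable[str] = iter(source)
    for step in steps:
        stream = step(stream)
    return list(stream)


def run_isolated(steps: List[Step], source: Iterable[str]) -> List[str]:
    """Non-overlapped mode: run each step to completion and write its
    output to disk before the next step starts."""
    with tempfile.TemporaryDirectory() as workdir:
        current = os.path.join(workdir, "stage0.txt")
        with open(current, "w", encoding="utf-8") as out:
            out.writelines(item + "\n" for item in source)
        for n, step in enumerate(steps, start=1):
            nxt = os.path.join(workdir, f"stage{n}.txt")
            with open(current, encoding="utf-8") as src, \
                    open(nxt, "w", encoding="utf-8") as dst:
                results = step(line.rstrip("\n") for line in src)
                dst.writelines(item + "\n" for item in results)
            os.remove(current)  # only one intermediate file is kept at a time
            current = nxt
        with open(current, encoding="utf-8") as src:
            return [line.rstrip("\n") for line in src]


if __name__ == "__main__":
    docs = ["doc1", "doc2"]

    trace: List[str] = []
    run_streamed([make_step("a", trace), make_step("b", trace)], docs)
    print(trace)

    trace = []
    run_isolated([make_step("a", trace), make_step("b", trace)], docs)
    print(trace)
```

In the streamed run the two steps interleave item by item, whereas in
the isolated run step "a" finishes all of its input before step "b"
starts.  Which mode is appropriate for a given step is exactly the kind
of deployment choice I am asking about.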

Any thoughts on how such deployment "tuning" should be managed?  Is it  
something that the specification might support directly?  Is it something  
that might be supported by a separate deployment binding specification  
that allows those deployment options to be specified independently of the  
functional aspects of the pipeline?  Or is it something that will be left  
to implementations to resolve in a proprietary fashion?

Thanks, Cheers, Tony.
-- 
Anthony B. Coates
Senior Partner
Miley Watts LLP
Experts In Data
+44 (79) 0543 9026
Data standards participant: genericode, ISO 20022 (ISO 15022 XML),  
UN/CEFACT, MDDL, FpML, UBL.
http://www.mileywatts.com/

Received on Thursday, 26 April 2007 15:02:07 UTC