Re: p:pipeline from Jeni Tennison on 2006-07-21 (public-xml-processing-model-wg@w3.org from July 2006)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Fri, 21 Jul 2006 14:38:49 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <44C0D8E9.7060107@jenitennison.com>

Hi,

Rui Lopes wrote:
> Norman Walsh wrote:
>> True. But requiring all pipelines to be in external files strikes me
>> as analogous to requiring all xsl:templates to be in separate files.
> 
> I've been thinking lately about the issue of including and reusing 
> pipelines. In this e-mail I'll be using the expressions "main pipeline" 
> and "called pipeline" to avoid potentially ambiguous interpretations. I 
> believe that we should be aware of three ways of reusing pipelines:
> 
> 1) inline specification in the "main pipeline" document: useful when we 
> don't want to repeat a sequence of steps/processing logic inside our 
> "main pipeline" document. This is something like defining a function in 
> some programming language;

I'd say it was like defining a *local* function in some programming 
language (i.e. a function within a function).

> 2) include a "called pipeline" into a "main pipeline": useful when 
> creating more complex processing applications, whether focusing on 
> modularization for defining "called pipeline" libraries, or just to ease 
> management and maintenance of bigger pipeline-based projects. This is 
> similar to using a C preprocessor or an XSLT include directive;
> 
> 3) use an external "called pipeline": useful when some pipeline logic is 
> executed elsewhere, whether for resource-intensive computations or using 
> a pipeline service provided by someone else. This is akin to calling a 
> web-service.

I don't think we need to worry about 1) if we have support for 2): it's 
always an option for users to write a separate 'called pipeline' if they 
want to repeat processing logic inside the main pipeline.

FWIW, this is my favoured design at the moment:

1. All components are identified through a QName

2. Pipelines are components (and are thus identified by a QName, and 
called using the <p:step> syntax)

3. Pipelines are all defined at the same level (no nested pipelines), 
with a <p:pipelines> wrapper (if necessary; if a document only defines 
one pipeline, then it shouldn't be necessary)

4. We have a <p:import> to import pipeline definitions kept in separate 
physical files

5. Pipeline invocation uses a component library, a component name, and a 
set of inputs and parameters. The component library includes the 
built-in components, pipeline components defined in XProc files, and 
implementation-defined components (which might be web services etc.)
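
As a rough illustration of how points 1-5 fit together, here is a minimal Python sketch. It is only a model under stated assumptions, not proposed syntax or a real implementation: the Library and Step classes, the example namespaces and the "upper" and "wrap" components are all hypothetical, QNames are modelled as (namespace, local name) pairs, and plain strings stand in for XML documents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a QName is modelled as (namespace URI, local name).
QName = Tuple[str, str]
# Assumption: a component maps (input document, parameters) to an output document.
Component = Callable[[str, Dict[str, str]], str]


@dataclass
class Step:
    """Stand-in for a <p:step> call: a component name plus parameters."""
    name: QName
    params: Dict[str, str] = field(default_factory=dict)


class Library:
    """Hypothetical component library: every component is found by QName (point 1)."""

    def __init__(self) -> None:
        self._components: Dict[QName, Component] = {}

    def register(self, name: QName, component: Component) -> None:
        # Built-in and implementation-defined components are registered the same way.
        self._components[name] = component

    def define_pipeline(self, name: QName, steps: List[Step]) -> None:
        # Points 2 and 3: a pipeline is itself a component, defined at the top
        # level of the library and registered under its own QName.
        def run(doc: str, params: Dict[str, str]) -> str:
            for step in steps:
                doc = self.invoke(step.name, doc, step.params)
            return doc
        self.register(name, run)

    def import_library(self, other: "Library") -> None:
        # Point 4: stand-in for <p:import>, merging definitions kept elsewhere.
        self._components.update(other._components)

    def invoke(self, name: QName, doc: str, params: Dict[str, str]) -> str:
        # Point 5: invocation needs a library, a component name, inputs and parameters.
        return self._components[name](doc, params)


P = "http://example.org/builtin"  # hypothetical namespace for built-ins
MY = "http://example.org/my"      # hypothetical namespace for user pipelines

builtins = Library()
builtins.register((P, "upper"), lambda doc, params: doc.upper())
builtins.register((P, "wrap"), lambda doc, params: f"<{params['tag']}>{doc}</{params['tag']}>")

# A "called pipeline" kept in a separate library (think: a separate file).
shared = Library()
shared.import_library(builtins)
shared.define_pipeline((MY, "shout"), [Step((P, "upper")), Step((P, "wrap"), {"tag": "b"})])

# The "main pipeline" imports it and calls it like any other step.
main = Library()
main.import_library(shared)
main.define_pipeline((MY, "main"), [Step((MY, "shout")), Step((P, "wrap"), {"tag": "doc"})])

result = main.invoke((MY, "main"), "hello", {})
print(result)
```

In this sketch the caller of (MY, "shout") cannot tell whether it is a built-in, a pipeline from an imported file, or something implementation-defined, which is the reason for identifying everything through a QName.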

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Friday, 21 July 2006 13:38:59 UTC