Re: Straight-through or Other Processing?

Alex Milowski wrote:
> 
> As I see it, we have three kinds of processing models we can
> consider:
> 
> 1. Straight-through: a pipeline is a sequence of steps where each step's
>    output is chained to the next.
> 
> 2. A dependency-based model where steps have input dependencies.  A
>    target is chosen and the sequence of steps is determined by chasing
>    these inputs down.
> 
> 3. A parallel model where any step can ask for an additional model and
>    that causes another pipeline "chain" to execute--possibly in
>    parallel.
> 
> It is my belief that (1) is the simplest core and the minimum bar for
> our first specification (and hence a requirement).
> 
> (2) has been tried and found to be a hard way to think about this... but
> it works for people who like ant, make, etc.
> 
> (3) is a natural extension to (1).
> 
> In terms of implementations, smallx uses (1) and, I believe, sxpipe does
> (3).
> 

In my opinion, two major processing models exist (much as you've 
stated): (a) parallel/independent, and (b) dependency-based.

In (a) you define several pipelines that process (potentially) 
different inputs and produce different outputs. These can later be 
composed explicitly in a composition pipeline (e.g. XPL's 
xpl:pipeline), providing a feature similar to your processing model 
#1. If no composition is made explicit, each pipeline can be run 
independently. However, we must define the default processing 
behaviour: whether all pipelines are run, or whether specific 
pipelines must be named as the targets to be run. The first approach 
has some issues if the composition is defined in the same document 
(externalizing the composition should solve that).
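
To make (a) concrete, here is a rough Python-ish sketch (not a syntax 
proposal; the step and pipeline names are made up) of pipelines that 
can run on their own or be chained by an explicit composition:

    # Sketch of model (a): pipelines defined independently, then
    # composed explicitly. All names are hypothetical placeholders.

    def strip_comments(doc):          # placeholder step
        return doc.replace("<!--draft-->", "")

    def add_footer(doc):              # placeholder step
        return doc + "<footer/>"

    def run_pipeline(steps, doc):
        # Straight-through execution: each step's output feeds the next.
        for step in steps:
            doc = step(doc)
        return doc

    # Two pipelines that can each be run independently...
    clean   = [strip_comments]
    publish = [add_footer]

    # ...or composed explicitly into one chain, mirroring model #1.
    composed = clean + publish
    print(run_pipeline(composed, "<doc><!--draft--></doc>"))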

(b) can be found in Cocoon, ant, make, etc. While it is a well-known 
composition paradigm, it is not as clean and easy to read as explicit 
composition, which creates some maintenance difficulties. In this 
model you can specify the dependencies in two ways: (i) explicitly and 
(ii) implicitly. (i) behaves much like ant or make (i.e. in XProc, 
pipelines would be named, and dependencies would be declared through 
specific statements listing the pipelines each one depends on). 
(ii) derives dependencies from input/output bindings, as Cocoon does.
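
A small sketch of the two flavours (again Python-ish pseudocode, with 
made-up pipeline names and bindings), showing how implicit 
dependencies can be recovered from input/output bindings:

    # (i) Explicit dependencies, ant/make style: each named pipeline
    #     lists the pipelines it depends on.
    explicit_deps = {
        "validate":  [],
        "transform": ["validate"],
        "publish":   ["transform"],
    }

    # (ii) Implicit dependencies, Cocoon style: inferred by matching
    #      each pipeline's declared inputs against the others' outputs.
    bindings = {
        "validate":  {"in": ["source.xml"], "out": ["valid.xml"]},
        "transform": {"in": ["valid.xml"],  "out": ["result.xml"]},
        "publish":   {"in": ["result.xml"], "out": ["site.html"]},
    }

    def implicit_deps(bindings):
        producers = {o: name for name, b in bindings.items()
                     for o in b["out"]}
        return {name: [producers[i] for i in b["in"] if i in producers]
                for name, b in bindings.items()}

    print(implicit_deps(bindings))   # yields the same graph as above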

There is another issue with dependency-based models: should resolution 
be done through (x) backward chaining, or (xx) forward chaining? (x) 
can be found in ant, Cocoon, etc.: you state the final result you want 
to obtain. (xx) behaves similarly to explicit composition, except that 
no composition pipeline has to be defined: you state what has to be 
processed initially.
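
The difference is easy to see over the same toy dependency graph 
(Python-ish sketch, hypothetical names): backward chaining names the 
target and chases its dependencies; forward chaining names the 
starting point and runs everything that depends on it.

    deps = {"validate": [], "transform": ["validate"],
            "publish": ["transform"]}

    def backward(target, deps, plan=None):
        # (x) Name the final result; chase dependencies down from it.
        plan = plan if plan is not None else []
        for d in deps[target]:
            backward(d, deps, plan)
        if target not in plan:
            plan.append(target)
        return plan

    def forward(start, deps):
        # (xx) Name the initial input; run everything reachable from it.
        dependents = {n: [m for m, ds in deps.items() if n in ds]
                      for n in deps}
        plan, frontier = [], [start]
        while frontier:
            n = frontier.pop(0)
            if n not in plan:
                plan.append(n)
                frontier.extend(dependents[n])
        return plan

    print(backward("publish", deps))   # ['validate', 'transform', 'publish']
    print(forward("validate", deps))   # ['validate', 'transform', 'publish']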

I believe we must reach an agreement on interactions between 
pipelines. All these proposals have benefits, as well as drawbacks 
that must be taken into account. To summarize, we must consider the 
following issues:

1. Should composition be supported by the model or should it be an 
implementation issue? (I bet on the former.)

2. If composition is supported by the model, what type of mechanism 
should be allowed?

   a) Parallel/independent + explicit composition
   b) Explicit dependencies
   c) Implicit dependencies
   d) All of the above
   e) None of the above (I hope not!)

3. If dependency-based is chosen as the composition model, how should it 
be defined?

   a) Backward chains
   b) Forward chains
   c) Implementation defined


Cheers,

Rui

Received on Monday, 16 January 2006 10:06:05 UTC