Flows and Pipes - a simplification? from Alex Milowski on 2006-06-07 (public-xml-processing-model-wg@w3.org from June 2006)

From: Alex Milowski <alex@milowski.org>
Date: Wed, 07 Jun 2006 16:48:56 -0700
To: public-xml-processing-model-wg <public-xml-processing-model-wg@w3.org>
Message-ID: <448765E8.6050402@milowski.org>

I've been struggling with the idea that there is a large jump in
complexity between a single straight-through pipeline and those that
have multiple flows, inputs, outputs, etc.

I wonder if there is a simplification that could be really useful in
distinguishing between chains of input/output processing and flows
(graphs) of steps with multiple inputs and outputs, conditions on them,
collections, etc.

If you look at our requirements document, 26 of the 32 use cases can be
considered straight-through pipes where there is some kind of explicit
input that drives the production of the output of each step.  In turn,
this means that evaluating XPath expressions upon such input is very
simple.

For example, Henry's example is use case 5.3 and has an implicit
driving input and output of each step:

<pipe>
  <step process="xsdValidate">
   <input name="schemaDoc" href="my.xsd"/>
  </step>
  <step process="xslt1.0">
   <input name="stylesheet" href="my.xsl"/>
  </step>
</pipe>

We could just consider actually making pipes, well, pipes, in that
they are a sequence of steps with primary inputs and outputs. Then
Henry's second example, the one with conditions, has a natural context
in which the expressions are evaluated: the primary input.
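
A rough way to picture this, sketched in Python purely for illustration:
a pipe is a list of steps, each step takes the primary input and returns
the primary output, and a condition is evaluated against whatever the
primary input is at that point. The names run_pipe, when, mark_validated
and add_toc are made up for this sketch and are not part of any proposal;
the toy steps stand in for real validation and XSLT, and ElementTree's
limited XPath subset stands in for a real XPath engine.

```python
import xml.etree.ElementTree as ET

def run_pipe(steps, primary_input):
    """Feed the primary input through each step in order."""
    doc = primary_input
    for step in steps:
        doc = step(doc)
    return doc

def when(test, step):
    """Hypothetical conditional step: `test` is evaluated against the
    primary input, so no separate context binding is needed."""
    def conditional(doc):
        # findall returns a list of matches; an empty list means "false".
        return step(doc) if doc.findall(test) else doc
    return conditional

def mark_validated(doc):
    # Stand-in for a validation step (assumption, not real xsdValidate).
    doc.set("validated", "true")
    return doc

def add_toc(doc):
    # Stand-in for a transformation step (assumption, not real XSLT).
    toc = ET.Element("toc")
    toc.set("entries", str(len(doc.findall("./section"))))
    doc.insert(0, toc)
    return doc

pipe = [mark_validated, when("./section", add_toc)]
result = run_pipe(pipe, ET.fromstring("<doc><section/><section/></doc>"))
print(ET.tostring(result, encoding="unicode"))
```

The condition never has to name the document it applies to, which is
the point of having a primary input.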


For more complex processing, a "flow" can be an orchestration of these 
pipes where inputs and outputs, conditions, etc. can be bound between them.

The good news here is that the simplest cases (those 26 of 32 use cases)
become very simple to write.  They also have the advantage of being
very reusable.


For example, if I want to apply a sequence of operations to
a collection of documents, I can re-use the pipe:

<for-each input="$my-collection">
<pipe>
  <step process="xsdValidate">
   <input name="schemaDoc" href="my.xsd"/>
  </step>
  <step process="xslt1.0">
   <input name="stylesheet" href="my.xsl"/>
  </step>
</pipe>
</for-each>

The output of this would be a collection.
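
In the same spirit, a minimal Python sketch of that re-use: the pipe is
defined once and an iterating construct applies it to each document,
yielding a collection. The function names for_each and run_pipe are
hypothetical, and the string steps are toy stand-ins for validation and
transformation.

```python
def run_pipe(steps, primary_input):
    """Feed the primary input through each step in order."""
    doc = primary_input
    for step in steps:
        doc = step(doc)
    return doc

def for_each(collection, steps):
    """Apply the same pipe to every document; the result is a collection."""
    return [run_pipe(steps, doc) for doc in collection]

# Toy steps operating on strings instead of XML documents (assumption).
pipe = [str.strip, str.upper]
outputs = for_each(["  a ", " b"], pipe)
print(outputs)  # ['A', 'B']
```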

Similarly, element iteration naturally fits as a step that orchestrates
a pipe:

<for-each element select="/doc/section">
<pipe>
<step process="xslt1.0">
   <input name="stylesheet" href="my.xsl"/>
</step>
</pipe>
</for-each>
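
A small Python sketch of the same idea, under stated assumptions: each
selected element becomes the primary input of one run of the pipe. What
happens to the per-element results (collected, or spliced back into the
document) is not decided above; this sketch simply collects them. The
names for_each_element and title_of are hypothetical.

```python
import xml.etree.ElementTree as ET

def for_each_element(doc, select, steps):
    """Run the pipe once per selected element; each element is the
    primary input of its own run."""
    results = []
    for element in doc.findall(select):
        out = element
        for step in steps:
            out = step(out)
        results.append(out)
    return results

def title_of(section):
    # Stand-in for a stylesheet step (assumption): extract the title text.
    return section.findtext("title", default="")

doc = ET.fromstring(
    "<doc><section><title>One</title></section>"
    "<section><title>Two</title></section></doc>"
)
# /doc/section in the example corresponds to ./section relative to the
# parsed root element in ElementTree's XPath subset.
titles = for_each_element(doc, "./section", [title_of, str.upper])
print(titles)  # ['ONE', 'TWO']
```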

The result is that the complexity of sequences of documents, multiple
inputs and outputs, and conditions across multiple inputs is all
confined to "flows" and kept out of "pipes".

In the end, the simple cases are simple.

--Alex Milowski

Received on Wednesday, 7 June 2006 23:49:12 UTC