URIs as inputs and outputs from Jeni Tennison on 2006-04-05 (public-xml-processing-model-wg@w3.org from April 2006)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Wed, 05 Apr 2006 20:50:32 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <44341F88.702@jenitennison.com>

Hi,

I've been toying with an idea and I'd like to see whether others think 
it's worth pursuing.

The crux is that instead of passing *documents* between steps, we pass 
*URIs*. The pipeline processor acts as a resource manager/entity 
resolver. Whenever a component wants to read a document, it requests it, 
using the URI, from the pipeline processor. Whenever a component wants 
to write a document, it registers it with a particular URI with the 
pipeline processor (and that URI can be supplied to the component when 
the step is defined).

For example, something like:

   <p:step use="xslt2.0">
     <p:input name="source" href="a.xml" />
     <p:input name="stylesheet" href="b.xsl" />
     <p:output name="result" href="c.xml" />
   </p:step>

means that the XSLT processor is passed the URI 'a.xml' associated with 
the name 'source', 'b.xsl' associated with the name 'stylesheet' and 
'c.xml' associated with the name 'result'. The XSLT processor requests 
the documents associated with 'a.xml' and 'b.xsl' from the pipeline 
processor. When it's done, the XSLT processor tells the pipeline 
processor to associate the resulting document with the URI 'c.xml'.
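
A rough sketch of that flow in Python, just to make the idea concrete. 
Everything here is made up for illustration: the PipelineProcessor class, 
its request/register methods, and xslt_step, which only pretends to 
transform (documents are plain strings).

```python
class PipelineProcessor:
    """Toy resource manager: maps URIs to documents (plain strings here)."""

    def __init__(self, external):
        # 'external' stands in for documents that exist outside the pipeline.
        self._external = dict(external)
        self._registered = {}

    def request(self, uri):
        """A component asks for the document behind a URI."""
        if uri in self._registered:
            return self._registered[uri]
        return self._external[uri]

    def register(self, uri, document):
        """A component hands back a document under the given URI."""
        self._registered[uri] = document


def xslt_step(processor, ports):
    """Stand-in for the XSLT component; it only receives URIs, keyed by port name."""
    source = processor.request(ports["source"])
    stylesheet = processor.request(ports["stylesheet"])
    # Not a real transformation: just combine the two so the flow is visible.
    result = f"transform({source}, {stylesheet})"
    processor.register(ports["result"], result)


processor = PipelineProcessor({"a.xml": "<doc/>", "b.xsl": "<xsl:stylesheet/>"})
xslt_step(processor, {"source": "a.xml", "stylesheet": "b.xsl", "result": "c.xml"})
print(processor.request("c.xml"))  # transform(<doc/>, <xsl:stylesheet/>)
```

The component never touches a file or a document object directly; it 
only sees the three URIs and the processor.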

If a URI is requested and that URI is explicitly specified as an output 
of a step, then the returned document must be the same as that generated 
by the component. It's an error if creating that document entails 
requesting that URI (i.e. you can't have circular pipelines).
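
One way this could work is backward chaining: requesting an output URI 
runs the step that produces it, and the processor keeps track of which 
URIs are currently being generated so it can spot a cycle. The sketch 
below assumes that approach; the class, add_step and the error type are 
hypothetical, and a step is simplified to a function that returns its 
one output document.

```python
class CircularPipelineError(Exception):
    """Raised when generating a URI ends up requesting that same URI."""


class PipelineProcessor:
    def __init__(self, external):
        self._external = dict(external)  # documents from outside the pipeline
        self._producers = {}             # output URI -> step function
        self._documents = {}             # output URI -> generated document
        self._in_progress = []           # URIs currently being generated

    def add_step(self, output_uri, step):
        self._producers[output_uri] = step

    def request(self, uri):
        if uri in self._documents:
            return self._documents[uri]
        if uri not in self._producers:
            return self._external[uri]
        if uri in self._in_progress:
            chain = " -> ".join(self._in_progress + [uri])
            raise CircularPipelineError(chain)
        self._in_progress.append(uri)
        try:
            # The step may itself call request(), which is what chains backwards.
            self._documents[uri] = self._producers[uri](self)
        finally:
            self._in_progress.pop()
        return self._documents[uri]


p = PipelineProcessor({"a.xml": "A"})
p.add_step("b.xml", lambda proc: proc.request("a.xml") + "+b")
p.add_step("c.xml", lambda proc: proc.request("b.xml") + "+c")
print(p.request("c.xml"))  # A+b+c

loop = PipelineProcessor({})
loop.add_step("x.xml", lambda proc: proc.request("y.xml"))
loop.add_step("y.xml", lambda proc: proc.request("x.xml"))
try:
    loop.request("x.xml")
except CircularPipelineError as err:
    print("circular:", err)  # circular: x.xml -> y.xml -> x.xml
```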

Within a single execution, a pipeline processor will always deliver the 
same document for a given URI: it's stable for the duration of the 
pipeline. Also, only one document can be registered with a particular 
URI during a single execution, so you don't get documents overwritten.
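
Those two guarantees could be enforced with a cache plus a write-once 
check, roughly as below. StableStore, the fetch callable and the error 
type are invented for the sketch. It also assumes that registering a URI 
that has already been read counts as an error, since otherwise the 
document behind that URI would change mid-run.

```python
import itertools


class DuplicateRegistrationError(Exception):
    """Raised when a URI that already has a document is registered again."""


class StableStore:
    def __init__(self, fetch):
        self._fetch = fetch    # hypothetical callable: URI -> document
        self._documents = {}   # everything seen or registered in this execution

    def request(self, uri):
        # The first request fetches and caches; later ones return the cached copy.
        if uri not in self._documents:
            self._documents[uri] = self._fetch(uri)
        return self._documents[uri]

    def register(self, uri, document):
        if uri in self._documents:
            raise DuplicateRegistrationError(uri)
        self._documents[uri] = document


# A deliberately unstable source: it would return a new version on every fetch.
counter = itertools.count(1)
store = StableStore(lambda uri: f"{uri} v{next(counter)}")

first = store.request("a.xml")
second = store.request("a.xml")
print(first == second)  # True

store.register("c.xml", "<result/>")
try:
    store.register("c.xml", "<other/>")
except DuplicateRegistrationError as err:
    print("rejected:", err)  # rejected: c.xml
```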

Using URIs rather than documents unifies the treatment of inputs that 
are passed explicitly to the component and inputs that it uses 
internally (e.g. through the doc() function in XSLT or XQuery or 
referenced within the document being processed), which means that those 
inputs can also be the results of other steps in the pipeline.
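
To illustrate the unification: in the sketch below a toy component gets 
its declared input and a reference buried inside that input through the 
same lookup function, so the referenced document can just as well be the 
output of an earlier step. The doc('...') marker is only a textual 
stand-in for the real doc() function, and make_resolver and 
expand_includes are hypothetical names.

```python
import re


def make_resolver(step_outputs, external):
    """Return one lookup function used for declared inputs and doc() references."""
    def resolve(uri):
        if uri in step_outputs:
            return step_outputs[uri]
        return external[uri]
    return resolve


def expand_includes(source_uri, resolve):
    """Toy component: replaces each doc('...') marker with the referenced document."""
    text = resolve(source_uri)  # explicit input
    return re.sub(
        r"doc\('([^']+)'\)",
        lambda match: resolve(match.group(1)),  # implicit input, same lookup
        text,
    )


step_outputs = {"toc.xml": "<toc/>"}  # pretend an earlier step produced this
external = {"book.xml": "<book>doc('toc.xml')</book>"}
resolve = make_resolver(step_outputs, external)
print(expand_includes("book.xml", resolve))  # <book><toc/></book>
```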

Having URIs for all the intermediate documents means that it's easy to 
look at the intermediate results, which is handy for debugging.

I have a feeling that there's a glaring problem with this that I'm 
missing. Perhaps it rests too heavily on a backward-chaining assumption. 
Any thoughts?

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Wednesday, 5 April 2006 19:50:41 UTC