Naming steps or naming outputs from Alessandro Vernet on 2006-06-29 (public-xml-processing-model-wg@w3.org from June 2006)

From: Alessandro Vernet <avernet@orbeon.com>
Date: Thu, 29 Jun 2006 01:53:22 -0700
To: public-xml-processing-model-wg <public-xml-processing-model-wg@w3.org>
Message-ID: <4828ceec0606290153u55c2c2d2vfd88b21e7dbf6628@mail.gmail.com>

I think we are on the right track with Richard's proposal for a
conditional construct [1]. But looking closely at the syntax, the
question of whether we should name steps or outputs came back to mind.
Essentially the question is whether we should:

1) Name outputs:

    <!-- Step -->
    <step type="xslt">
        <with-input name="stylesheet" uri="http://example.org/foo.xsl"/>
        <with-input name="source" from="source"/>
        <with-output name="result" name="result"/>
    </step>

    <!-- Reference to step output -->
    <output name="result" from="result"/>

2) Name steps:

    <!-- Step -->
    <step name="ss" type="xslt">
        <with-input name="stylesheet" uri="http://example.org/foo.xsl"/>
        <with-input name="source" from="source"/>
    </step>

    <!-- Reference to step output -->
    <output name="result" from="ss.result"/>

I had a slight preference for naming steps for a while, thinking that
this way we don't need to declare outputs, which makes the syntax more
lightweight. However, I now think that it would be better to name
outputs instead of steps, for the following reasons:

a) In my experience, for some pipelines it is more natural to name
steps, while for others it is more natural to name outputs. It depends
on the "type" of pipeline. The ones where it is more natural to name
steps are the linear ones. For the other cases it is more natural to
name outputs.

Say you read a document, validate it, transform it, and save it. It
makes more sense to give names to the steps (read, validate, transform)
rather than the outputs of the steps (document-read,
document-validated, document-transformed).

With the pipe construct, we wouldn't have to name anything in those
linear cases. So we are left with the other cases, where in my
experience it is more natural to name outputs.
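
To make the linear case concrete, here is a minimal sketch of the
read/validate/transform/save pipeline written with named steps, reusing
the syntax of option 2. The <pipeline> wrapper, the "read", "validate"
and "save" step types, the doc.xml URI, and the assumption that every
step exposes an output called "result" are illustrative only and not
part of any proposal.

```xml
<pipeline>
    <!-- Hypothetical step types; only "xslt" appears in the examples above -->
    <step name="read" type="read">
        <with-input name="source" uri="http://example.org/doc.xml"/>
    </step>

    <step name="validate" type="validate">
        <with-input name="source" from="read.result"/>
    </step>

    <step name="transform" type="xslt">
        <with-input name="stylesheet" uri="http://example.org/foo.xsl"/>
        <with-input name="source" from="validate.result"/>
    </step>

    <!-- Last step: nothing refers to it, so it needs no name -->
    <step type="save">
        <with-input name="source" from="transform.result"/>
    </step>
</pipeline>
```

With named outputs, the same pipeline would need the labels
document-read, document-validated and document-transformed instead,
which read less naturally here.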

b) By naming outputs, we avoid a constructed name
(stepName.outputName). We also avoid having constructed names and
simple names side by side, depending on whether the reference is to a
step output or a pipeline input. By naming outputs, we would always
have simple names.
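
As a rough sketch of the mix described in (b), this is what the two
kinds of reference look like side by side when steps are named. The
<pipeline> wrapper, the <input> declaration, and the "check" step with
its "validate" type are assumptions made for the example.

```xml
<pipeline>
    <!-- Pipeline input, referenced below by a simple name -->
    <input name="source"/>

    <step name="ss" type="xslt">
        <with-input name="stylesheet" uri="http://example.org/foo.xsl"/>
        <!-- Simple name: refers to the pipeline input -->
        <with-input name="source" from="source"/>
    </step>

    <step name="check" type="validate">
        <!-- Constructed name: stepName.outputName -->
        <with-input name="source" from="ss.result"/>
    </step>

    <output name="result" from="check.result"/>
</pipeline>
```

A reader has to know that "source" is looked up among the pipeline
inputs while "ss.result" is split on the dot and looked up among the
steps. With named outputs, every "from" would be a plain name.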

c) Having a <with-output> gives us the option of adding more
attributes on inputs/outputs in the future. For instance, we could
reference a schema there, to type an input/output. Or we could give an
indication that we want the document to be logged, maybe for debugging:

    <with-input name="source" from="source" schema-uri="..." trace="message"/>
    <with-output name="result" name="result" schema-uri="..." trace="message"/>

d) Having both <with-input> and <with-output> is more consistent and
more explicit. It is easier for someone who reads the pipeline to
understand what the inputs/outputs of each step are.

Am I missing something?

Alex

[1] http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2006Jun/0029.html
-- 
Blog (XML, Web apps, Open Source):
http://www.orbeon.com/blog/

Received on Thursday, 29 June 2006 09:00:53 UTC