Language construct syntax from Jeni Tennison on 2006-08-16 (public-xml-processing-model-wg@w3.org from August 2006)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Wed, 16 Aug 2006 15:12:10 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <44E327BA.8000505@jenitennison.com>

I've tried to find a different syntax that's regular across all the 
language constructs, but all I could come up with was what I argued for 
at the F2F. So I'll just repeat my rationale, particularly for the 
benefit of those people who weren't there.

For-each, choose, viewport (and group) are all special because if we
tried to describe them as components, we would need to pass in pipelines
as one or more of the inputs. I'm going to use pipeline="yes" to 
identify such inputs in the following discussion.

If we were to define them as components, this is what they'd look like.

<component name="group">
   <input name="do" pipeline="yes" />
   <input name="*" />
   <output name="*" />
</component>

<component name="fork">
   <input name="context" sequence="no" />
   <param name="test" required="yes" />
   <input name="do-true" pipeline="yes" />
   <input name="do-false" pipeline="yes" />
   <input name="*" />
   <output name="*" />
</component>

<component name="for-each">
   <input name="documents" sequence="yes" />
   <input name="do" pipeline="yes" />
   <input name="*" />
   <output name="*" />
</component>

<component name="viewport">
   <input name="document" sequence="no" />
   <param name="subtrees" required="yes" />
   <input name="do" pipeline="yes" />
   <input name="*" />
   <output name="result" sequence="no" />
</component>

Note that I've defined a 'fork' component rather than a 'choose': 
<choose> is actually a number of nested forks, and I'll address that later.

Also note that the 'viewport' component only has a single output (the 
original document with the selected subtrees replaced by the result of 
processing those subtrees). It's not clear to me what happens if 
processing the selected subtrees results in more than one output. Are 
those outputs collated in the same way as the outputs from the for-each 
inner pipeline, or are they thrown away?

If these were called as normal components are, this is what the calls 
would look like. Note that I've left out the <pipe>s for the extra 
inputs: we've decided to use lexical scoping, so they're not going to be 
necessary.

<step kind="group">
   <pipe to="do">
     <pipeline>
       ... input/output/param declarations ...
       ... steps ...
     </pipeline>
   </pipe>
</step>

<step kind="fork">
   <pipe to="context" from="pipe!document" />
   <param name="test" value="//mytest" />
   <pipe to="do-true">
     <pipeline>
       ... input/output/param declarations ...
       ... steps ...
     </pipeline>
   </pipe>
   <pipe to="do-false">
     <pipeline>
       ... input/output/param declarations ...
       ... steps ...
     </pipeline>
   </pipe>
</step>

<step kind="for-each">
   <pipe to="documents" from="pipe!documents" select="//chap" />
   <pipe to="do">
     <pipeline>
       <input name="chapter" primary="yes" sequence="no" />
       ... other input/output/param declarations ...
       ... steps ...
     </pipeline>
   </pipe>
</step>

<step kind="viewport">
   <pipe to="document" from="pipe!document"
     select="/doc/xsl:stylesheet" />
   <param name="subtrees" value="/xsl:stylesheet/xsl:template" />
   <pipe to="do">
     <pipeline>
       <input name="template" primary="yes" sequence="no" />
       <output name="result" primary="yes" sequence="no">
         <pipe from="inner-step!result" />
       </output>
       ... other input/output/param declarations ...
       ... steps ...
     </pipeline>
   </pipe>
</step>

With both for-each and viewport, the definition of the component would 
have to state that the first/primary input declared for the 'do' 
pipeline is automatically bound to the individual document selected for 
processing. For for-each, the outputs of the 'do' pipeline become the 
outputs of the for-each. For viewport, the first/primary *output* 
declared for the 'do' pipeline is what gets inserted in place within the 
original document.
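
To make the binding rules above concrete, here is a rough Python sketch of 
the semantics, not a proposal for how an implementation must work. It 
assumes the 'do' pipeline can be modelled as a plain function from one 
document to one result, that 'subtrees' is an ElementTree path rather than 
full XPath, and that matches are not nested inside one another. The names 
for_each, viewport and do are just the ones from this message; the sample 
document and the 'done' element are made up for illustration.

```python
import copy
import xml.etree.ElementTree as ET


def for_each(documents, do):
    """Run the 'do' pipeline once per document; collate outputs in order."""
    return [do(doc) for doc in documents]


def viewport(document, subtrees, do):
    """Return a copy of `document` with each matched subtree replaced.

    Each element matched by `subtrees` is bound to the primary input of
    `do`; the primary output of `do` is inserted in its place.
    """
    result = copy.deepcopy(document)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in result.iter() for child in parent}
    for match in result.findall(subtrees):
        parent = parent_of[match]
        index = list(parent).index(match)
        replacement = do(match)
        replacement.tail = match.tail  # keep surrounding whitespace
        parent[index] = replacement
    return result


# Hypothetical sample data, purely for illustration.
doc = ET.fromstring(
    "<stylesheet><template name='a'/><other/><template name='b'/></stylesheet>"
)


def mark_done(template):
    # Stand-in for an inner pipeline: emits one new element per input.
    return ET.Element("done", {"was": template.get("name")})


viewed = viewport(doc, "template", mark_done)
names = for_each(doc.findall("template"), lambda t: t.get("name"))
print(ET.tostring(viewed, encoding="unicode"))
print(names)
```

The sketch sidesteps the open question about multiple outputs from the 
viewport's inner pipeline by allowing exactly one result per subtree.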

Now we perform the following transformation on the <step> calls:

1. the step kind becomes the element name
2. the name of the input/param becomes an element name
3. unnecessary <pipeline> elements are removed

This gives us:

<group>
   <do>
     ... input/output/param declarations ...
     ... steps ...
   </do>
</group>

<fork>
   <context from="pipe!document" />
   <test value="//mytest" />
   <do-true>
     ... input/output/param declarations ...
     ... steps ...
   </do-true>
   <do-false>
     ... input/output/param declarations ...
     ... steps ...
   </do-false>
</fork>

<for-each>
   <documents from="pipe!documents" select="//chap" />
   <do>
     <input name="chapter" primary="yes" sequence="no" />
     ... other input/output/param declarations ...
     ... steps ...
   </do>
</for-each>

<viewport>
   <document from="pipe!document" select="/doc/xsl:stylesheet" />
   <subtrees value="/xsl:stylesheet/xsl:template" />
   <do>
     <input name="template" primary="yes" sequence="no" />
     <output name="result" primary="yes" sequence="no">
       <pipe from="inner-step!result" />
     </output>
     ... other input/output/param declarations ...
     ... steps ...
   </do>
</viewport>
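
Steps 1 to 3 are mechanical enough to sketch in Python with the standard 
xml.etree.ElementTree module. This is only an illustration under some 
assumptions: the helper name desugar_step is mine, the inner steps 
(kind="identity" and kind="delete") are hypothetical placeholders for the 
elided "... steps ...", and nested steps inside the pipelines are not 
rewritten recursively.

```python
import xml.etree.ElementTree as ET


def desugar_step(step):
    """Apply rewrite steps 1-3 to a <step> element; return the new element."""
    # 1. the step kind becomes the element name
    construct = ET.Element(step.get("kind"))
    for child in step:
        # 2. the name of the input/param becomes an element name
        #    (<pipe to="..."> for inputs, <param name="..."> for parameters)
        name = child.get("to") if child.tag == "pipe" else child.get("name")
        named = ET.SubElement(construct, name)
        for key, value in child.attrib.items():
            if key not in ("to", "name"):
                named.set(key, value)
        # 3. unnecessary <pipeline> wrappers are removed; their content moves up
        for inner in child:
            if inner.tag == "pipeline":
                named.extend(list(inner))
            else:
                named.append(inner)
    return construct


# The inner steps below are hypothetical stand-ins for "... steps ...".
step = ET.fromstring("""
<step kind="fork">
  <pipe to="context" from="pipe!document" />
  <param name="test" value="//mytest" />
  <pipe to="do-true"><pipeline><step kind="identity" /></pipeline></pipe>
  <pipe to="do-false"><pipeline><step kind="delete" /></pipeline></pipe>
</step>
""")
fork = desugar_step(step)
print(ET.tostring(fork, encoding="unicode"))
```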

I'm moderately happy with this syntax. It's not particularly concise, 
but it's fairly clear. I'd be even happier if we continued the 
transformation with the following steps:

4. parameters are turned into attributes on the parent element
5. the attributes on the non-pipeline inputs are moved to the parent 
element (and the input element deleted)
6. where a single <do> element is left, the <do> element is removed

This would give us:

<group>
   ... input/output/param declarations ...
   ... steps ...
</group>

<fork from="pipe!document" test="//mytest">
   <do-true>
     ... input/output/param declarations ...
     ... steps ...
   </do-true>
   <do-false>
     ... input/output/param declarations ...
     ... steps ...
   </do-false>
</fork>

<for-each from="pipe!documents" select="//chap">
   <input name="chapter" primary="yes" sequence="no" />
   ... other input/output/param declarations ...
   ... steps ...
</for-each>

<viewport from="pipe!document" select="/doc/xsl:stylesheet"
           subtrees="/xsl:stylesheet/xsl:template">
   <input name="template" primary="yes" sequence="no" />
   <output name="result" primary="yes" sequence="no">
     <pipe from="inner-step!result" />
   </output>
   ... other input/output/param declarations ...
   ... steps ...
</viewport>
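
Steps 4 to 6 can be sketched the same way. The sketch below takes the long 
form (the result of steps 1 to 3) as input. It relies on conventions that 
hold for the examples in this message but are assumptions in general: a 
child carrying a 'value' attribute is a parameter, a child named 'do' or 
'do-something' is a pipeline input, and anything else is a non-pipeline 
input. The function name compact is hypothetical.

```python
import xml.etree.ElementTree as ET


def compact(construct):
    """Apply rewrite steps 4-6 to a construct written in the long form."""
    result = ET.Element(construct.tag, dict(construct.attrib))
    bodies = []
    for child in construct:
        if child.tag == "do" or child.tag.startswith("do-"):
            # pipeline-valued input: kept as an element for now
            bodies.append(child)
        elif "value" in child.attrib:
            # 4. a parameter becomes an attribute on the parent element
            result.set(child.tag, child.get("value"))
        else:
            # 5. attributes of a non-pipeline input move to the parent,
            #    and the input element itself disappears
            for key, value in child.attrib.items():
                result.set(key, value)
    if len(bodies) == 1 and bodies[0].tag == "do":
        # 6. a single remaining <do> is removed; its content moves up
        result.extend(list(bodies[0]))
    else:
        result.extend(bodies)
    return result


long_viewport = ET.fromstring("""
<viewport>
  <document from="pipe!document" select="/doc/xsl:stylesheet" />
  <subtrees value="/xsl:stylesheet/xsl:template" />
  <do>
    <input name="template" primary="yes" sequence="no" />
  </do>
</viewport>
""")
short_viewport = compact(long_viewport)

long_fork = ET.fromstring("""
<fork>
  <context from="pipe!document" />
  <test value="//mytest" />
  <do-true />
  <do-false />
</fork>
""")
short_fork = compact(long_fork)

print(ET.tostring(short_viewport, encoding="unicode"))
print(ET.tostring(short_fork, encoding="unicode"))
```

The attribute-name clash mentioned below shows up directly in step 5: if a 
parameter were called 'select', it would silently overwrite the 'select' 
moved up from the input element.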

Note that I carefully chose a name for the 'subtrees' parameter on the 
viewport component that wouldn't clash with the attribute names we're 
using for selecting input documents.

Also note that this syntax specifically excludes the possibility of 
defining parameters dynamically or providing 'here' documents for the 
non-pipeline inputs of these components, neither of which I think we 
particularly want to support anyway.

The <choose> element that we want is a shorthand for nested <fork>s. If 
you have:

<fork from="A" test="X">
   <do-true>L</do-true>
   <do-false>
     <fork from="B" test="Y">
       <do-true>M</do-true>
       <do-false>N</do-false>
     </fork>
   </do-false>
</fork>

this gets transformed into:

<choose>
   <when from="A" test="X">L</when>
   <when from="B" test="Y">M</when>
   <otherwise>N</otherwise>
</choose>

and if A and B are the same, it can become:

<choose from="A">
   <when test="X">L</when>
   <when test="Y">M</when>
   <otherwise>N</otherwise>
</choose>
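
The fork-to-choose rewrite is a simple walk down the chain of <do-false> 
branches, which can be sketched as follows. The function name 
fork_to_choose is hypothetical, and the sketch assumes every <fork> carries 
both 'from' and 'test' attributes and has both branches, as in the 
examples above.

```python
import xml.etree.ElementTree as ET


def fork_to_choose(fork):
    """Flatten a chain of nested <fork>s into a single <choose>."""
    choose = ET.Element("choose")
    node = fork
    while node is not None:
        when = ET.SubElement(
            choose, "when", {"from": node.get("from"), "test": node.get("test")}
        )
        true_branch = node.find("do-true")
        when.text = true_branch.text
        when.extend(list(true_branch))

        false_branch = node.find("do-false")
        nested = list(false_branch)
        only_a_fork = (
            len(nested) == 1
            and nested[0].tag == "fork"
            and not (false_branch.text or "").strip()
        )
        if only_a_fork:
            # the false branch is just another fork: continue down the chain
            node = nested[0]
        else:
            otherwise = ET.SubElement(choose, "otherwise")
            otherwise.text = false_branch.text
            otherwise.extend(nested)
            node = None

    # If every <when> reads from the same source, hoist it onto <choose>.
    whens = choose.findall("when")
    sources = {when.get("from") for when in whens}
    if len(sources) == 1:
        choose.set("from", sources.pop())
        for when in whens:
            del when.attrib["from"]
    return choose


NESTED = """
<fork from="A" test="X">
  <do-true>L</do-true>
  <do-false>
    <fork from="{second}" test="Y">
      <do-true>M</do-true>
      <do-false>N</do-false>
    </fork>
  </do-false>
</fork>
"""
different = fork_to_choose(ET.fromstring(NESTED.format(second="B")))
same = fork_to_choose(ET.fromstring(NESTED.format(second="A")))
print(ET.tostring(different, encoding="unicode"))
print(ET.tostring(same, encoding="unicode"))
```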

I'm flexible on most of the naming of inputs/params/outputs here, FWIW.

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Wednesday, 16 August 2006 14:12:29 UTC