Chameleon Component Summary & Proposal from Alex Milowski on 2007-02-15 (public-xml-processing-model-wg@w3.org from February 2007)

From: Alex Milowski <alex@milowski.org>
Date: Thu, 15 Feb 2007 15:57:32 -0800
To: public-xml-processing-model-wg@w3.org
Message-ID: <28d56ece0702151557y8bd67a4jdf757a88b88a241b@mail.gmail.com>
This is my summary... so correct me if I'm wrong.

We started this discussion because we needed to distinguish between
component configuration parameters and parameters to the component
itself.  In the former case (configuration parameters), these parameters are
often used to select the right underlying implementation technology
(e.g. XML Schema vs Relax or XSLT 1.0 vs 2.0).  In the latter case, the
parameters are application parameters to particular component
technologies (e.g. parameters to a stylesheet).

The issue of configuration parameters brought up the label "chameleon"
because the underlying implementation might be choosing between
"actual" components (e.g. a Relax validation engine versus an XML Schema
processor).

In addition, these parameters are quite distinct in that configuration
parameters are allowed to cause the pipeline to fail to compile.  For
example, a configuration parameter of "version" with a value of "2.0"
to an XSLT component would require that the implementation support XSLT
2.0.  If it can't support XSLT 2.0, it can't run the transform and should
fail to compile the pipeline.

While application parameters might cause a pipeline to fail to run, such
failures are usually dynamic errors that result from running the component.
Application errors of this kind should be catchable and handled by our
try/catch construct.
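
To make the static/dynamic distinction concrete, here is a minimal Python
sketch.  Everything in it is hypothetical: the error classes, the
compile_xslt_step and try_catch helpers, the set of supported versions, and
the 'title' stylesheet parameter are assumptions for illustration, not
anything we have specified.

```python
class StaticError(Exception):
    """Raised while compiling the pipeline, before any step runs."""


class DynamicError(Exception):
    """Raised while a step is running."""


# Assumption: this hypothetical implementation only ships an XSLT 1.0 engine.
SUPPORTED_XSLT_VERSIONS = {"1.0"}


def compile_xslt_step(config):
    """Check configuration parameters up front and return a runnable step."""
    version = config.get("version", "1.0")
    if version not in SUPPORTED_XSLT_VERSIONS:
        raise StaticError(f"XSLT {version} is not supported")

    def run(parameters):
        # Stand-in for a stylesheet that requires a 'title' parameter.
        if "title" not in parameters:
            raise DynamicError("stylesheet parameter 'title' is missing")
        return f"<doc>{parameters['title']}</doc>"

    return run


def try_catch(step, parameters, recover):
    """Run a step; on a dynamic error, hand the error to the recovery branch."""
    try:
        return step(parameters)
    except DynamicError as error:
        return recover(error)
```

A configuration parameter of version="2.0" fails in compile_xslt_step, so
the pipeline never starts; a missing application parameter only fails
inside run, where try_catch can deal with it.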

In addition, we need a way to preserve our current open content models
for the pipeline language so that authors can add arbitrary documentation or
annotations without having them mistaken for component types.

On the call, several people voiced the idea that we want to be able to
specify namespaces as "ignorable" so that we can have annotations or
documentation elements and keep an open content model.  Basically, if an
element is not ignored, it had better be one of the ones we define in our
specification, or we must have a component definition associated with it.

Here's my proposal based on what I've heard so far:

   * An element in the pipeline document must be one of:
       * in our namespace, where our specification defines the semantics
       * identified as a step element via a component definition
       * identified as ignorable via the element's namespace.

   * Add an optional attribute 'ignore-prefixes' on the [p:]pipeline element
     that holds a list of namespace prefixes.  Each of those prefixes
     identifies elements that should be ignored while loading the pipeline
     document.

   * By default, elements in no namespace are ignored.

   * Ignored elements have no semantics and the pipeline processor should
     act the same as when given a document where those ignored elements
     are deleted.

  * Any component type can be written as a step using an element whose name
    is the component type (e.g. <p:xslt ...></p:xslt>).  This element has a
    required attribute 'name', and its content model is the same as that of
    the now "abstract" element [p:]step.  I'll call this element the
    "component step element".


  * A configuration parameter is specified on the component step
    element by a simple typed attribute.  The value of that configuration
    parameter is a string value.

    Note: In the future, we could pass the simple type value.

    Note: We can't have a configuration parameter named 'name' that has a
    different value than the name of the step.  That said, having the name
    accessible as a configuration parameter will be very nice for debugging.

  * All application parameters are specified via the [p:]parameter element
    as we specify in our current document (the status quo).

I believe that is the core of what we have talked about.
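
For what it's worth, the loading rules could be sketched roughly as follows
in Python.  This is only an illustration under assumptions: the namespace
URIs are placeholders, load_pipeline and its component_types argument are
made-up names, every element in our namespace is simply accepted, and
prefix declarations are treated as if they were all made on the root
element.

```python
import io
import xml.etree.ElementTree as ET

# Placeholder URI, standing in for whatever namespace the specification uses.
P_NS = "http://example.org/pipeline"


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag ('' if none)."""
    return tag[1:].split("}")[0] if tag.startswith("{") else ""


def load_pipeline(text, component_types):
    """Parse a pipeline, drop ignored elements, and reject unknown ones."""
    prefixes, root = {}, None
    source = io.BytesIO(text.encode("utf-8"))
    for event, item in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefixes[item[0]] = item[1]  # (prefix, URI) pair
        elif root is None:
            root = item
    ignored = {prefixes[p] for p in root.get("ignore-prefixes", "").split()}
    ignored.add("")  # elements in no namespace are ignored by default
    _check(root, ignored, component_types)
    return root


def _check(parent, ignored, component_types):
    for child in list(parent):
        ns = namespace_of(child.tag)
        if ns in ignored:
            parent.remove(child)  # behave as if it had been deleted
        elif ns == P_NS or child.tag in component_types:
            _check(child, ignored, component_types)
        else:
            raise ValueError(f"no component definition for {child.tag}")


PIPELINE = """<p:pipeline xmlns:p="http://example.org/pipeline"
    xmlns:doc="http://example.org/doc" xmlns:my="http://example.org/my"
    ignore-prefixes="doc">
  <doc:note>Fetches the resource first.</doc:note>
  <todo>unqualified, so ignored by default</todo>
  <my:munge name="get.resource"/>
</p:pipeline>"""

loaded = load_pipeline(PIPELINE, {"{http://example.org/my}munge"})
```

After loading, only the my:munge step remains under the pipeline element.
Without a component definition for my:munge, loading fails instead.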

Considering appropriate defaulting of ports, this gives us steps like:

<p:xslt version="2.0">
   <p:input port="transform">
       <p:document href="stylesheet.xsl"/>
   </p:input>
</p:xslt>

where we're forcing an XSLT 2.0 implementation.

<p:xslt version="2.0" mode="toc">
   <p:input port="transform">
       <p:document href="stylesheet.xsl"/>
   </p:input>
</p:xslt>

where we're asking for initial mode 'toc' and so on.

For validation, we get:

<p:validation language="http://www.w3.org/2001/XMLSchema/v1.1">
    <p:input port="schema"> ... </p:input>
</p:validation>

Things we might consider:

  * allow definitions of configuration parameters in the component
    definition.  This would be necessary to generate an appropriate schema
    type/element declaration for the component step element.

  * allow configuration parameters to be child elements of the component
    step elements.

That last point I find really interesting.  I think it would simplify
setting configuration parameters that have large text values as well as
allow for lists.

Here are my examples:

1. The "munge" component, which fetches a resource and makes it
    available after converting it to base64 for application data or running
    tidy on HTML.  Here I want to specify a set of MIME types that it should
    accept:

   <my:munge name="get.resource">
      <my:mime-type>application/pdf</my:mime-type>
      <my:mime-type>image/jpeg</my:mime-type>
      <my:mime-type>text/html</my:mime-type>
   </my:munge>

   Here the component receives a configuration parameter 'my:mime-type'
   whose value is a list of MIME types.

2. A simple Ruby script component that needs the script to run.

   Currently, I could write:

   <p:step type="j:ruby" name="reverse.ABC">
      <p:param name="j:script" value="puts &quot;&lt;doc>&quot; + &quot;ABC&quot;.reverse + &quot;&lt;/doc>&quot;"/>
   </p:step>

   This proposal makes this:

   <j:ruby name="reverse.ABC">
      <p:param name="j:script" value="puts &quot;&lt;doc>&quot; + &quot;ABC&quot;.reverse + &quot;&lt;/doc>&quot;"/>
   </j:ruby>

   but I'd rather write:

   <j:ruby name="reverse.ABC">
      <j:script>
      <![CDATA[
         puts "<doc>" + "ABC".reverse + "</doc>"
      ]]>
      </j:script>
   </j:ruby>


   In all cases the component receives a configuration parameter
   'j:script' containing the Ruby code to execute.

3. Specifying XQuery pragmas and options

   <p:xquery>
      <pragma>(# exist:batch-transaction #)</pragma>
      <pragma>(# exist:timer #)</pragma>
      <option>exist:serialize "method=xhtml"</option>
   </p:xquery>
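
One way an implementation might hand these to a component is sketched
below in Python.  It is a guess at the mechanics, not a proposal: the
configuration_parameters function, the placeholder namespace URIs, and the
choice to represent every value as a list of strings are all assumptions.

```python
import xml.etree.ElementTree as ET

# Placeholder URI, standing in for whatever namespace the specification uses.
P_NS = "http://example.org/pipeline"


def configuration_parameters(step):
    """Collect configuration parameters from a component step element.

    Attributes become single-item lists; repeated child elements with the
    same name are gathered, in document order, into one list.
    """
    params = {name: [value] for name, value in step.attrib.items()}
    for child in step:
        if child.tag.startswith("{" + P_NS + "}"):
            continue  # p:input, p:param, etc. keep their defined meaning
        params.setdefault(child.tag, []).append((child.text or "").strip())
    return params


munge = ET.fromstring(
    """<my:munge xmlns:my="http://example.org/my" name="get.resource">
  <my:mime-type>application/pdf</my:mime-type>
  <my:mime-type>image/jpeg</my:mime-type>
  <my:mime-type>text/html</my:mime-type>
</my:munge>"""
)
params = configuration_parameters(munge)
```

For the munge step, params maps the qualified name of my:mime-type to the
three MIME types in order, and 'name' to the step name, which matches the
note above about exposing the name for debugging.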

-- 
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
Received on Thursday, 15 February 2007 23:57:38 UTC