parameters and pipelines (revised)

We've got several gaps in this area at the moment.

1) The current draft doesn't say anything about how you get access to
   parameters passed to pipelines 'from outside';

2) The current draft doesn't say anything about how you get access to
   parameters passed to pipelines when invoking them by name;

3) There is an ednote in the current draft [1] raising the question of
   whether steps which declare no parameter inputs get the pipeline
   parameters by default.

Here are some suggestions about how we deal with this, based in part
on IRC discussion with Norm (but don't assume he agrees with any of
this):

1) Sets of parameter name/value pairs may be externally specified in
   two ways, which I'll call 'named' and 'anonymous':

   named: a named collection of name/value pairs
   anonymous: an un-named collection of name/value pairs

   For example,

 > xproc -params xslt1 x=1 y=1 -params xslt2 x=2 y=2 -params debug=no pipe.xpl

   might be one way to produce two named sets (xslt1 and xslt2) and one
   anonymous set (debug=no).

   p:pipeline allows parameter inputs (a parameter input is a static
   error on any other container).

   It's a dynamic error if a named parameter set is specified
   externally for a pipeline which does not have a parameter input of
   the same name.  An anonymous parameter set is always allowed.

   Steps can access named parameter sets in the obvious way, e.g.

     <p:input port='[some param port]'>
      <p:pipe step='pipe' port='xslt1'/>
     </p:input>
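
   Putting these pieces together, a pipeline that accepts both named
   sets from the command line above might look like the sketch below.
   It assumes the pipeline is named 'pipe' (as the p:pipe binding
   above implies) and that the parameter port of p:xslt is called
   'parameters'; namespace declarations and the steps' other inputs
   are omitted.

```xml
<p:pipeline name='pipe'>
 <!-- one declared parameter input per externally named set -->
 <p:input port='xslt1' kind='parameter'/>
 <p:input port='xslt2' kind='parameter'/>

 <p:xslt>
  <p:input port='parameters'>
   <!-- this step sees x=1, y=1 -->
   <p:pipe step='pipe' port='xslt1'/>
  </p:input>
  <!-- other inputs elided -->
 </p:xslt>

 <p:xslt>
  <p:input port='parameters'>
   <!-- this step sees x=2, y=2 -->
   <p:pipe step='pipe' port='xslt2'/>
  </p:input>
  <!-- other inputs elided -->
 </p:xslt>
</p:pipeline>
```

   Neither step here is bound to the anonymous set (debug=no).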

   Pipelines with _no_ parameter inputs declared get an anonymous one
   for free, and implementations MUST make the anonymous set available
   via it in this case.

   It is implementation-defined what happens to the anonymous set in
   the presence of explicit parameter input declarations (e.g. it goes
   to the first one, it goes to the last one, or it's an error).

   Steps can access the anonymous parameter set using defaulting:

     <p:input port='[some param port]'/>

   [As for ordinary inputs, it's a static error to attempt to bind to
    an undeclared parameter port.]

   It's a static error to include p:parameter as the child of a step
   unless that step has exactly one declared parameter port.
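
   As an illustration of the allowed case, the sketch below assumes
   that p:xslt declares exactly one parameter port (the 'parameters'
   port used in the examples in this message), so that the p:parameter
   child presumably feeds that port.  The parameter name and value are
   only illustrative.

```xml
<p:xslt>
 <!-- allowed: p:xslt is assumed to declare a single parameter port -->
 <p:parameter name='debug' value='no'/>
 <!-- other inputs elided -->
</p:xslt>
```

   A step type declaring two or more parameter ports would instead
   have to bind each port explicitly with p:input.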

2) A pipeline invoked by name gets its parameters in an analogous way:

   2a) Iff the pipeline was defined with one or more parameter input
       ports, the invocation may include a binding for those ports:

       <p:pipeline type='my:pipe'>
        . . .
        <p:input port='xslt1' kind='parameter'/>
        . . .
       </p:pipeline>

       <my:pipe>
        . . .
        <p:input port='xslt1'>
         <p:inline>
          <c:parameter name='x' value='3'/>
         </p:inline>
        </p:input>
        . . .
       </my:pipe>

   2b) Explicit use of p:parameter in a named pipeline invocation
        i) is a static error if the pipeline declares more than one
           parameter input;
        ii) feeds parameters into a declared parameter input if the 
            pipeline declares exactly one;
        iii) feeds parameters into an anonymous parameter input otherwise.

       <p:pipeline type='my:pipe'>
        <p:xquery>
         <p:input port='parameters'/>
         . . .
        </p:xquery>
        . . .
       </p:pipeline>

       <my:pipe>
        <p:parameter name='forXQuery' value='17'/>
        . . .
       </my:pipe>

   2c) A pipeline which declares no parameter input ports 'inherits'
       the anonymous parameter set when invoked by name.  Parameters
       inherited in this way shadow those coming via 2b.iii above,
       i.e. the pipeline treats the explicit ones as overridable
       defaults.
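
       To make the direction of the shadowing concrete, here is a
       sketch.  It assumes that my:pipe declares no parameter inputs
       and that the invoking pipeline was started with an anonymous
       set containing debug=no (using the hypothetical command-line
       syntax from (1)); the parameter names are illustrative.

```xml
<!-- invoking pipeline started with:  -params debug=no -->
<my:pipe>
 <!-- explicit values, treated as overridable defaults -->
 <p:parameter name='debug' value='yes'/>
 <p:parameter name='forXQuery' value='17'/>
</my:pipe>
<!-- anonymous parameter input as seen inside my:pipe:
       debug=no       (the inherited value shadows the explicit 'yes')
       forXQuery=17   (explicit value, nothing shadows it) -->
```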

   3) If you want to combine the anonymous set with explicit
      parameters, use the p:parameters step:

      <p:parameters name="mixin">
       <p:input port='parameters'/>
       <p:parameter name="foo" value="baz"/>
      </p:parameters>
      . . .
      <p:xslt>
       <p:input port="parameters">
         <p:pipe step="mixin" port="result"/>
       </p:input>
      </p:xslt>

      Combining the anonymous set with named sets is a sin, but you
      can do it using _two_ p:parameters steps.

Open questions:

 A) Should <p:input kind='parameter' .../> as a child of p:pipeline be
    purely a declaration, i.e. be necessarily empty, or should we
    allow it to have content, in which case how do we treat that
    content -- merge it with external input, ignore it if there's any
    external input, . . .?
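
    To make the question concrete, a declaration with content might
    look like the sketch below, which reuses the c:parameter notation
    from (2a); the value x=0 is purely illustrative.  The open issue
    is what x should be when an external 'xslt1' set is also supplied.

```xml
<p:pipeline type='my:pipe'>
 <p:input port='xslt1' kind='parameter'>
  <!-- content on the declaration itself: a default? merged? ignored? -->
  <p:inline>
   <c:parameter name='x' value='0'/>
  </p:inline>
 </p:input>
 <!-- steps elided -->
</p:pipeline>
```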

 B) There's a covert assumption in the current spec, unchanged by any
    of the above, that the API from the runtime to step
    implementations will have a way of accessing parameters.  Since
    parameter ports are declared, this access could take a port name
    as an argument, or it could be undifferentiated as to port name,
    i.e. just "give me all the parameter bindings you have for this
    instance of this step".  I don't suppose we _have_ to say anything
    about this, but we could choose to say e.g. that implementations
    _should_ provide access by port name, or at least indicate which
    port particular parameter settings arrived via.

    As long as we allow more than one parameter port per step (and I
    think we should), I have some inclination to encourage the
    provision of access to them by port name.

 C) Is the shadowing specified in (2c) above the right way around?  I
    think it is, noting that if you _really_ want to override the
    values coming 'from outside', you can do so on any step which
    accesses the anonymous set.

Finally, question (3) above becomes, in the terms of the proposal in
(1) and (2) above, "Under what circumstances should the runtime
deliver the anonymous parameter set when a step implementation asks
for its parameters?"

    Possible answers:
      a) Always;
      b) Only if a parameter port has been explicitly bound to it (as
         in the final example under (1) above) (and that port is asked
         for);
      c) If a parameter port has been explicitly bound to it (and is
         asked for), or if some parameter ports have not been bound at
         all (and all parameters are asked for? and any unbound
         parameter port is asked for? and there is only one parameter
         port declared and unbound?).

    I believe Alessandro favours (b), and I favour (c).  The
    difference is manifested in the case of a simple p:xslt step -- if
    the pipeline is minimal, i.e.

     <p:pipeline>
      <p:xslt>
       <p:input port="transform">
        <p:document href="..."/>
       </p:input>
      </p:xslt>
     </p:pipeline>

     and I invoke it with some parameters (either 'from outside', or
     by invoking it by name), does the XSLT step see the parameters?
     I think users will expect it to, and I think they're right.

     On proposal (c) above, if you want to _protect_ a step from
     parameters, you would write e.g.

     <p:pipeline>
      <p:xslt>
       <p:input port='parameters'>
        <p:empty/>
       </p:input>
      </p:xslt>
     </p:pipeline>

     thereby forestalling the delivery of the anonymous set.

ht

[1] http://www.w3.org/XML/XProc/docs/WD-xproc-20070706/#default-params
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]




Received on Friday, 6 July 2007 18:08:23 UTC