Re: Implicit output ports and p:declare-step from Norman Walsh on 2009-06-24 (xproc-dev@w3.org from June 2009)

From: Norman Walsh <ndw@nwalsh.com>
Date: Wed, 24 Jun 2009 12:52:02 -0400
To: XProc Dev <xproc-dev@w3.org>
Message-ID: <m2bpodlglp.fsf@nwalsh.com>

"Henry S. Thompson" <ht@inf.ed.ac.uk> writes:
>> I feel the same. I lean towards saying that this does not apply to
>> p:declare-step, exactly because it is "an interface to the outside
>> world". And you don't really want to have pipelines having unnamed
>> implicit output ports, IMHO. ...although at times, especially when you
>> want to be lazy and save some typing, they may be convenient.
>
> When I use declare-step with a name and an explicit subpipeline,
> e.g. in a library, I certainly expect it to behave like any other
> compound step, I think. . .

Do you?

Consider:

  <p:library ...>
    <p:declare-step type="px:fred">
      <p:input port="source"/>
      <p:identity/>
    </p:declare-step>
  </p:library>

By the current rules, px:fred has one input port named "source" and
one output port...with no name.

Now suppose you want to store the result of px:fred in a file and
then continue processing it. You try to write it like this:

  <px:fred name="myproc"/>
  <p:store href="fred-output.xml"/>

  <p:identity>
    <p:input port="source">
      <p:pipe step="myproc" port="???"/>
    </p:input>
  </p:identity>

But you can't because there's nothing to put in "port".

I can see several options:

1. We say that the implicitly created port is named "result".

2. We say that it's your bug, you fix it. Either by making
   the port name explicit or by adding an identity step.

    <px:fred name="myproc"/>
    <p:identity name="myproc-identity"/>
    <p:store href="fred-output.xml"/>

    <p:identity>
      <p:input port="source">
        <p:pipe step="myproc-identity" port="result"/>
      </p:input>
    </p:identity>

3. We say that the rules about defaulting output ports don't apply to
   p:declare-step.

Choice 2 is the only one that requires no changes to the spec.
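
For the first half of choice 2, a minimal sketch of the library step with
the port name made explicit. The name "result" is only an assumption here,
any name would do. The declared output has no binding, so by the rule
quoted further down it is bound to the primary output of the last step:

  <p:library ...>
    <p:declare-step type="px:fred">
      <p:input port="source"/>
      <!-- assumed name; a single declared output is primary by default -->
      <p:output port="result"/>
      <p:identity/>
    </p:declare-step>
  </p:library>

The caller can then write <p:pipe step="myproc" port="result"/> without
the extra identity step.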

But now consider how implementations interface with pipelines. Mostly,
I think, implementors (and APIs) are going to want to use port names.
XML Calabash, for example, has -i/port/ and -o/port/ options for
binding inputs and outputs. As long as port names are uniformly
available, we can have a clean port-name based interface.

Now suppose you run this pipeline:

 <p:declare-step>
   <p:input port="source"/>
   <p:identity/>
 </p:declare-step>

How do you tell the XProc implementation or API that you want to do
something with the result? Maybe we say you can't. Or maybe we force
implementations and APIs to grow some new whatsit for dealing with
"the anonymous default output".

Neither of those is very satisfying to me.

Either choice 1 or choice 3 from above would fix the problem.
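
As a sketch of what choice 3 would probably mean in practice: if the
defaulting rules did not apply to p:declare-step, the author would have to
declare the output to get one at all, so a name would always be available
to the implementation or API. The port name "result" below is again just
an assumption:

 <p:declare-step>
   <p:input port="source"/>
   <!-- explicit, named output; nothing anonymous for the API to deal with -->
   <p:output port="result"/>
   <p:identity/>
 </p:declare-step>

Under choice 1 the pipeline would stay as written above and the implicit
port would simply be called "result".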

I'm just not real happy with stamping arbitrary names on things.

> Hmm, the spec says the following:
>
>   If a primary output port is declared and that port has no binding,
>   then it is bound to the primary output port of the last step in the
>   subpipeline. It is a static error (err:XS0006) if the primary output
>   port has no binding and the last step in the subpipeline does not
>   have a primary output port.
>
> So this doesn't say anything at all about what happens if no POP is
> declared, although the static error discussion would appear to apply
> in that case as well?

You missed a bit in 2.3:

  Additionally, if a compound step has no declared outputs and the
  last step in its subpipeline has an unbound primary output, then an
  implicit primary output port will be added to the compound step (and
  consequently the last step's primary output will be bound to it).
  This implicit output port has no name. It inherits the sequence
  property of the port bound to it.

>> Personally, I think that p:declare-step should behave the same for both
>> atomic and compound steps. Saying that in the case of compound steps you
>> may get some magical output ports smells to me. I prefer steps with an
>> obvious signature.
>
> Hmm, it makes sense to me.  We really have two distinct
> functionalities:
>   1) define step (implementation explicit);
>   2) declare step signature (implementation unspecified)
> for which we use the same tag.  I think it makes sense that the
> defaulting rules go with the function, and so are different.
>
> I still don't see the downside wrt case (1) when we go even further
> and mean "and run it", i.e. when the declare-step is the entire
> content of a pipeline document.

Do my concerns above make sense?

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Many who find the day too long, think
http://nwalsh.com/            | life too short.--Charles Caleb Colton
Received on Wednesday, 24 June 2009 16:52:50 UTC