- From: Norman Walsh <ndw@nwalsh.com>
- Date: Mon, 09 Nov 2009 08:40:23 -0500
- To: public-xml-processing-model-comments@w3.org
- Message-ID: <m2d43rerzs.fsf@nwalsh.com>
"Vasil Rangelov" <boen.robot@gmail.com> writes:

> Hmmm... so... let me get this straight... when p:declare-step has a sub
> pipeline, it declares a pipeline, and it is said that a user really "calls a
> pipeline". When p:declare-step doesn't have a sub pipeline, it declares an
> "atomic extension step" and the user "calls an atomic extension step".

I think it's less complicated than that. In this context, whether
something is an extension step or not is irrelevant, so let's set that
to one side for a moment.

A p:declare-step declares a step. If that declaration includes a body,
then the body is the definition of the step. If the declaration is
empty, then the definition of the step is known to the processor
through some other means.

In either case, the declaration declares a type of step. If a user
places an element whose QName is the same as the name of that type of
step in a subpipeline, then the processing defined by that step is
performed.

So:

  <p:declare-step type="ex:my-first-step">
    <p:input port="source"/>
    <p:output port="result"/>
  </p:declare-step>

declares a step (that happens to be atomic) of the type
"ex:my-first-step".

  <p:declare-step type="ex:my-second-step">
    <p:input port="source"/>
    <p:output port="result"/>
    <p:identity/>
  </p:declare-step>

declares a step (that happens not to be atomic) of the type
"ex:my-second-step".

I can use these steps in my pipelines without concern for whether
they are atomic or not; their *use* is *always* "atomic" in the sense
that it never contains a body:

  <p:pipeline>
    <ex:my-first-step/>
    <ex:my-second-step/>
  </p:pipeline>

> Rules
> that apply to atomic (extension or built in) steps don't apply to pipelines,
> and vice-versa...

The only difference with respect to connections is that in the
declaration of a step that is not atomic, the author can provide
bindings for the inputs and outputs. In the case of inputs, these
connections are used if no explicit or implicit binding is provided.
In the case of outputs, these connections define what the output of
the step will be when it's called.

Consider:

  <p:declare-step type="ex:my-third-step">
    <p:input port="source" primary="true">
      <p:inline><doc1/></p:inline>
    </p:input>
    <p:input port="secondary" primary="false">
      <p:inline><doc2/></p:inline>
    </p:input>
    <p:output port="result" primary="true"/>
    <p:output port="result2" primary="false">
      <p:inline><doc3/></p:inline>
    </p:output>
    <p:identity/>
  </p:declare-step>

And here's where it's used:

  ...
  <p:identity/>
  <ex:my-third-step/>

When ex:my-third-step runs, its primary input port "source" will be
bound to the default readable port ("result" from the identity step),
so the default input "<doc1/>" will not be used. There isn't a binding
for the "secondary" input, so it will be bound to "<doc2/>".

The "result" output has no binding, so it will be bound to the default
readable port of the last step, another identity step in this case.
The "result2" output will be bound to "<doc3/>".

If you did this:

  <p:declare-step type="ex:my-third-step">
    <p:input port="source" primary="true">
      <p:inline><doc1/></p:inline>
    </p:input>
    <p:input port="secondary" primary="false">
      <p:inline><doc2/></p:inline>
    </p:input>
    <p:output port="result" primary="true">
      <p:inline><doc4/></p:inline>
    </p:output>
    <p:output port="result2" primary="false">
      <p:inline><doc3/></p:inline>
    </p:output>
    <p:identity/>
  </p:declare-step>

You'd get an error because the primary output of the last step is
unconnected. You could do this:

  <p:declare-step type="ex:my-third-step">
    <p:input port="source" primary="true">
      <p:inline><doc1/></p:inline>
    </p:input>
    <p:input port="secondary" primary="false">
      <p:inline><doc2/></p:inline>
    </p:input>
    <p:output port="result" primary="true">
      <p:inline><doc4/></p:inline>
    </p:output>
    <p:output port="result2" primary="false">
      <p:inline><doc3/></p:inline>
    </p:output>
    <p:identity/>
    <p:sink/>
  </p:declare-step>

But that's a pretty pointless pipeline.
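
To make the input rule concrete, here is a minimal sketch of a caller
that supplies an explicit binding for the non-primary "secondary" port,
so that the declared default "<doc2/>" is not used. It reuses the first
declaration of ex:my-third-step (the one whose "result" output has no
binding). The "ex" namespace URI and the "<doc5/>" document are
assumptions made up for illustration, not anything from the thread.

```xml
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:ex="http://example.com/ns/ex"
            version="1.0">
  <!-- ex namespace URI above is hypothetical -->

  <p:declare-step type="ex:my-third-step">
    <p:input port="source" primary="true">
      <p:inline><doc1/></p:inline>
    </p:input>
    <p:input port="secondary" primary="false">
      <p:inline><doc2/></p:inline>
    </p:input>
    <p:output port="result" primary="true"/>
    <p:output port="result2" primary="false">
      <p:inline><doc3/></p:inline>
    </p:output>
    <p:identity/>
  </p:declare-step>

  <p:identity/>

  <ex:my-third-step>
    <!-- Explicit binding: takes precedence over the declared
         default <doc2/>. <doc5/> is a hypothetical document. -->
    <p:input port="secondary">
      <p:inline><doc5/></p:inline>
    </p:input>
  </ex:my-third-step>
</p:pipeline>
```

Here "source" is still bound implicitly to the "result" of the preceding
p:identity, so neither declared input default is used in this call.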
If you're declaring a step that is atomic, you can't put any sort of
default bindings in the inputs or outputs because you don't have any
visibility into what the step does.

> but aren't there some rules that apply to both atomic
> steps and pipelines? If so, is there a common term to refer to them both (I
> don't remember ever seeing a phrase like "atomic steps or pipelines")? Where
> do "atomic extension steps" fit into this? Do they get the rules for
> pipelines or for "atomic steps" (standard library wise)? The spec currently
> appears to first define them separately (as if they are something truly
> "special"), and then goes on by using only the term "atomic steps".

All atomic steps are the same. The only things that are special about
atomic steps in the XProc namespace are that you can't declare any and
that, if you specify version > 1.0, some otherwise static errors are
ignored. In all other respects, they're just ordinary steps.

>> For atomic steps, there is no default, the step produces what it produces.
>> It's a black box.
>
> Right... for atomic steps in the standard library and p:declare-step of an
> (extension) atomic step. And for p:declare-step of a pipeline? What happens
> if there is p:output with no connection in that case? Is THAT an error, or
> do you always get some kind of a default connection (like p:empty)?

An unconnected, non-primary output port in the declaration of a
non-atomic step produces an empty sequence. I don't think of that in
terms of having a default connection to p:empty so much as simply not
having anything to read *from*.

Remember, somewhat counter-intuitively, that *inside* the declaration
of a non-atomic step, the connections inside p:outputs are *reading*
data, not writing it. Whatever they *read* gets *written to* the output
that's seen by the pipeline that invokes the declared step.

  <p:pipeline>
    ...
    <p:declare-step type="ex:foo">
      <p:output port="result">
        <p:pipe step="whatever" port="result"/>
      </p:output>
      <ex:something name="whatever"/>
    </p:declare-step>

    <ex:foo/>
    <p:identity/>
  </p:pipeline>

When "ex:foo" is invoked, the processor begins running the declared
subpipeline. The ex:something step is run. The "result" output of
ex:something is *read by* the connection declared in the ex:foo
declaration and *written to* the "result" output port that's seen by
the p:identity step that follows the call to ex:foo.

Output ports in non-atomic steps have this weird dual role: they read
stuff (usually, but not necessarily, from other steps in the
subpipeline of their container) and write stuff to the outputs of that
container.

>> For compound steps, there is no default, but if you leave them unconnected
>> then they will produce an empty sequence.
>
> If you leave them unconnected (i.e. don't specify anything as a connection),
> isn't what they get called a "default" connection (p:empty in this case)? If
> so, aren't you contradicting yourself? Anyway, I see... p:empty it is.

The result is the same as if they were bound to p:empty, so I'm not
sure it much matters how we think about it. I think in terms of the
dual role described above. If there's no connection, then there's
nothing for the output to read from. If there's nothing for it to read
from, then it has nothing to write to the output port.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | So, are you working on finding that bug
http://nwalsh.com/            | now, or are you leaving it until later?
                              | Yes.
Received on Monday, 9 November 2009 13:41:13 UTC