- From: Vasil Rangelov <boen.robot@gmail.com>
- Date: Sun, 29 Nov 2009 22:52:19 +0200
- To: <public-xml-processing-model-comments@w3.org>
> The only difference with respect to connections is that in the
> declaration of a step that is not atomic, the author can provide
> bindings for the inputs and outputs.

What is the term for "a step that is not atomic" in this context? It certainly isn't "a contained step"... XProc doesn't provide a facility for that. "a pipeline"?

This is the thing that creates most confusion I think. The very fact that "a step that is atomic" and "a step that is not atomic" have certain common rules, and also have rules that apply to one, but not the other. It isn't clear (in the spec) which applies to which, as very often, the spec seems to use the term "atomic step" to refer to all kinds of atomic steps.

I guess the very definition of an atomic step is misleading:

    [Definition: An atomic step is a step that performs a unit of XML processing, such as XInclude or transformation, and has no internal subpipeline. ]

By this logic, regardless of whether a step is specification, implementation or user defined, it's atomic if it doesn't have a subpipeline when it's called. Or perhaps it's a difference between "declaration of atomic step" and "calling of an atomic step"... you may be declaring a pipeline, but you're calling an atomic step... this difference shouldn't exist. One should be calling what one declares.

I would like to point out another example where this creates a problem, but for the most part, there isn't a problem if standard, extension and user defined atomic steps are treated in the same fashion, and therefore, if a single term is used to refer to all of them. err:XS0029 is the only example I have so far for a case where this creates confusion and a potential for misinterpretation... I mean, I now know what was the intention, but that doesn't make the formulation OK.
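To make the err:XS0029 case concrete, here is a minimal sketch of the two kinds of declaration side by side. It assumes the XProc 1.0 reading that err:XS0029 is raised when a connection is specified for a p:output in the declaration of an atomic step. The step types "ex:atomic-with-connection" and "ex:pipeline-with-connection" and the "ex" namespace URI are made up for illustration only.

```xml
<!-- Assumed: no subpipeline, so this declares an atomic step.
     The connection inside p:output is what err:XS0029 is about. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/ns"
                type="ex:atomic-with-connection">
  <p:output port="result">
    <p:inline><doc/></p:inline>
  </p:output>
</p:declare-step>

<!-- Has a subpipeline, so this declares a step that is not atomic.
     Here a connection inside p:output is allowed. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/ns"
                type="ex:pipeline-with-connection">
  <p:output port="result">
    <p:pipe step="copy" port="result"/>
  </p:output>
  <p:identity name="copy">
    <p:input port="source">
      <p:inline><doc/></p:inline>
    </p:input>
  </p:identity>
</p:declare-step>
```

Both steps would be invoked the same way, as an empty element in a subpipeline. The difference only shows up in what the declaration is allowed to contain.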
Regards,
Vasil Rangelov

-----Original Message-----
From: public-xml-processing-model-comments-request@w3.org [mailto:public-xml-processing-model-comments-request@w3.org] On Behalf Of Norman Walsh
Sent: Monday, November 09, 2009 3:40 PM
To: public-xml-processing-model-comments@w3.org
Subject: Re: p:output and connections

"Vasil Rangelov" <boen.robot@gmail.com> writes:
> Hmmm... so... let me get this straight... when p:declare-step has a
> sub pipeline, it declares a pipeline, and it is said that a user
> really "calls a pipeline". When p:declare-step doesn't have a sub
> pipeline, it declares an "atomic extension step" and the user "calls
> an atomic extension step".

I think it's less complicated than that. In this context, whether something is an extension step or not is irrelevant, so let's set that to one side for a moment.

A p:declare-step declares a step. If that declaration includes a body, then the body is the definition of the step. If the declaration is empty, then the definition of the step is known to the processor through some other means.

In either case, the declaration declares a type of step. If a user places an element whose QName is the same as the name of that type of step in a subpipeline, then the processing defined by that step is performed.

So:

    <p:declare-step type="ex:my-first-step">
      <p:input port="source"/>
      <p:output port="result"/>
    </p:declare-step>

declares a step (that happens to be atomic) of the type "ex:my-first-step".

    <p:declare-step type="ex:my-second-step">
      <p:input port="source"/>
      <p:output port="result"/>
      <p:identity/>
    </p:declare-step>

declares a step (that happens not to be atomic) of type "ex:my-second-step".
I can use these steps in my pipelines without concern for whether they are atomic or not, their *use* is *always* "atomic" in the sense that it never contains a body:

    <p:pipeline>
      <ex:my-first-step/>
      <ex:my-second-step/>
    </p:pipeline>

> Rules
> that apply to atomic (extension or built in) steps don't apply to
> pipelines, and vice-versa...

The only difference with respect to connections is that in the declaration of a step that is not atomic, the author can provide bindings for the inputs and outputs.

In the case of inputs, these connections are used if no explicit or implicit binding is provided. In the case of outputs, these connections define what the output of the step will be when it's called.

Consider:

    <p:declare-step type="ex:my-third-step">
      <p:input port="source" primary="true">
        <p:inline><doc1/></p:inline>
      </p:input>
      <p:input port="secondary" primary="false">
        <p:inline><doc2/></p:inline>
      </p:input>
      <p:output port="result" primary="true"/>
      <p:output port="result2" primary="false">
        <p:inline><doc3/></p:inline>
      </p:output>
      <p:identity/>
    </p:declare-step>

And here's where it's used:

    ...
    <p:identity/>
    <ex:my-third-step/>

When ex:my-third-step runs, its primary input port "source" will be bound to the default readable port ("result" from the identity step), so the default input "<doc1/>" will not be used. There isn't a binding for the "secondary" input, so it will be bound to "<doc2/>". The "result" output has no binding, so it will be bound to the default readable port of the last step, another identity step in this case. The "result2" output will be bound to "<doc3/>".
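As a small illustration of the "explicit binding" case, the caller could also override the default on the non-primary input. This is only a sketch: the "<other/>" document is a made-up placeholder, and the fragment assumes it sits in the same subpipeline as the usage shown above.

```xml
<!-- Fragment of a subpipeline; assumes the p: and ex: prefixes are
     declared and that ex:my-third-step is declared as above. -->
<p:identity/>
<ex:my-third-step>
  <!-- Explicit binding: "secondary" reads <other/> (hypothetical),
       so the declared default <doc2/> is not used. -->
  <p:input port="secondary">
    <p:inline><other/></p:inline>
  </p:input>
</ex:my-third-step>
```

Here "source" is still bound implicitly to the default readable port, exactly as before. Only the "secondary" input changes, because an explicit binding takes precedence over the default given in the declaration.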
If you did this:

    <p:declare-step type="ex:my-third-step">
      <p:input port="source" primary="true">
        <p:inline><doc1/></p:inline>
      </p:input>
      <p:input port="secondary" primary="false">
        <p:inline><doc2/></p:inline>
      </p:input>
      <p:output port="result" primary="true">
        <p:inline><doc4/></p:inline>
      </p:output>
      <p:output port="result2" primary="false">
        <p:inline><doc3/></p:inline>
      </p:output>
      <p:identity/>
    </p:declare-step>

You'd get an error because the primary output of the last step is unconnected. You could do this:

    <p:declare-step type="ex:my-third-step">
      <p:input port="source" primary="true">
        <p:inline><doc1/></p:inline>
      </p:input>
      <p:input port="secondary" primary="false">
        <p:inline><doc2/></p:inline>
      </p:input>
      <p:output port="result" primary="true">
        <p:inline><doc4/></p:inline>
      </p:output>
      <p:output port="result2" primary="false">
        <p:inline><doc3/></p:inline>
      </p:output>
      <p:identity/>
      <p:sink/>
    </p:declare-step>

But that's a pretty pointless pipeline.

If you're declaring a step that is atomic, you can't put any sort of default bindings in the inputs or outputs because you don't have any visibility into what the step does.

> but aren't there some rules that apply to both atomic steps and
> pipelines? If so, is there a common term to refer to them both (I
> don't remember ever seeing a phrase like "atomic steps or pipelines")?
> Where do "atomic extension steps" fit into this? Do they get the rules
> for pipelines or for "atomic steps" (standard library wise)? The spec
> currently appears to first define them separately (as if they are
> something truly "special"), and then goes on by using only the term
> "atomic steps".

All atomic steps are the same. The only things that are special about atomic steps in the XProc namespace are that you can't declare any and, if you specify version > 1.0, some otherwise static errors are ignored. In all other respects, they're just ordinary steps.

>> For atomic steps, there is no default, the step produces what it produces.
>> It's a black box.
>
> Right... for atomic steps in the standard library and p:declare-step
> of an (extension) atomic step. And for p:declare-step of a pipeline?
> What happens if there is p:output with no connection in that case? Is
> THAT an error, or do you always get some kind of a default connection
> (like p:empty)?

An unconnected, non-primary output port in the declaration of a non-atomic step produces an empty sequence. I don't think of that in terms of having a default connection to p:empty so much as simply not having anything to read *from*.

Remember, somewhat counter-intuitively, that *inside* the declaration of a non-atomic step, the connections inside p:outputs are *reading* data, not writing it. Whatever they *read* gets *written to* the output that's seen by the pipeline that invokes the declared step.

    <p:pipeline>
      ...
      <p:declare-step type="ex:foo">
        <p:output port="result">
          <p:pipe step="whatever" port="result"/>
        </p:output>
        <ex:something name="whatever"/>
      </p:declare-step>
      <ex:foo/>
      <p:identity/>
    </p:pipeline>

When "ex:foo" is invoked, the processor begins running the declared subpipeline. The ex:something step is run. The "result" output of ex:something is *read by* the connection declared in the ex:foo declaration and *written to* the "result" output port that's seen by the p:identity step that follows the call to ex:foo.

Output ports in non-atomic steps have this weird dual role that they read stuff (usually but not necessarily) from other steps in the subpipeline of their container and write stuff to the outputs of that container.

>> For compound steps, there is no default, but if you leave them
>> unconnected then they will produce an empty sequence.
>
> If you leave them unconnected (i.e. don't specify anything as a
> connection), isn't what they get called a "default" connection
> (p:empty in this case)? If so, aren't you contradicting yourself?
> Anyway, I see... p:empty it is.
The result is the same as if they were bound to p:empty, so I'm not sure it much matters how we think about it. I think in terms of the dual role described above. If there's no connection, then there's nothing for the output to read from. If there's nothing for it to read from, then it has nothing to write to the output port.

Be seeing you,
  norm

--
Norman Walsh <ndw@nwalsh.com> | So, are you working on finding that bug
http://nwalsh.com/            | now, or are you leaving it until later?
                              | Yes.
Received on Sunday, 29 November 2009 20:53:55 UTC