- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Wed, 24 May 2006 21:38:54 +0100
- To: public-xml-processing-model-wg@w3.org
Hi Norm,

Norm Walsh wrote:
> / Jeni Tennison <jeni@jenitennison.com> was heard to say:
> | Norm Walsh wrote:
> |> / Jeni Tennison <jeni@jenitennison.com> was heard to say:
> |> | A more flexible alternative would be to say that labelled documents are
> |> | referencable as variables within the XPath expressions used to set parameters
> |> | or variables.
> |>
> |> Yes, but it puts variables/parameters and input/output labels all into
> |> the same "symbol space" which worries me a bit.
> |
> | It doesn't worry me. I think we want parameters and I/O labels to be in
> | the same symbol space anyway so that we can support a directed syntax
> | should we want to in the future.
>
> I don't see the connection there...

If parameter/variable names are in a different symbol space from
inputs/outputs then we are allowing a situation where a parameter can
have the same name as an input. For example:

  <p:step name="my:process">
    <p:input name="document" href="doc.xml" />
    <p:param name="document" value="yes" />
    <p:output name="result" label="out" />
  </p:step>

If you translated this to a directed syntax, you'd run into problems
because an attribute or element called 'document' could refer to either
the input or the parameter. You obviously can't have:

  <p:process document="doc.xml" document="yes" result="out" />

so you'd have to have something like:

  <p:process input.document="doc.xml"
             param.document="yes"
             output.result="out" />

I think that, in order to keep our options open for using a directed
syntax in the future, and to enable users to easily create their own
directed syntax that they can easily translate into our generic syntax,
the names of parameters and inputs/outputs should share a symbol space.

> |> | This is more flexible because it means that you can refer to more than one
> |> | document within the XPath expression.
> |>
> |> Indeed. Is that valuable enough to justify the added complexity?
> |
> | I think it's simpler.
> | The explanation goes:
> |
> | The select attribute of p:param, p:variable, p:step/p:input and
> | p:pipeline/p:output holds an XPath expression that provides the value
> | of the parameter, variable, input or output. The value of a parameter
> | must be a string; it is set to the string value of the result of
> | evaluating the XPath expression. The value of an input or output must
> | be a node set containing only root (document) nodes [1]; it is an
> | error if the XPath evaluates to anything else.
>
> I'm having a hard time getting my head around using select on
> p:pipeline/p:output, but I think I get it.

It's funny, I found it hard to get my head around p:pipeline/p:output
being referenced by the final p:step/p:output. As Richard pointed out
some time ago, the links between the ports could be represented in
either direction. I think it makes more sense for *all* the step outputs
to be referenced (thus having the pipeline output doing the referencing)
rather than having some step outputs being referenced and some doing the
referencing. I think it makes it easier to add steps at the end of the
pipeline, and to create pipelines where an output is both a final output
and an intermediate output.

> | When evaluating an XPath expression, the context node and the context
> | position are undefined: it is an error if the expression references
> | them [2]. The variable bindings for the expression are determined by
> | variable binding elements that precede the expression. These are:
> |
> | - p:pipeline/p:input binds the variable with the name specified in the
> |   name attribute to a node set containing the root (document) nodes
> |   passed as that input.
> |
> | - p:pipeline/p:param binds the variable with the name specified in the
> |   name attribute to the (string) value passed as the value of the
> |   parameter, or to the string value of the result of evaluating the
> |   XPath in the select attribute if no value is passed for the
> |   parameter.
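
To make those binding rules concrete, here is a rough sketch of how I imagine a small pipeline might read. It's only an illustration, not agreed syntax: the input name 'source', the parameter names 'title' and 'heading' and the XPath are all made up, and I'm assuming that a step output's label ('out') is bound as a variable in the same way as a pipeline input.

```xml
<!-- Illustrative fragment only; names and XPath are hypothetical. -->
<p:pipeline>
  <!-- Binds $source to a node set of the root (document) nodes
       passed as this input. -->
  <p:input name="source" />

  <!-- Binds $title to the value passed in, or to the string value
       of the select result if no value is passed. -->
  <p:param name="title" select="$source/doc/title" />

  <p:step name="my:process">
    <!-- An input must evaluate to root (document) nodes only. -->
    <p:input name="document" select="$source" />
    <!-- A parameter is set to the string value of the expression. -->
    <p:param name="heading" select="$title" />
    <p:output name="result" label="out" />
  </p:step>

  <!-- The pipeline output does the referencing; assumes the label
       'out' is bound as $out. -->
  <p:output name="result" select="$out" />
</p:pipeline>
```

Because the pipeline output refers to $out rather than the step output naming the pipeline output, adding another step after my:process would only mean changing the final select.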
>
> Does anyone have any misgivings about requiring that parameters be strings?
> Specifically, that they may not be documents?

To be honest, the distinction between inputs and parameters has always
seemed a bit weird to me: they're both pieces of information passed to
the component. At the moment, the only distinction between them seems to
be the type of value they can take (inputs are sequences of documents,
parameters are strings), and I can live with that, or with parameters
being able to take other atomic values as well.

If parameters could be documents then I'd be left wondering what the
difference was between an input document and a parameter document? Are
there additional restrictions, such as parameter documents being static
(not generated by the pipeline)? Or perhaps parameters can be left unset
(and have a default) whereas inputs can't?

> | [2] I think we'll want to set the context node and context position
> | differently within a <p:for-each>.
>
> Maybe, but that's not obvious to me. I had in mind that for-each would
> bind an input to the first document in its input sequence and run the
> steps it contains with that input. Then it would bind the input to the
> second document in its input sequence and run the steps again. I didn't
> expect to have XPath expressions referring directly to the current
> input document inside for-each any more than elsewhere in the
> pipeline.
>
> I had in mind something like this:
>
> <p:step name="xslt">
>   ...
>   <p:output name="result" label="styled-docs"/>
> </p:step>
>
> <p:for-each ref="styled-docs">
>   <p:input name="document" label="doc"/>
>   <p:output name="result" label="result"/>
>
>   <p:step name="tidy">
>     <p:input name="document" ref="doc"/>
>     <p:output name="result" label="result"/>
>   </p:step>
> </p:for-each>

I'd be happy with that too, although I do think that having to declare
the inputs/outputs of <p:for-each> (and the other flow-control elements)
is a bit tedious.
Presumably with the above syntax it's always the first input
that gets bound to the individual documents in the selected document
sequence? And presumably the outputs get set to the concatenation of the
outputs they reference?

Cheers,

Jeni
--
Jeni Tennison
http://www.jenitennison.com
Received on Wednesday, 24 May 2006 20:39:12 UTC