Re: Pipeline proposal

/ Jeni Tennison <jeni@jenitennison.com> was heard to say:
| There seems to be a growing consensus within the group, reflected
| here, that inputs and outputs accept/produce *documents* and can't

Yes, I think we have consensus on that point.

| the back of our minds that to fully support XSLT and XQuery we really
| need to pass document fragments and external parsed entities between
| components.

I'm not sure we really need to be able to do that. It seems to me that
the pipeline author can wrap an additional dummy element around the
fragments as they flow through the pipeline and strip it off at the
end. I'm happy to keep the idea of doing something more efficient in
mind, but I don't think the requirement to pass documents around
imposes any insurmountable obstacles.
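
A minimal sketch of that wrap-and-strip idea, assuming Python's
xml.etree.ElementTree and a made-up wrapper element name ("wrapper");
none of these names come from the proposal:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

WRAPPER = "wrapper"  # hypothetical dummy element name


def wrap(fragment: str) -> str:
    """Turn a fragment (several top-level nodes) into a single document."""
    return f"<{WRAPPER}>{fragment}</{WRAPPER}>"


def unwrap(document: str) -> str:
    """Strip the dummy element and serialize its content again."""
    root = ET.fromstring(document)
    if root.tag != WRAPPER:
        raise ValueError("document is not wrapped")
    leading_text = escape(root.text or "")
    # ET.tostring() on a child also emits the text that follows it (its tail).
    children = "".join(ET.tostring(child, encoding="unicode") for child in root)
    return leading_text + children


fragment = "<a>1</a><b>2</b>"   # not a well-formed document on its own
document = wrap(fragment)       # well-formed, can flow between components
roundtrip = unwrap(document)    # strip the wrapper at the end of the pipeline
```

The intermediate components only ever see well-formed documents; the
cost is one extra element at each end of the pipeline.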

| You talk about declaring the cardinality of the inputs/outputs a
| component accepts/produces. Another thought, perhaps for a future
| version, would be to provide more detail about what's
| expected/generated by a particular component: perhaps giving the name
| of the document element or even an entire schema that the documents
| adhere to. I'd like to see us allow for that kind of extension if we
| don't support it in this version.

Knowing statically that a component expects or produces a certain kind
of document might be useful, but I think I'd like to see that done as
an extension in V1. Knowing (or asserting) dynamically what's expected
or produced is just a shorthand for inserting validator or other
components before or after the component, isn't it? I'm not opposed to
authoring convenience, but if it gets ugly to specify, I think we can
live without it.
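
To illustrate what I mean by shorthand, here is a rough sketch in
Python. It assumes components are plain functions from one document to
one document; expect_root, rename_root and pipeline are hypothetical
names, and checking the document element name stands in for real
schema validation:

```python
import xml.etree.ElementTree as ET
from typing import Callable

Component = Callable[[str], str]  # one document in, one document out


def expect_root(name: str) -> Component:
    """Validator component: pass the document through unchanged, or fail."""
    def validate(doc: str) -> str:
        root = ET.fromstring(doc)
        if root.tag != name:
            raise ValueError(f"expected <{name}>, got <{root.tag}>")
        return doc
    return validate


def rename_root(new_name: str) -> Component:
    """Stand-in for a real component such as an XSLT step."""
    def transform(doc: str) -> str:
        root = ET.fromstring(doc)
        root.tag = new_name
        return ET.tostring(root, encoding="unicode")
    return transform


def pipeline(*steps: Component) -> Component:
    """Run the steps in order, feeding each output to the next step."""
    def run(doc: str) -> str:
        for step in steps:
            doc = step(doc)
        return doc
    return run


# "Expects <in>, produces <out>" written out as explicit validator steps.
checked = pipeline(expect_root("in"), rename_root("out"), expect_root("out"))
result = checked("<in>hi</in>")
```

A declaration on the component would just expand to the first and last
steps of that pipeline.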

I, for one, am disappointed that XSLT can only deal with W3C XML
Schema validation and would be sorely disappointed if we found
ourselves in the same position. I can imagine wanting to check against
a RELAX NG grammar or a Schematron grammar as easily as a W3C XML
Schema grammar.

| I'm concerned about how we define parameters. In particular, I'm
| worried about the XSLT case where one of the parameters for the
| component is a set of QName/value pairs (the XSLT parameters). I guess
| that in this proposal, you'd do that with a parameter called
| 'parameters' whose value was a formatted string such as
| '{uri1}local1=value1; {uri2}local2=value2', or we'd say that the XSLT
| parameters were encoded in an XML document and passed as an *input*.

I was thinking that those would *be* the parameters:

  <step name="xslt">
    <param name="x:y" value="'xxx'"/>
    <param name="a:b" value="'yyy'"/>
    <param name="foo" value="$foo"/>
  </step>

But that does require that the pipeline know about the names of the
parameters. If you want completely dynamic parameters, I think you
might have to stuff them all in a configuration document and tweak
your stylesheet to read that.
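
As a sketch of the configuration-document route, assuming Python and an
invented vocabulary (a "parameters" element holding "param" elements
with namespace, name and value attributes; nothing here is from the
proposal):

```python
import xml.etree.ElementTree as ET
from typing import Mapping


def params_to_document(params: Mapping[str, str]) -> str:
    """Encode {uri}local -> value pairs as a small XML document."""
    root = ET.Element("parameters")  # hypothetical vocabulary
    for qname, value in params.items():
        if qname.startswith("{"):
            uri, local = qname[1:].split("}", 1)
        else:
            uri, local = "", qname  # name in no namespace
        ET.SubElement(
            root, "param", {"namespace": uri, "name": local, "value": value}
        )
    return ET.tostring(root, encoding="unicode")


# The pipeline never needs to know these names in advance.
config = params_to_document({"{uri1}local1": "value1", "foo": "value2"})
```

The resulting document would be passed to the step as an ordinary
input, and the stylesheet would look its parameters up there.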

| You say: "Except as described in conditionals, all components in a
| pipeline are run (in particular, they do not get run only if input
| arrives or output is requested)." I'm not sure of your intention here.

I think Richard was just trying to avoid requiring implementations to
do any sort of backwards or forwards chaining to determine what
components get run.

| I'm worried that this constraint prevents implementations from caching
| and reusing intermediate documents (if they can detect that the
| information that led to the generation of those documents hasn't
| changed). Perhaps we need to look at the question of whether
| components can have side-effects to work out whether this is important
| or not.

I think we should assert that components are side-effect free. Given
the same inputs, they produce the same outputs and you can't tell if
the implementation did the computation twice or cached the results.
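
A small sketch of why that assertion makes caching safe, assuming
Python and components modelled as functions from document to document
(cached and upper_case are hypothetical names; the call counter exists
only to show that the second run is skipped):

```python
import hashlib
from typing import Callable, Dict

Component = Callable[[str], str]


def cached(component: Component) -> Component:
    """Wrap a side-effect-free component so repeated inputs are reused."""
    results: Dict[str, str] = {}

    def run(doc: str) -> str:
        key = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if key not in results:
            results[key] = component(doc)
        return results[key]

    return run


calls = {"count": 0}


def upper_case(doc: str) -> str:
    """Stand-in component; the counter is for demonstration only."""
    calls["count"] += 1
    return doc.upper()


step = cached(upper_case)
first = step("<doc>hello</doc>")
second = step("<doc>hello</doc>")  # served from the cache
```

The caller sees the same output either way, which is exactly the
property that lets an implementation choose.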

| That's enough for now.

No! More! More! :-)

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.

Received on Wednesday, 5 April 2006 18:56:02 UTC