- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Wed, 23 Aug 2006 15:57:40 +0100
- To: Norman Walsh <Norman.Walsh@Sun.COM>
- Cc: public-xml-processing-model-wg@w3.org
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Great work!
Comments follow, of varying degrees of seriousness. . .
Figure 2. A transform and serialize pipeline
I _think_ this raises too many questions to come as only the second
example. I at least was quite baffled/worried at first, that a
'Serialize' step was going to be necessary to get XML documents out
of pipelines. Then I realised you had included it because of the
'all choose branches have same output port configuration'
constraint, but we _really_ don't want to go into that at this
point in the document, do we?
Why not use the test="/my:root/@version < 1.2" schema validate
example at this point?
2.2 Inputs and Outputs
I think we need to say here why it's _not_ a static error if an
output is connected to an input with a different declared
cardinality, i.e. to explicitly explain that we decided it was ok to
connect a sequence-out to a singleton-in, and only complain if the
sequence-out failed to produce exactly one document.
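
To make the intended rule concrete, here is a minimal sketch in Python of
a check that happens at run time rather than statically. The names
bind_singleton_input and CardinalityError are hypothetical, invented for
illustration; nothing here is taken from the draft.

```python
class CardinalityError(Exception):
    """Raised at run time when a singleton input does not get exactly one document."""


def bind_singleton_input(documents):
    """Feed a sequence-valued output into an input declared to take one document.

    Making the connection is always allowed; the complaint is raised only
    once the documents actually produced are known.
    """
    docs = list(documents)
    if len(docs) != 1:
        raise CardinalityError(
            f"expected exactly one document, got {len(docs)}"
        )
    return docs[0]
```

A sequence that happens to contain one document passes through unchanged;
an empty sequence, or one with two or more documents, is reported as a
dynamic error.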
2.3 Parameters
How can a value other than a string "[be] given"? Did we decide
that parameters are specified with XPaths? If so, surely that
should be said here.
2.4 Component graph
"The inputs and outputs . . . _are_ the arcs of that graph"
[emphasis added]? Surely, as in the immediately following
definition, "... are connected by the arcs" is what is wanted?
3 Language Constructs
I'd prefer to have "for-each construct", "viewport construct", etc.,
rather than "for-each component".
3.1 Pipeline
I find the first sentence pretty baffling. . .
3.2 For-Each
Needs some brief motivation, I think, along the lines of
"In cases where a component or sub-pipeline requires a single
document input, but a pipeline needs to process a sequence of
documents with that component, the for-each construct can be used."
The term 'aggregation' is nowhere defined. I think nothing is lost,
and indeed we're better off, if the definition reads:
"The result of the for-each is a sequence of the documents produced
by processing each individual document in the input sequence. If
the for-each subpipeline declares multiple outputs, each output is
a sequence of the documents produced on that output by each
iteration."
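
The proposed wording can be sketched in Python as follows. This is only
an illustration under stated assumptions: for_each, declared_outputs and
the callable subpipeline are hypothetical names, and documents are
modelled as plain strings.

```python
def for_each(input_documents, declared_outputs, subpipeline):
    """Run subpipeline once per input document and collect its outputs.

    subpipeline is assumed to be a callable that takes one document and
    returns a dict mapping each declared output name to the list of
    documents produced on that output for this iteration.
    """
    # Every declared output starts as an empty sequence.
    results = {port: [] for port in declared_outputs}
    for doc in input_documents:
        produced = subpipeline(doc)
        for port in declared_outputs:
            # Append in iteration order, so each output is one flat sequence.
            results[port].extend(produced.get(port, []))
    return results


def example_subpipeline(doc):
    # Hypothetical subpipeline with two outputs.
    return {"result": [doc.upper()], "log": ["saw " + doc]}


outputs = for_each(["a", "b"], ["result", "log"], example_subpipeline)
```

Here outputs["result"] is ["A", "B"] and outputs["log"] is
["saw a", "saw b"]: one sequence per declared output, in input order,
with no separate notion of 'aggregation' needed.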
3.4 Choose
Paras 1 and 5 seem to contradict each other wrt the presence of a
default.
3.5 Try/Catch
That word 'aggregation' again :-)
4 Syntax
I'm OK with using 'instantiate' to describe the relationship between
components and steps (although I'm still not sure about using
'component' for both type and token throughout the first three
sections), but I would much prefer to talk about 'representing' or
'encoding' a pipeline. . . Also in 4.2 Pipeline Vocabulary
4.1.1 Specified by URI
Have we decided whether the schema type of the *href* attribute is
xs:anyURI or (list of xs:anyURI)? I _think_ I see no reason not to
support the latter. Makes the validate component much simpler -- I
just write
<p:input port="schema"
href="http://www.example.com/myvocab
http://www.w3.org/2001/06/soap-envelope.xsd"/>
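
If *href* were typed as a list of xs:anyURI, an implementation would only
need to split the attribute value on whitespace, as XML Schema list types
do. A rough Python sketch, where parse_href_list is a hypothetical helper
name and not anything from the draft:

```python
def parse_href_list(href):
    """Split a whitespace-separated href value into individual URI strings.

    str.split() with no argument splits on any run of whitespace
    (including line breaks) and ignores leading/trailing whitespace.
    """
    return href.split()


uris = parse_href_list(
    """http://www.example.com/myvocab
       http://www.w3.org/2001/06/soap-envelope.xsd"""
)
```

A value holding a single URI yields a one-item list, so the
single-xs:anyURI case remains a special case of the list form.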
4.1.1 Specified by source
The word 'ancestor' is not defined, or immediately obvious -- how
about
". . . must either be declared on some ancestor (e.g. an enclosing
_choose_ or _for-each_) or it must be. . ."
4.1.1 Specified by here document
More than one (non-document) child == sequence allowed?
4.1.2 Editorial Note
Well, we did have: a step is an instantiation of a component, which is
in turn described by a component declaration.
We could have component for both type _and_ token, which is what you
seemed to be going for in section 3, with p:component-declaration
describing the type and p:component corresponding to an instance.
But p:step is so nice and short . . .
4.1.3 Syntactic shortcuts
Arghh! Now we're calling choose a _user-defined_ component. Surely
not. Stick with 'construct', please!
[note here and elsewhere you haven't made up your mind wrt p:param
vs. p:parameter -- I vote for p:(declare-)parameter, because we're
going in the opposite direction from xslt, i.e. if we used p:param,
we'd have the following confusing paradigm:
p:declare-param is to p:param as xsl:param is to xsl:with-param]
4.2.1 p:pipeline Element
[I'm only going to say this once :-]
I'd much prefer
"A p:pipeline represents a _pipeline_. Its children represent
declarations of the inputs, outputs and parameters that the
pipeline exposes and the _subpipeline_ that constitutes
its definition."
4.2.8 p:for-each Element
The term 'aggregate' is nowhere defined, and I find it a bit opaque
at best and misleading at worst. How about replacing the last _two_
sentences before the example with
"For each declared output, the processor will collect all the
documents that are produced for that output from all the
iterations, in order, into a sequence."
[not done, but I'm sending this now and will add more later]
ht
- --
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (GNU/Linux)
iD8DBQFE7GzlkjnJixAXWBoRAvj3AJ9DdrPiqzuKLbjnqiLC9WgejidnZwCeKXGM
KMFPeaXMHFAkP5eVMeRCnxg=
=xsyo
-----END PGP SIGNATURE-----
Received on Wednesday, 23 August 2006 14:58:52 UTC