
Re: "Feature complete" XProc draft

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Mon, 11 Sep 2006 12:37:32 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87hcze9x6b.fsf@nwalsh.com>
/ ht@inf.ed.ac.uk (Henry S. Thompson) was heard to say:
| Figure 2. A transform and serialize pipeline
|   I _think_ this raises too many questions to come as only the second
|   example.  I at least was quite baffled/worried at first, that a
|   'Serialize' step was going to be necessary to get XML documents out
|   of pipelines.  Then I realised you had included it because of the
|   'all choose branches have same output port configuration'
|   constraint, but we _really_ don't want to go in to that at this
|   point in the doc't, do we?
|   Why not use the test="/my:root/@version < 1.2" schema validate
|   example at this point?


| 2.2 Inputs and Outputs
|   I think we need to say here why it's _not_ a static error if an
|   output is connected to an input with a different declared
|   cardinality, i.e. to explicitly explain that we decided it was ok to
|   connect a sequence-out to a singleton-in, and only complain if the
|   sequence-out failed to produce exactly one document.
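
A minimal sketch of the rule described above, assuming the check happens
when the connection is evaluated rather than when the pipeline is parsed.
The names DynamicError and bind_to_singleton_input are hypothetical and
do not come from the draft:

```python
class DynamicError(Exception):
    """Hypothetical run-time error: a singleton input got the wrong number of documents."""


def bind_to_singleton_input(documents):
    """Accept a sequence-valued output on an input that expects one document.

    Connecting the two is not rejected statically; the only complaint
    comes at run time, if the sequence does not hold exactly one document.
    """
    docs = list(documents)  # materialize whatever the upstream step produced
    if len(docs) != 1:
        raise DynamicError(
            f"expected exactly one document, got {len(docs)}"
        )
    return docs[0]
```

A sequence that happens to contain one document passes straight through;
an empty or longer sequence fails only once the step actually runs.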


| 2.3 Parameters
|   How can a value other than a string "[be] given"?  Did we decide
|   that parameters are specified with XPaths?  If so, surely that
|   should be said here.

Fixed, I think.

| 2.4 Component graph
|   "The inputs and outputs . . . _are_ the arcs of that graph"
|   [emphasis added]?  Surely, as in the immediately following
|   definition, "... are connected by the arcs" is what is wanted?

Uhm, yeah, probably :-)

| 3 Language Constructs
|  I'd prefer to have "for-each construct", "viewport construct", etc.,
|  rather than "for-each component".
| 3.1 Pipeline
|  I find the first sentence pretty baffling. . .

I need to make a "terminology" pass.

| 3.2 For-Each
|   Needs some brief motivation, I think, along the lines of
|    "In cases where a component or sub-pipeline requires a single
|    document input, but a pipeline needs to process a sequence of
|    documents with that component, the for-each construct can be used."
|   The term 'aggregation' is nowhere defined.  I think nothing is lost,
|   and indeed we're better off, if the definition reads:
|    The result of the for-each is a sequence of the documents produced
|    by processing each individual document in the input sequence.  If
|    the for-each subpipeline declares multiple outputs, each output is
|    a sequence of the documents produced on that output by each
|    iteration.
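
The proposed definition can be sketched roughly as follows. This is an
illustration only: for_each, the subpipeline callable, and the
dictionary-of-ports shape are assumptions made for the example, not
anything the draft specifies:

```python
def for_each(input_sequence, subpipeline, output_ports):
    """Run a subpipeline once per input document and collect its outputs.

    `subpipeline` is assumed to take one document and return a dict that
    maps each output port name to the list of documents it produced.
    """
    # One result sequence per declared output port.
    results = {port: [] for port in output_ports}
    for doc in input_sequence:
        produced = subpipeline(doc)
        for port in output_ports:
            # Append this iteration's documents, preserving input order.
            results[port].extend(produced.get(port, []))
    return results
```

Each declared output ends up as one sequence holding the documents from
every iteration, in iteration order, which is the behaviour the word
'aggregation' was standing in for.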


| 3.4 Choose
|   Paras 1 and 5 seem to contradict each other wrt the presence of a
|   default.

Clarified, I think.

| 3.5 Try/Catch
|   That word 'aggregation' again :-)


| 4 Syntax
|   I'm OK with using 'instantiate' to describe the relationship between
|   components and steps (although I'm still not sure about using
|   'component' for both type and token throughout the first three
|   sections), but I would much prefer to talk about 'representing' or
|   'encoding' a pipeline. . .  Also in 4.2 Pipeline Vocabulary

I need to make a "terminology" pass.

| 4.1.1 Specified by URI
|   Have we decided whether the schema type of the *href* attribute is
|   xs:anyURI or (list of xs:anyURI)?  I _think_ I see no reason not to
|   support the latter.  Makes the validate component much simpler -- I
|   just write
|    <p:input port="schema"
|             href="http://www.example.com/myvocab
|                   http://www.w3.org/2001/06/soap-envelope.xsd"/>

I don't think we talked about it. Does anyone have qualms about making
it a list of xs:anyURI?
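
For illustration, if href were a list of xs:anyURI, a processor could
split the attribute value on XML whitespace, as XML Schema list types
do. The function name parse_href_list is hypothetical, and no
decision on the schema type is implied:

```python
import re


def parse_href_list(href):
    """Split an href value into URIs, treating it as a whitespace-separated list.

    Only space, tab, CR and LF are treated as separators, matching the
    whitespace characters XML Schema uses for list types.
    """
    return re.findall(r"[^ \t\r\n]+", href)


uris = parse_href_list(
    "http://www.example.com/myvocab\n"
    "                  http://www.w3.org/2001/06/soap-envelope.xsd"
)
# uris now holds the two schema locations from the example above.
```

A single URI is just the one-item case, so existing pipelines that give
one href would be unaffected under this reading.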

| 4.1.1 Specified by source
|   The word 'ancestor' is not defined, or immediately obvious -- how
|   about
|     ". . . must either be declared on some ancestor (e.g. an enclosing
|      _choose_ or _for-each_) or it must be. . ."


| 4.1.1 Specified by here document
|   More than one (non-document) child == sequence allowed?

I don't think so. It would raise the problem of deciding where PIs
and comments bind.

| 4.1.2 Editorial Note
|   Well, we did have step is instantiation of component, in turn
|   described by component declaration.
|   We could have component for both type _and_ token, which is what you
|   seemed to be going for in section 3, with p:component-declaration
|   describing the type and p:component corresponding to an instance.
|   But p:step is so nice and short . . .

Yeah, it's all a bit of a mess now.

| 4.1.3 Syntactic shortcuts
|   Arghh!  Now we're calling choose a _user-defined_ component.  Surely
|   not.  Stick with 'construct', please!

I think we're calling them step containers now.

|   [note here and elsewhere you haven't made up your mind wrt p:param
|   vs. p:parameter -- I vote for p:(declare-)parameter, because we're
|   going in the opposite direction from xslt, i.e. if we used p:param,
|   we'd have the following confusing paradigm:
|      p:declare-param is to p:param as xsl:param is to xsl:with-param]


| 4.2.1 p:pipeline Element
|   [I'm only going to say this once :-]
|   I'd much prefer 
|     "A p:pipeline represents a _pipeline_.  Its children represent
|     declarations of the inputs, outputs and parameters that the
|     pipeline exposes and the _subpipeline_ that constitutes
|     its definition."


| 4.2.8 p:for-each Element
|   The term 'aggregate' is nowhere defined, and I find it a bit opaque
|   at best and misleading at worst.  How about replacing the last _two_
|   sentences before the example with
|      For each declared output, the processor will collect all the
|      documents that are produced for that output from all the
|      iterations, in order, into a sequence.


                                        Be seeing you,

Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.

Received on Monday, 11 September 2006 16:37:15 GMT
