Comments on August 10 editors' draft through section 2.6 from Henry S. Thompson on 2007-08-16 (public-xml-processing-model-wg@w3.org from August 2007)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Thu, 16 Aug 2007 15:38:57 +0100
To: public-xml-processing-model-wg <public-xml-processing-model-wg@w3.org>
Message-ID: <f5bwsvv4k8u.fsf@hildegard.inf.ed.ac.uk>

Abstract:

 producing --> produce

 I think this sentence is too obscure for where it occurs:

   "Some pipelines are entirely self-contained, starting with input
    derived inside the pipeline and producing no XML output."

 and the ", though they are not required to do so." ending to the
 previous sentence is unnecessary, given the "generally" at the
 beginning.

 How about just

   Pipelines generally accept one or more XML documents as input and
   produce one or more XML documents as output.  Pipelines are made up
   of simple steps which perform atomic operations on XML documents
   and constructs similar to conditionals, loops and exception
   handlers which control which steps are executed.

and if you _really_ think we need to say something more, add

   Input and output of some kinds of non-XML documents is possible via
   special-purpose steps.

SoTD:

 We can't go to Last Call with this sentence in place:

  "This draft addresses many, but not all, of the design questions that
   were incomplete in previous drafts."

Introduction:

   "whereas compound steps include a subpipeline of steps within
    themselves."
    -->
   "whereas compound steps control the execution of other steps,
    which they include in the form of one or more subpipelines."

2 Pipeline Concepts:

   "Unless otherwise indicated, implementations must not assume that
    steps are functional . . ."

 I can't find anywhere where we give a functionality guarantee -- can
 we just remove the "Unless otherwise indicated, "?

2.1 Steps:

 Should we make

  "The steps that occur directly inside a compound step are called
   contained steps."

 more accurate, e.g. by saying

  "The steps that occur as children or, in the case of *p:choose* and
   *p:try*, as grandchildren, of a compound step are called contained
   steps."
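
 To make the proposed wording concrete, here is a minimal Python
 sketch of how a processor might collect contained steps under it.
 The Step class, its fields, and the contained_steps helper are
 hypothetical names used for illustration only; nothing here is taken
 from the draft.

```python
from dataclasses import dataclass, field
from typing import List

# Compound steps whose subpipelines sit one level down, inside wrapper
# elements, so that their contained steps are grandchildren.
INDIRECT_CONTAINERS = {"p:choose", "p:try"}


@dataclass
class Step:
    """Hypothetical in-memory model of a step element."""
    type: str
    name: str
    children: List["Step"] = field(default_factory=list)


def contained_steps(step: Step) -> List[Step]:
    """Return the contained steps of a compound step.

    Children for most compound steps; grandchildren for p:choose
    and p:try, as in the proposed wording.
    """
    if step.type in INDIRECT_CONTAINERS:
        return [g for child in step.children for g in child.children]
    return list(step.children)
```

 Under this reading the wrapper elements inside p:choose and p:try are
 skipped over and are not themselves counted as contained steps.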

 -----

 I'm not clear why we call out implementation-defined compound steps
 (pfx:other-step in the production for 'subpipeline').  Where do we
 define the behaviour of a processor when it sees one such?  How does
 a processor tell the difference between a pfx:other-step and a
 ipfx:ignored?  The way I read 3.6 only ipfx:ignored should be in that
 production.  I think we should get rid of the 'compound' part of 4.7
 altogether, and simply observe that extensions may be simple or
 compound, and will have implementation-defined syntax and semantics.

 After all, the 'simple' part of 4.7 isn't about extensions at all,
 it's about user-defined step types, which are distinct from
 extensions.

 -----

 When/why did we decide that a step can't have both an input and an
 output port with the same name?  What bug does this forestall?

2.1.1 Step names:

 In the discussion of the example, both p:choose and p:when are given
 names, so one concludes they are steps.  But in the production for
 subpipeline above, only p:choose appears.  I think we need to say
 something explicit, sooner rather than later, about the slightly
 anomalous state of p:when, p:otherwise and p:catch, perhaps along the
 lines of:

   p:when, p:otherwise and p:catch have a special status.  Although
   they behave in many ways as compound steps, in that they wrap a
   subpipeline and may specify inputs and outputs, they are not
   independent.  They *must* only appear immediately within p:choose
   (p:when and p:otherwise) or p:try (p:catch).

   On the other hand p:choose and p:try are special too, in that they
   *must not* contain other steps directly, but only indirectly.

 which might go somewhere near the end of 2.1.
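
 The first of those two constraints is easy to check mechanically.
 The following is a rough Python sketch, assuming a pipeline has
 already been read into nested (element name, children) tuples; that
 representation and the misplaced function are assumptions made for
 the example. The second constraint is deliberately not checked,
 since the wrapper elements allowed inside p:try are not listed here.

```python
from typing import List, Optional, Tuple

# Hypothetical tree shape: (element name, list of child nodes).
Node = Tuple[str, list]

# Each dependent element and the only parent it may appear within.
REQUIRED_PARENT = {
    "p:when": "p:choose",
    "p:otherwise": "p:choose",
    "p:catch": "p:try",
}


def misplaced(node: Node, parent_name: Optional[str] = None) -> List[str]:
    """Return one message per p:when, p:otherwise or p:catch that does
    not appear immediately within its required parent."""
    name, children = node
    errors = []
    required = REQUIRED_PARENT.get(name)
    if required is not None and parent_name != required:
        errors.append(
            f"{name} must appear immediately within {required}, "
            f"not {parent_name}"
        )
    for child in children:
        errors.extend(misplaced(child, name))
    return errors
```

 An empty result means every p:when, p:otherwise and p:catch in the
 tree sits directly inside its p:choose or p:try.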

2.2 Inputs and Outputs:

 The list of allowed sources for input bindings needs something added
 along the lines of

   * A special port provided by an ancestor compound step,
     e.g. 'current' for p:for-each and p:viewport

 Similarly, the second bullet of allowed destinations for outputs
 should read something like:

   * One of the outputs declared on the top-level p:pipeline step, or
     on its container.
 
 Note the asymmetry between the above two changes -- is it correct?

2.3 Primary Inputs and Outputs:

 I'm not happy that the definitions of primary i/o are contradicted
 by the prose which immediately follows.  I suggest they be rewritten
 along the following lines:

  [Definition: If a step has a document input port which is explicitly
  marked "primary='yes'", or if it has exactly one document input
  port, and that port is _not_ explicitly marked "primary='no'", then
  that input port is the primary input port of the step.] If a step
  has a single input port and that port is explicitly designated as
  not being the primary input port, or if a step has more than one
  input port and none is explicitly designated the primary, then the
  primary input port of that step is undefined.

  It is an error (xxx) for a step to explicitly designate more than
  one document input port as primary.

 and similarly for output ports.
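
 Read as a decision procedure, the proposed definition for input
 ports could look like the Python sketch below. The function name,
 the dict representation of a step's document input ports, and the
 exception class are assumptions made for the example; the "xxx"
 error code is left unspecified, as above.

```python
from typing import Dict, Optional


class PrimaryPortError(Exception):
    """More than one document input port is explicitly marked primary."""


def primary_input_port(ports: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the primary input port name, or None if it is undefined.

    `ports` maps each document input port name to its explicit
    `primary` marking: "yes", "no", or None if the attribute is absent.
    """
    marked_yes = [name for name, mark in ports.items() if mark == "yes"]
    if len(marked_yes) > 1:
        raise PrimaryPortError("more than one port marked primary='yes'")
    if marked_yes:
        # Explicitly marked primary='yes'.
        return marked_yes[0]
    if len(ports) == 1:
        # Exactly one port: primary unless explicitly marked 'no'.
        (name, mark), = ports.items()
        if mark != "no":
            return name
    # A single port marked 'no', or several ports with none marked.
    return None
```

 The same function would serve for output ports if it were given the
 step's output port markings instead.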

 ------

 The following can't be right:

  "Additionally, if a compound step has no declared inputs and the
   first step in its subpipeline has an unbound primary input, then an
   implicit primary input port (named “source”) will be added
   to the compound step."

Compound steps other than p:pipeline _can't have_ declared inputs.
The passing of the _default readable port_ down is handled by the
general rule in 2.7, modified if necessary by the individual compound
steps.

I think this was drafted in false parallel to the subsequent para, and
should be changed to _only_ apply to p:pipeline, as follows:

  "Additionally, if a p:pipeline step has no declared inputs and the
   first step in its subpipeline has an unbound primary input, then an
   implicit primary input port (named “source”) will be
   added to the p:pipeline (and consequently bound to the first step's
   primary input port)."

A similar parenthesis could be added to the parallel sentence for
outputs:

  "If a compound step has no declared outputs and the last step in
   its subpipeline has an unbound primary output, then an implicit
   primary output port (named “result”) will be added to the
   compound step (and consequently the last step's primary output will
   be bound to it)."
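
 The two reworded rules, taken together, can be sketched in Python as
 follows. This is only an illustration under simplifying assumptions:
 the Step model, its field names, and the use of a set of
 already-bound port names are all hypothetical, and "bound" is
 reduced to a yes/no flag per port.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Step:
    """Hypothetical model of a step and its ports."""
    type: str
    inputs: List[str] = field(default_factory=list)    # declared inputs
    outputs: List[str] = field(default_factory=list)   # declared outputs
    primary_input: Optional[str] = None
    primary_output: Optional[str] = None
    bound: Set[str] = field(default_factory=set)       # ports with a binding
    subpipeline: List["Step"] = field(default_factory=list)


def add_implicit_ports(step: Step) -> None:
    """Add the implicit "source" and "result" ports where the rules apply."""
    if not step.subpipeline:
        return
    first, last = step.subpipeline[0], step.subpipeline[-1]

    # Input rule: applies only to p:pipeline.
    if (step.type == "p:pipeline" and not step.inputs
            and first.primary_input is not None
            and first.primary_input not in first.bound):
        step.inputs.append("source")
        step.primary_input = "source"
        first.bound.add(first.primary_input)   # consequently bound

    # Output rule: applies to any compound step.
    if (not step.outputs
            and last.primary_output is not None
            and last.primary_output not in last.bound):
        step.outputs.append("result")
        step.primary_output = "result"
        last.bound.add(last.primary_output)    # consequently bound
```

 Note that the input branch is guarded by the step type while the
 output branch is not, which mirrors the difference between the two
 sentences.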


 ------

 Overall, I am a bit worried about the bulleted lists here, as they
 are not all exactly right -- perhaps at the end of the section we
 could add something such as

   The above lists of possible port-to-port connections are only
   summaries---the exact details are given below in sections 2.7 and
   5.11.

2.5 Parameters:

 A parallel change to the one suggested above should be made to the
 definition of primary parameter input port.

 ------

 availble -> available


[C.2 Fragment Identifiers:

  Excellent!]

I'm going to ship this and pick up with section 2.7 in a subsequent
message.

ht
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Thursday, 16 August 2007 14:39:07 UTC