Thinking about graphs and containers

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Thu, 21 Sep 2006 18:32:27 +0100
To: public-xml-processing-model-wg@w3.org
Message-ID: <f5b64fh86s4.fsf@erasmus.inf.ed.ac.uk>

I think one of the reasons I'm attracted to the nested-graph story is
that it allows us to tell a very simple story about the arcs of every
graph (all quotes from our forthcoming WD [1]):

  "A _pipeline_ is an acyclic, directed graph of components connected
   together by inputs and outputs"

  "The components contained in a pipeline or other container are the
   nodes of a flow graph. The input and output ports of the components
   are connected by arcs in that graph."

  "[w]hat flows between components as inputs and outputs are
   exclusively XML documents or sequences of XML documents."

What all this adds up to is that the arcs in the flow graphs are very
simple -- they carry documents or sequences of documents from an output
of one component to an input of another.  Or, to put it another way,
if we are careful to draw all and only arcs which meet that
description, then we can be confident that we can understand every
flow-graph as meaning: Each time this flow-graph is evaluated without
error, one or more documents will flow down each arc, and each
component in the graph will be evaluated once.
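
As an illustration only (nothing here comes from the WD), the reading
above can be sketched in Python.  The names Arc and evaluate, the port
names, and the use of plain strings as stand-ins for XML documents are
all assumptions made for the sketch: each arc carries a sequence of
documents from one output port to one input port, and one evaluation of
the graph runs every component exactly once.

```python
from collections import namedtuple
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical model: an arc joins one output port to one input port.
Arc = namedtuple("Arc", "src src_port dst dst_port")


def evaluate(components, arcs, sources):
    """Evaluate an acyclic flow graph once.

    components: name -> function taking {input port: [docs]}
                and returning {output port: [docs]}
    arcs:       list of Arc
    sources:    (component, input port) -> [docs] supplied from outside
    Returns (outputs per component, evaluation count per component).
    """
    # Predecessor map; TopologicalSorter raises CycleError on a cycle.
    preds = {name: set() for name in components}
    for arc in arcs:
        preds[arc.dst].add(arc.src)

    outputs, counts = {}, {}
    for name in TopologicalSorter(preds).static_order():
        inputs = {port: docs
                  for (comp, port), docs in sources.items() if comp == name}
        for arc in arcs:
            if arc.dst == name:
                # The arc carries the whole document sequence, unchanged.
                inputs[arc.dst_port] = outputs[arc.src][arc.src_port]
        outputs[name] = components[name](inputs)
        counts[name] = counts.get(name, 0) + 1
    return outputs, counts


# Two toy components, invented for this sketch.
components = {
    "upper": lambda ins: {"result": [d.upper() for d in ins["source"]]},
    "wrap": lambda ins: {"result": ["<w>" + d + "</w>" for d in ins["source"]]},
}
arcs = [Arc("upper", "result", "wrap", "source")]
outputs, counts = evaluate(
    components, arcs, {("upper", "source"): ["<a/>", "<b/>"]}
)
print(outputs["wrap"]["result"])
print(counts)
```

The topological order is just one convenient way to guarantee that a
component's inputs are ready before it runs; the point of the sketch is
only that every arc is used once per evaluation and every component is
counted once.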

This means that something I was a bit embarrassed about with respect to
the nested-pipeline story turns out to be a virtue: when you
look at the whole picture, with the boxes exploded and everything,
there are some gaps.  Those gaps are between the constructs and their
sub-pipeline(s).  And that's precisely right, because in (almost*)
every case, that's where the magic is.  It's not right to draw arcs
from the top of the *choose* box to the inputs of its subpipelines,
because each time the choose is evaluated, documents will flow down
only one of them.  It's not right to draw an arc from the top of a
*for-each* box to the input of its subpipeline, because each time the
for-each is evaluated, documents will flow down that arc more than
once.  Likewise for the output connections.

In other words, the semantics of each construct tell you how its
subpipelines get their input(s), and what happens to their output(s),
but in general it's _not_ as simple as just plugging in a vanilla arc.
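
A rough Python sketch of why a vanilla arc does not fit these two
constructs.  It assumes, for illustration only, that choose runs the
first subpipeline whose test accepts the input and that for-each runs
its subpipeline once per document; the function names, the traced
helper and the string "documents" are all hypothetical.

```python
def choose(docs, branches, otherwise):
    """Run exactly one subpipeline: the first whose test accepts docs."""
    for test, subpipeline in branches:
        if test(docs):
            return subpipeline(docs)
    return otherwise(docs)


def for_each(docs, subpipeline):
    """Run the subpipeline once per document; concatenate the results."""
    results = []
    for doc in docs:
        results.extend(subpipeline([doc]))
    return results


calls = []  # records which subpipeline ran, and how often


def traced(name, fn):
    """Wrap a subpipeline so that each evaluation is logged in calls."""
    def run(docs):
        calls.append(name)
        return fn(docs)
    return run


tag = traced("tag", lambda docs: ["<t>" + d + "</t>" for d in docs])
drop = traced("drop", lambda docs: [])
keep = traced("keep", lambda docs: list(docs))

each = for_each(["<a/>", "<b/>", "<c/>"], tag)
chosen = choose(["<a/>"], [(lambda docs: len(docs) > 1, drop)], keep)
print(each)
print(chosen)
print(calls)
```

In the log, tag appears three times and drop not at all, so neither
connection behaves like an arc that carries documents exactly once per
evaluation of the enclosing graph.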

I hope this helps, even if it's not fully baked.

ht

* *group* is an (the only?) exception, in that its input and output
  connections are _not_ magic, I don't think, and so they _could_ be
  captured with vanilla arcs.

[1] http://www.w3.org/XML/XProc/docs/langspec.html
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Thursday, 21 September 2006 17:32:44 GMT