- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Wed, 06 Jun 2007 20:06:44 +0100
- To: public-xml-processing-model-wg <public-xml-processing-model-wg@w3.org>
I want to try to work this through in _excruciating_ detail, because I'm not (yet) convinced we have a coherent proposition.

First, some terminology:

An XPath expression is evaluated either *by the runtime*, that is, the pipeline engine itself has to do the evaluation as part of its overall work, or *by a component*, that is, a component implementation itself knows that some string is an XPath expression and needs to get it evaluated. In either case, of course, the real evaluation work may well be done by a library, indeed the same library -- what really matters is who decides to do the evaluation, when they do it, and what they specify for the context.

A component *binds the position for a port* by doing whatever is necessary to determine what the value of position() will be when an XPath expression is evaluated _by that component_ with respect to that port.

Second, a stipulation:

We are going to use position() to signal where we are in a sequence of documents. We would like this to be true equally when a component is processing a sequence of inputs on some port, and when a component is iterating over a sequence (today, that means p:for-each and p:viewport). Specifying exactly _where_ position() means _what_ is the goal of this message.

More terminology: call the first use/meaning the *sequence* position and the second use/meaning the *iteration* position.

Third, an observation:

Only a component which itself accepts sequences as input can possibly ever _bind the sequence position_, because only it can know how many of its input documents it has read. For atomic components, this means that only XPath expressions evaluated _by that component_ can access the _sequence_ position.

So far, so good, I think.
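To make the observation above concrete, here is a minimal sketch in Python (not XProc) of a component that reads a sequence on one port and keeps its own count of documents read. Everything in it is hypothetical: `run_sequence_component` stands in for an atomic component, and the `expr` callable stands in for an XPath expression that the component itself evaluates, receiving the current document and the value that position() would return.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stand-in for an XPath expression evaluated *by the component*:
# it receives the current document and the value position() would return.
XPathLike = Callable[[str, int], object]


def run_sequence_component(documents: Iterable[str], expr: XPathLike) -> List[object]:
    """Hypothetical atomic component accepting a sequence on a single port."""
    results: List[object] = []
    position = 1  # the component binds the position itself
    for doc in documents:
        # Only the component knows how many documents it has read so far,
        # so only it can supply the sequence position to the expression.
        results.append(expr(doc, position))
        position += 1  # move on once this document has been processed
    return results


def stream() -> Iterator[str]:
    """A lazily produced document sequence; its length is not known up front."""
    yield from ["<a/>", "<b/>", "<c/>"]


labelled = run_sequence_component(stream(), lambda doc, position: f"{position}:{doc}")
print(labelled)
```

Nothing outside the loop can see the counter, which is the sense in which only expressions evaluated by the component can access the sequence position.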
We can now state carefully one requirement on components:

R1) Components which evaluate XPath expressions MUST

  a) For each XPath they evaluate, identify the input port with respect to which they evaluate it;

  b) _Bind the position for all ports_ to 1 before evaluating any XPath expressions, and _increment the position of a port_ after finishing the processing of each input document on that port.

[Note that for ports which don't accept sequences, this means the position will always be _bound to_ 1.]

But what about iteration? I won't bore you, unless you press me, with the reasoning that gets me here, but this is the only way forward I've found which I think works cleanly.

The crucial point is to observe (or be willing to stipulate) that the _sequence_ position for an iterator is the _iteration_ position for its contained components. Accordingly we can get what we need as follows:

R2) Compound components which iterate one or more subpipelines with respect to some document sequence MUST arrange to

  a) _bind the position_ (for the runtime, see below) to 1 before the first iteration of any subpipeline;

  b) _increment the position_ (for the runtime) before each subsequent iteration of any subpipeline.

By 'for the runtime' is meant that the position binding is the one which will be used when an XPath expression is evaluated _by the runtime_ during the execution of the relevant subpipeline(s).

Finally, a necessary observation: Options given a value with 'select=' have the specified XPath evaluated *by the runtime*. Options known to a component to be XPaths (typically, but not necessarily, given a value with 'value=') have the specified XPath evaluated *by the component*.

Superficial consequence:

  a) position() in <p:option ... select='...position()...'/> gives _iteration_ position;

  b) position() in <p:option ... value='...position()...'/> gives _sequence_ position (if it's treated as an XPath at all).

(In either case, when no sequence/iteration is relevant, position() gives 1.
For _iteration_ position, this requires the top-level pipeline to _bind the position_ (for the runtime) to 1.)

I find it easiest to think of this in terms of position having a sort-of special slot in the environment. p:for-each and p:viewport initialise and increment that value for each run of their subpipelines, so select= options in those subpipelines can access it (that is, the iteration number) with position(). Sequence-consuming components internally initialise and increment a binding for that value local to themselves, so e.g. value= options which they know to be XPaths and evaluate can access it (that is, the sequence number) with position().

Phew! This works, but will _anyone_ understand it? Can someone explain it in simpler terms, supposing you agree it's right in principle?

Examples to follow (sorry, this has taken _far_ too long and I have to go cook!).

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
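As an illustration of the "special slot in the environment" picture in the message above, the following Python sketch models the two bindings side by side. It is a toy under stated assumptions, not an XProc implementation: `Environment`, `for_each` and `subpipeline` are hypothetical names, documents are plain strings, and restoring the outer position once the iteration ends is an assumption of this sketch that the message does not address.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Row = Tuple[int, int, str]  # (iteration position, sequence position, document)


@dataclass
class Environment:
    """Hypothetical runtime environment with a slot for position()."""
    position: int = 1  # the top-level pipeline binds the runtime position to 1


def for_each(env: Environment, documents: List[str],
             subpipeline: Callable[[Environment, str], List[Row]]) -> List[List[Row]]:
    """Hypothetical iterator in the style of p:for-each (cf. R2)."""
    saved = env.position  # assumption: the outer binding is restored afterwards
    outputs: List[List[Row]] = []
    try:
        for iteration, doc in enumerate(documents, start=1):
            env.position = iteration  # 1 for the first run, then incremented
            outputs.append(subpipeline(env, doc))
    finally:
        env.position = saved
    return outputs


def subpipeline(env: Environment, doc: str) -> List[Row]:
    """Hypothetical subpipeline containing one sequence-consuming step."""
    # A select=-style option: evaluated by the runtime, sees the iteration position.
    select_value = env.position
    rows: List[Row] = []
    # A value=-style option: evaluated by the component against its own local
    # binding, which counts the documents it has read on its input port.
    sequence_position = 1
    for part in doc.split():  # stand-in for a sequence arriving on that port
        rows.append((select_value, sequence_position, part))
        sequence_position += 1
    return rows


env = Environment()
result = for_each(env, ["a b", "c"], subpipeline)
print(result)
```

In each row the first number is what position() would give in a select= option and the second is what it would give in a value= option that the component evaluates; the two advance independently.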
Received on Wednesday, 6 June 2007 19:06:53 UTC