- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Fri, 26 May 2006 09:41:37 +0100
- To: public-xml-processing-model-wg@w3.org
Hi,
The minutes at http://www.w3.org/XML/XProc/2006/05/25-minutes.html say:
> Richard: If all the inputs are available as documents that you can refer
> to by name in XPath expressions, this results in a hidden dependency
> within XPaths.
> ... In order to determine which components have to have been evaluated,
> you have to peek into the XPath to see what inputs it relies on.
> ... That seems to be a minor implementation annoyance but a good way of
> hiding dependencies.
> ... which is a bad thing.
> ... It means that two things in apparently unrelated branches of the
> pipeline may have to wait for each other because of the XPath expression
> one uses.
> ... It really is just a syntax issue on one level in that you could draw
> all the lines in explicitly. It's just that it's buried deep down in the
> syntax.
I think I understand the point Richard's making here: if we have a
conditional like the following (in my preferred syntax):
<p:choose>
  <p:input name="input" />
  <p:variable name="xsl-stylesheet-pi"
              select="$input/processing-instruction('xsl-stylesheet')" />
  <p:when test="$xsl-stylesheet-pi and
                contains($xsl-stylesheet-pi,
                         'type="text/xsl"')">
    ...
  </p:when>
  <p:otherwise>
    ...
  </p:otherwise>
  <p:output name="output" select="$output" />
</p:choose>
then to understand that the condition:
  $xsl-stylesheet-pi and
  contains($xsl-stylesheet-pi,
           'type="text/xsl"')
relies on the input to the <p:choose>, the implementation has to look at
the condition XPath and see that it refers to $xsl-stylesheet-pi, then
look at the definition of the variable $xsl-stylesheet-pi and see that
the XPath that supplies its value refers to the variable $input, and
thus work out that the condition relies on the input to the <p:choose>.
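To make that chain of lookups concrete, here is a rough Python sketch of what an implementation might do. It is only an illustration: the function names are made up, and the regular expressions are a simplification that strips string literals and then looks for $name tokens, not a real XPath parser.

```python
import re

# Simplified assumption: drop string literals, then treat every $name as a
# variable reference. A real implementation would use a proper XPath parser.
_STRING_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
_VARIABLE_REF = re.compile(r"\$([A-Za-z_][\w.-]*)")


def variable_refs(xpath):
    """Return the set of variable names referenced in an XPath expression."""
    without_strings = _STRING_LITERAL.sub("", xpath)
    return set(_VARIABLE_REF.findall(without_strings))


def input_dependencies(xpath, variables, inputs):
    """Work out which declared inputs an XPath ultimately relies on.

    variables maps a variable name to its select expression;
    inputs is the set of input names visible as variables.
    """
    found = set()
    seen = set()
    pending = list(variable_refs(xpath))
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in inputs:
            found.add(name)
        elif name in variables:
            # Follow the variable back to whatever its own XPath refers to.
            pending.extend(variable_refs(variables[name]))
    return found


variables = {
    "xsl-stylesheet-pi": "$input/processing-instruction('xsl-stylesheet')",
}
condition = ("$xsl-stylesheet-pi and "
             "contains($xsl-stylesheet-pi, 'type=\"text/xsl\"')")
print(input_dependencies(condition, variables, {"input"}))  # {'input'}
```

The condition never mentions $input directly; the dependency only shows up once the variable definition has been followed.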
What I don't understand is how things are actually any better when the
input is referenced by setting the context node instead. The above would
look like this (|s indicate changed lines):
<p:choose>
  <p:input name="input" />
| <p:output name="output" ref="output" />
| <p:variable name="xsl-stylesheet-pi" context="input"
|             select="processing-instruction('xsl-stylesheet')" />
  <p:when test="$xsl-stylesheet-pi and
                contains($xsl-stylesheet-pi,
                         'type="text/xsl"')">
    ...
  </p:when>
  <p:otherwise>
    ...
  </p:otherwise>
</p:choose>
The implementation still has to look inside the condition XPath to work
out that $xsl-stylesheet-pi has been referenced. It's then easier to
work out that this variable references the input to the <p:choose>, but
the implementation still has to look for variable references within the
XPath expression used to set the $xsl-stylesheet-pi variable in case
other variables (which might rely on other documents) have been referenced.
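As a rough Python sketch of why the context attribute doesn't remove the need to scan XPaths (the function names and the $same-root/$other variables are made up for illustration, and the regular expressions are a simplification rather than a real XPath parser):

```python
import re

# Simplified assumption: drop string literals, then treat every $name as a
# variable reference. Not a full XPath lexer.
_STRING_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
_VARIABLE_REF = re.compile(r"\$([A-Za-z_][\w.-]*)")


def refs(xpath):
    """Return the set of variable names referenced in an XPath expression."""
    return set(_VARIABLE_REF.findall(_STRING_LITERAL.sub("", xpath)))


def dependencies(context, xpath, variables):
    """Return the inputs an expression relies on.

    context is the input named by the context attribute (or None);
    variables maps a name to a (context, select) pair.
    """
    found = {context} if context else set()
    seen = set()
    pending = list(refs(xpath))
    while pending:
        name = pending.pop()
        if name in seen or name not in variables:
            continue
        seen.add(name)
        var_context, select = variables[name]
        if var_context:
            # The easy part: the context attribute names the input directly.
            found.add(var_context)
        # The part that doesn't go away: the select may use other variables.
        pending.extend(refs(select))
    return found


variables = {
    "xsl-stylesheet-pi": ("input", "processing-instruction('xsl-stylesheet')"),
    # Hypothetical variables, only here to show a hidden second dependency.
    "same-root": ("doc1", "name(*) = name($other/*)"),
    "other": ("doc2", "."),
}
condition = ("$xsl-stylesheet-pi and "
             "contains($xsl-stylesheet-pi, 'type=\"text/xsl\"')")
print(sorted(dependencies(None, condition, variables)))      # ['input']
print(sorted(dependencies(None, "$same-root", variables)))   # ['doc1', 'doc2']
```

The second call is the case I'm worried about: the variable's context attribute only names doc1, and the reliance on doc2 is still found only by looking inside the select expression.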
As far as I can tell, so long as we have variables that can be set based
on inputs then we have to look in XPaths for dependencies. Of course we
could ban variables altogether, or only allow them to be used to
manipulate parameter values, in which case the above would have to be
written:
<p:choose>
  <p:input name="input" />
  <p:output name="output" ref="output" />
| <p:when context="input"
|         test="processing-instruction('xsl-stylesheet') and
|               contains(processing-instruction('xsl-stylesheet'),
|                        'type="text/xsl"')">
    ...
  </p:when>
  <p:otherwise>
    ...
  </p:otherwise>
</p:choose>
Although I don't find the above very usable (because of the required
repetition of location paths), it has the virtue of being absolutely
clear where the dependencies lie.
> Richard: I think the issue of strings is a red herring. Though I agree
> that we should restrict them to strings now, that doesn't mean we can't
> make them more complex in the future.
> ... If the functionality that's needed is the ability to refer to multiple
> documents, it could be done more explicitly. There could be a syntax that
> bound variables to the names of outputs of other steps. That at least
> would make it explicit which ones were being used.
From a user's standpoint, I find something like:
<p:choose>
  <p:input name="doc1" />
  <p:input name="doc2" />
  ...
  <p:variable name="doc1" context="doc1" select="." />
  <p:variable name="doc2" context="doc2" select="." />
  <p:when test="name($doc1/*) = name($doc2/*)">
    ...
  </p:when>
  ...
</p:choose>
more obscure than:
<p:choose>
  <p:input name="doc1" />
  <p:input name="doc2" />
  ...
  <p:when test="name($doc1/*) = name($doc2/*)">
    ...
  </p:when>
  ...
</p:choose>
I don't see how the first option is more explicit. Under the first
option, to work out what the condition is actually doing I have to work
back through the variable definitions to the inputs. In the second, I
have one less level of indirection to worry about, which makes things
easier for me.
If we have the functionality of queries over multiple documents *at
all*, then I really don't see how one method is any simpler than the
other for the implementation, and I definitely think that assigning
inputs to variables is easier for the user.
[snip]
> Proposal: XPath expressions will be evaluated over exactly one input,
> syntactic details unresolved.
I feel pretty strongly that this is the wrong way to go, but if I
haven't managed to convince anyone of that by next week then I don't
want to hold up progress on the draft.
Cheers,
Jeni
--
Jeni Tennison
http://www.jenitennison.com
Received on Friday, 26 May 2006 08:41:57 UTC