- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Fri, 03 Nov 2006 09:36:16 +0000
- To: public-xml-processing-model-wg@w3.org
Hi,
Norman Walsh wrote:
> On input, viewport, and for-each, it seems like we have three choices:
>
> 1. Use 'select' semantics in every case.
> 2. Use 'match' semantics in every case.
> 3. Use 'select' semantics for some and 'match' for others.
>
> I have a marginal preference for 1 or 2 on the basis that it's easier
> to explain to users. And I think select semantics are easier to explain
> and make more sense in the case of p:input, so I favor 1.
>
> But Alex and Henry have both expressed a preference for match
> semantics at least on viewport and maybe on for-each.
>
> What do others think?
I don't buy Alex's streamability arguments: there's a subset of XPath
expressions that are streamable, and there's a subset of patterns that
are streamable. Patterns don't automatically give you streamability. I
don't think the ease of doing streamability analysis should be of high
priority.
I'm more open to the usability arguments, but I think they can be quite
subjective. For example, although Henry finds it easier to write "div"
than "//div", the vast majority of XSLT newbies will naturally write
"//div" in their match patterns. In my experience, newbies generally
don't understand the difference between the two and tend to use
expressions by default (and wonder why they don't work when they try to
use one that isn't a pattern).
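The "div" versus "//div" difference can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's xml.etree.ElementTree (which supports a small XPath subset and needs ".//div" where XPath writes "//div"), the sample document is made up, and the `holder` element is a stand-in for the XPath root node, which ElementTree does not model.

```python
import xml.etree.ElementTree as ET

# Made-up sample document: two nested divs and one inside a section.
XML = '<doc><div id="a"><div id="b"/></div><section><div id="c"/></section></doc>'

def load(xml_text):
    # ElementTree has no document node, so a holder element stands in for
    # the XPath root node that sits above the document element.
    holder = ET.Element("holder")
    holder.append(ET.fromstring(xml_text))
    return holder

def select(root, path):
    """Select semantics: evaluate the path once, with the root node as context."""
    return [e.get("id") for e in root.findall(path)]

def match(root, pattern):
    """Match semantics: a node matches if the pattern, evaluated as an
    expression from one of its ancestors, selects it. Trying every node
    as context is enough here because a child step only finds children."""
    hits = set()
    for context in root.iter():
        hits.update(context.findall(pattern))
    # Report the matched nodes in document order.
    return [e.get("id") for e in root.iter() if e in hits]

root = load(XML)
as_expression = select(root, "div")    # no <div> is a child of the root node
as_full_path = select(root, ".//div")  # ElementTree's spelling of "//div"
as_pattern = match(root, "div")        # every <div>, wherever it sits
```

As an expression evaluated at the root node, "div" selects nothing in this document, while as a pattern it matches the same three elements that "//div" selects.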
Looking at the distinction in XSLT, patterns are used when you are
already looking at a specific subset of nodes. For example, the 'count'
attribute on <xsl:number> holds a pattern because you're already only
looking at the ancestors of the current node; in XSLT 2.0, the
'group-starting-with' attribute on <xsl:for-each-group> holds a pattern
because it's compared with the nodes in the selected sequence. The
'match' on <xsl:template> only looks at the nodes that you've applied
templates to.
It seems to me that in XProc, the subset of nodes we're examining for
matches is "all the nodes", which isn't really a subset. Given that,
for consistency with XSLT, I think that we should really be using an
expression.
I'm particularly concerned by the idea of for-each using a pattern,
given that <xsl:for-each> uses an expression. I think that the 'select'
on input and for-each should do the same thing, since they have the same
semantic. I don't have strong views on viewport, since the 'select'
there has a completely different semantic (and needs to be renamed anyway).
I'm also concerned that if we use patterns then users can no longer
(easily) do some things that they might want to do, and I'd like to see
some discussion on that. There are three things that you can do with an
expression that you can't easily do with a pattern: identify nodes with
a function, use axes other than child, attribute and descendant-or-self,
and use a positional predicate on a node set. But each of these *can* be
written as a pattern.
For example,
<p:for-each>
<p:input port="doc" ... select="rdf:resources(.)" />
</p:for-each>
might return one document per "resource" in an RDF document. The
equivalent pattern is:
*[count(.|rdf:resources(/)) = count(rdf:resources(/))]
Unless you have a fairly clever implementation, that pattern is going to
be pretty computationally expensive, since the function call is
re-evaluated for each candidate node, and it's not something that most
users will be able to write without consulting FAQs.
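The count-based membership test, and why a naive processor pays for it, can be sketched as follows. Everything here is an assumption made for illustration: `resources` is a hypothetical stand-in for rdf:resources(/) that simply returns elements carrying an `about` attribute, and the sample document is invented.

```python
import xml.etree.ElementTree as ET

calls = 0  # how many times the stand-in function has been evaluated

def resources(root):
    """Hypothetical stand-in for rdf:resources(/): elements with an 'about' attribute."""
    global calls
    calls += 1
    return {e for e in root.iter() if e.get("about") is not None}

def matches(node, root):
    # Mirrors *[count(.|rdf:resources(/)) = count(rdf:resources(/))]:
    # the union keeps its size only if the node is already in the set.
    return len({node} | resources(root)) == len(resources(root))

root = ET.fromstring('<r><x about="1"/><y/><z about="2"><w/></z></r>')

# Expression style: evaluate the function once and use its result.
selected = sorted(e.tag for e in resources(root))
select_calls = calls

# Pattern style: test every candidate node, re-evaluating the function each time.
calls = 0
matched = [e.tag for e in root.iter() if matches(e, root)]
match_calls = calls
```

Both approaches pick out the same elements, but the naive pattern test evaluates the function twice per candidate node, where the expression evaluates it once.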
Another example is
//dt/following-sibling::dd[1]
to get the first definition for each term in a definition list. Given
that the context node for these expressions is always the root node,
these are fairly easy to rewrite as patterns:
dd[preceding-sibling::*[1][self::dt]]
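A quick way to compare the expression and the pattern is to hand-code both. The sketch below does that with Python's xml.etree.ElementTree, which has no following-sibling or preceding-sibling axis, so the sibling logic is written out by hand; the two sample lists are invented. It suggests that the two forms agree when a list contains only dt and dd elements, and can differ if some other element sits between a dt and its dd.

```python
import xml.etree.ElementTree as ET

def in_document_order(root, hits):
    return [e.get("id") for e in root.iter() if e in hits]

def by_expression(root):
    """//dt/following-sibling::dd[1]: for each dt, its first following dd sibling."""
    hits = set()
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag == "dt":
                later_dds = [c for c in children[i + 1:] if c.tag == "dd"]
                if later_dds:
                    hits.add(later_dds[0])
    return in_document_order(root, hits)

def by_pattern(root):
    """dd[preceding-sibling::*[1][self::dt]]: a dd whose nearest preceding
    sibling element is a dt."""
    hits = set()
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag == "dd" and i > 0 and children[i - 1].tag == "dt":
                hits.add(child)
    return in_document_order(root, hits)

# A list holding only dt and dd elements.
regular = ET.fromstring(
    '<dl><dt/><dt/><dd id="d1"/><dd id="d2"/><dt/><dd id="d3"/></dl>')
# A list with a stray element between the term and its definition.
irregular = ET.fromstring('<dl><dt/><p/><dd id="d1"/></dl>')

regular_expr, regular_pat = by_expression(regular), by_pattern(regular)
irregular_expr, irregular_pat = by_expression(irregular), by_pattern(irregular)
```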
Finally,
(//div)[5]
to get the fifth <div> element in the document. The equivalent pattern is:
div[count(ancestor::div | preceding::div) = 4]
All of these are pretty rare, I imagine.
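For the (//div)[5] case, one detail is worth checking when divs can nest: the XPath preceding axis leaves out ancestors, so a count over preceding::div alone misses enclosing divs that come earlier in document order. The sketch below works this through on an invented document using Python's xml.etree.ElementTree; the axes are computed by hand because ElementTree does not implement them.

```python
import xml.etree.ElementTree as ET

# Invented sample: seven divs, some nested, with ids in document order.
XML = ('<doc><div id="1"><div id="2"/><div id="3"/></div>'
       '<div id="4"><div id="5"/><div id="6"/></div><div id="7"/></doc>')

root = ET.fromstring(XML)
parent_of = {child: parent for parent in root.iter() for child in parent}
divs = list(root.iter("div"))  # all divs, in document order

def ancestor_divs(node):
    """count(ancestor::div) for the given node."""
    count = 0
    while node in parent_of:
        node = parent_of[node]
        if node.tag == "div":
            count += 1
    return count

# (//div)[5]: the fifth div in document order.
fifth = divs[4].get("id")

# Every div earlier in document order is either an ancestor or on the
# preceding axis, so count(preceding::div) is the position k minus the
# number of ancestor divs.
preceding_only = [d.get("id") for k, d in enumerate(divs)
                  if k - ancestor_divs(d) == 4]

# count(ancestor::div | preceding::div) is simply the position k.
with_ancestors = [d.get("id") for k, d in enumerate(divs) if k == 4]
```

On this document the preceding-only count picks the div with id 6, while including ancestors picks id 5, the same node as (//div)[5].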
We might want to try to look into the future; if we were making the
choice between XPath *2.0* expressions and XSLT *2.0* patterns, which
would we choose? Moving to 2.0, there are more functions and operators
that return nodes and aren't allowed in patterns. Would we consider it
reasonable for users to do:
/root/* except /root/head
for example? I think we would, and I think we would find it hard to
change to allow this later on if we stuck with patterns now.
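What the except operator computes can be sketched as a node-set difference that keeps document order. The snippet is an illustration only: it uses Python's xml.etree.ElementTree on an invented three-child document, and `xpath_except` is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><head/><body/><foot/></root>")

def xpath_except(left, right):
    """Nodes of the left operand that are not in the right operand,
    kept in the left operand's (document) order."""
    excluded = set(right)
    return [node for node in left if node not in excluded]

all_children = root.findall("*")  # corresponds to /root/*
heads = root.findall("head")      # corresponds to /root/head
kept = [e.tag for e in xpath_except(all_children, heads)]
```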
A final minor concern is that if we use patterns then we introduce a
dependency on XSLT, and I'm not sure we want to do that.
In summary, input and for-each have the same semantic and should use an
expression, in my view. Viewport has a different semantic, so I'd be
happy for it to use a pattern if there were a good argument for it to do
so, but I haven't yet heard one.
Cheers,
Jeni
--
Jeni Tennison
http://www.jenitennison.com
Received on Friday, 3 November 2006 09:36:44 UTC