- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Sat, 28 Sep 2002 12:16:06 +0100
- To: w3c-xsl-wg@w3.org
- Cc: public-qt-comments@w3.org
Mike Kay wrote: > This note responds to an issue originally raised on 11 Jan 2002 by > Jeni Tennison, which is archived at: > > http://lists.w3.org/Archives/Public/xsl-editors/2002JanMar/0050.html > > The XSL WG has considered the proposal on a number of occasions, and > we have finally decided not to take this route. Thanks for responding to my proposal; I assumed that it had been discarded months ago! I can't say that I'm surprised that the WG has decided not to take it forward, and I am sympathetic to the pragmatic considerations that you describe. Anyway, I didn't really want my first post as a member of the WG to be like this, but I didn't get to argue my case beyond the initial proposal and I guess that a position statement would probably be useful for those that haven't read my posts elsewhere, and that this is as good an opportunity as any to make one. I hope you'll permit me this one indulgence before I start working with you on the important goal of getting XPath and XSLT out the door. --- > On the minus side, the boundary between XSLT and XPath becomes less > clear-cut, and harder to explain. At present, XSLT instructions are > concerned either with creating nodes and constructing trees, or with > controlling the execution of other instructions (iteration, > conditionals, calls). XSLT instructions at present can only return > newly constructed nodes; XPath expressions never do so (except by > making a function call to XSLT). This has the merit that the > stylesheet tree always acts as a template for the construction of > the result tree, which is a model that users find easy to > understand. I do not think that users find this model natural or easy to understand. How many times do we see evidence that XSLT 1.0 users think that: <xsl:variable name="foo"> <xsl:value-of select="foo" /> </xsl:variable> sets the $foo variable to a string or to an element rather than a result tree fragment? I think that the design of XPath+XSLT 1.0 may make us *think* that users understand the distinctions between the three because (due to the weak typing) it rarely matters when they don't. I think that we'll find with XPath+XSLT 2.0 that they actually don't understand the distinctions, and that they will be revealed to be a lot more confused than we imagine. Further, as I've argued before, I don't believe that "XSLT for newly-constructed nodes, XPath for other things" actually works as a distinction. Speaking for myself, I find it far more natural to make the distinction in terms of the processes I can perform (e.g. iterating, conditions, creation of nodes) than in terms of what I perform those processes on. Switching between the two languages just because I'm working with a different kind of data doesn't seem at all natural to me as a user (though it may make perfect sense for an implementer). Judging from the feedback that I've received after this proposal and other posts, I do not think that I'm alone in feeling this way. One of the reasons that I don't think it works is that the main thing that we do with XSLT stylesheets is process a source document. When we process a source document using XSLT, we get a lot of help, especially when dealing with recursive structures, from the ability to apply templates. When we see someone doing something like: <xsl:for-each select="node()"> <xsl:choose> <xsl:when test="self::strong"> ... </xsl:when> <xsl:when test="self::emphasis"> ... </xsl:when> ... </xsl:choose> </xsl:for-each> we leap on it as being bad stylesheet design. And yet when we have to process source trees using XPath 2.0 (because we want to generate simple values, or collect existing nodes), we can't use the template approach, so we're forced to use this ugly and hard-to-extend construction. Let me give you an example. This is a simplified version of a real problem from a real project that I'm working on. In this project, documents are of three kinds: data, metadata and collections. A collection document points to multiple data or other collection objects; each data document points to a metadata document. My task was to collect all the unique <creator> elements from all the metadata of all the referenced data documents to create some metadata for the collection as a whole. The best way to approach this, I think, is to collect together all the creator metadata and then use distinct-values() to get the unique <creator> elements. According to the current XPath/XSLT split, this is an XPath job because it needs to return existing nodes. However, because the elements need to be collected recursively, I have to use a recursive function to do it: <xsl:variable name="creators" select="my:creators(/)" /> <xsl:function name="my:creators"> <xsl:param name="docs" select="/.." /> <xsl:result select=" for $d in $docs return if ($d/collection) then my:creators($d/collection/data/@xlink:href) else document($d/data/metadata/@xlink:href)/metadata/creator " /> </xsl:function> Note that the if/then/else within the for loop here is exactly like the choose/when/otherwise in the example above. This can be compared to using the natural mode of operation in XSLT, which is to use templates to traverse the source trees. With sequence construction in XSLT, the solution would have been: <xsl:variable name="creators"> <xsl:apply-templates select="/" mode="creators" /> </xsl:variable> <xsl:template match="collection" mode="creators"> <xsl:apply-templates select="document(data/@xlink:href)" mode="creators" /> </xsl:template> <xsl:template match="data" mode="creators"> <xsl:item select="document(metadata/@xlink:href)/metadata/creator" /> </xsl:template> As well as being a more "natural" (to an XSLT user) solution to the problem, if I now decide that I actually want to create some 'creator' elements and return those new nodes rather than the existing ones (perhaps because I want to alter them a little -- add an extra attribute or something), this change is very easy to make in the latter stylesheet, but requires total rewriting of the former. > Under the proposal, it would not always be obvious when XSLT > instructions are building a tree and when they are building a > sequence. For example, consider: > > <xsl:template name="x"> > <xsl:text>[</xsl:text> > <xsl:value-of select="$param"/> > <xsl:text>]</xsl:text> > </xsl:template> > > If this is called while constructing a tree, it returns a single > text node. If it could also be called while constructing a sequence, > it would return a sequence of three text nodes. Users might easily > change the implementation of the template (for example, to use > concat()), without realising the consequences for some callers. I thought that I'd expressed this in the original proposal, but perhaps not. I'd made the assumption that the rule that says that text nodes should be combined when added to a tree would be extended to say that when you create a sequence of newly-created text nodes as in the above, they are merged into a single text node. To create a sequence of separate newly-created text nodes (a rare requirement, in my opinion), you would use <xsl:item> as follows: <xsl:template name="x"> <xsl:item>[</xsl:item> <xsl:item> <xsl:value-of select="$param"/> </xsl:item> <xsl:item>]</xsl:item> </xsl:template> This resolves the cited problem, I think, but perhaps I'm missing something. > There are arguments in favour of keeping XPath as small as possible > (and therefore putting functionality at the XSLT level wherever > there is a choice), but there are also arguments the other way: > > (a) There are advantages both to implementors and to users in > maximising the common subset that XSLT shares with XQuery, and this > argument leads to such shared functionality sitting in XPath. While I strongly agree that where there's shared functionality between XQuery and XPath+XSLT there should be a shared data model and set of operations and functions, I do not necessarily believe that this means there should be a shared syntax. I do not think that either language benefits from being shackled together so closely: XQuery ends up with "line noise" that they dislike, XPath ends up with verbosity that XSLT users dislike. Of course they should use shared syntax where it's appropriate (for example location paths, operators and function calls), but I think that there's a balance to be made here and that the current extreme overlap will damage XSLT in the long run. > (b) Functionality available in XPath (such as conditional > expressions) is usable in contexts such as key definitions and sort > keys, where otherwise a call-back from XPath to XSLT would be > needed. I'm not sure what you mean by a call-back from XPath to XSLT? Presumably you mean that the user would have to write an extension function in order to deal with it? If that's a problem, then I think it's a problem in the current set-up as well. As I outlined in my original proposal, and demonstrated in the above example, there are lots of things that XPath cannot do all on its own; you're not going to be able to support users doing everything in XPath unless you have function definitions in XPath, and I *really* hope that ain't gonna happen. I should also make clear, in case it wasn't already, that I am fully in favour of simple conditional expressions (e.g. condition ? true-value : false-value) and simple mapping expressions (e.g. sequence -> map-function) within XPath 2.0. I believe that these, together with date-time data types and the new aggregation functions (min(), max(), avg()) would handle well over 80% of the requirement for functionality in XPath to support key and sort values. The remainder can always be handled through user-defined functions. > (c) Functionality implemented at the XPath level is probably easier > to optimize, because within the context of an XPath expression, > there are no side-effects, whereas XSLT instructions have the > side-effect of writing nodes to the result tree. I'll have to take your word for that. I don't understand, though, how XQuery manages to deal with optimising around the side-effect of writing to the result tree (which presumably it can do?) and yet XSLT can't. I don't understand why: for $b in //books { <book>...</book> } is any more optimisable than: <xsl:for-each select="//books"><book>...</book></xsl:for-each> These seem to be slightly different syntaxes for roughly the same thing; I don't see why they can't be interpreted in the same way. Further, I have to say that I don't think that optimisability is as high on my priorities for XPath/XSLT 2.0 as usability. > (d) Functionality implemented at the XPath level is available in > standalone XPath environments, for example XPath used within the > context of XPointer or DOM. Since the XPath data model relies > critically on sequences, some mechanism for constructing sequences > is needed that is not dependent on XSLT. Actually, I view this as an argument for *simplifying* XPath rather than one for packing as much functionality as possible into it. I think that it's much more likely that a Java programmer using DOM would use Java methods to construct sequences of integers than that they'd use XPath to do so. Adding functionality to XPath just makes it harder to get XPath added to XPointer/DOM/etc. implementations. Even with XPath 1.0, other standards that use XPath 1.0 (I'm thinking W3C XML Schema and XForms in particular) have defined their own subsets because they didn't want to inflict the burden of implementing all of XPath on the implementers of their technologies. This will be even more true with XPath 2.0. > Finally, there are pragmatic considerations. Although it would > almost certainly be possible to design a language along the lines of > Jeni Tennison's proposal that met all the XSLT/XPath 2.0 > requirements, it would require a lot of rework and a lot of > negotiation between the XSL and XQuery working groups. If the > current language were badly broken, we would be justified in > tackling this rework. But the current language works, and we want to > get it finished. In fact, if we did embark on making the changes > required by this proposal, there is a serious prospect that we would > end up with a language in which we had added sequence construction > to XSLT but failed to remove anything from XPath. We would thus have > increased duplication between the languages instead of reducing it. Oh well, since I'm writing this diatribe anyway I'll just say that no one in the XSLT community will thank the XSL WG for bringing out a standard quickly if that standard fails to recognise the requirements of that community. Just because you can do all you need to do in a language (it "works") does not mean that the language is well-designed or easy to use. I do understand the desire to see XSLT get out soon, and to not discard the time, effort and patience that the XSL WG has put into the current design, but I also think that it's worth taking the time to get it right. It will involve even more rework to have to start again in however-many months time if it's rejected as a Candidate or Proposed Recommendation. --- Apologies for the rant. Now I've got it off my chest, I'm looking forward to the real work. I hope that you'll bear with me as I get up to speed on the latest drafts and discussion. > (Welcome to the group, Jeni!) Thanks Mike :) Cheers, Jeni --- Jeni Tennison http://www.jenitennison.com/
Received on Saturday, 28 September 2002 07:23:28 UTC