Re: Resolution of XSLT issue 99: Constructing Sequences in XSLT from David Holmes on 2002-11-27 (public-qt-comments@w3.org from November 2002)

From: David Holmes <davidgholmes@mindspring.com>
Date: Tue, 26 Nov 2002 21:13:29 -0500 (EST)
To: "Kay, Michael" <Michael.Kay@softwareag.com>, <public-qt-comments@w3.org>
Message-ID: <002901c295bc$3f80c870$6401a8c0@intrex>

MessageDear Michael,
Thanks for your prompt reply and for being so patient to explain. I understand your points (although I confess that I have some homework to do on XQuery operation) and strongly respect the desire to maintain backwards compatibility. However, I'd like to run one final "thought experiment" to feel satisfied that we aren't giving up the potential of the XSLT paradigm and because I think we may be able to have our cake and eat it. I'm going to suppose that the technical issues that you raised relating to sibling nodes and xsl:value are tractable (based on 50% ignorance, 40% experience doing an implementation, and 10% willpower).

It strikes me that the key paradigm that flies in the face of having an XSLT engine emit/return sequences is that there is an implicit assumption that you are generating a principal document - a singleton sequence containing one document node and all its descendants. No question for backwards compatibility that we can't break that paradigm. Suppose, however, that we define a new XSLT document element, call it <xsl:query/> for grins (<xsl:procedure/> might be less controversial!). Under <xsl:query> we have the usual compliment of XSL instructions that can appear under <xsl:stylesheet/>. Processing is more or less the same - matching template rules (Jeni Tennison's <xsl:item> details and others to be worked out!) but instead of creating an implicit document, we are creating an implicit sequence. My viewpoint is that this is not a different language paradigm but is another style rather like the difference between standard stylesheets and the literal result style. As stated, there are details to be worked out but I believe there is much value in avoiding having XSLT dead-end as a single document generator (I am aware of <xsl:document> but that's not the same issue). Having accepted this possibility, the strategy will be to examine how existing XSL instructions behave and aim for consistency. This process will help to inform and formalize the behavior of existing XSLT instructions operating under <xsl:stylesheet>; especially the ones with string codependency! For example, I believe that the moment someone wrote "a sequence contains items" in a W3C spec, the genie was out of the bottle. Now, if <xsl:text> has a content constructor, or since <xsl:value> has a select XPathExpression, how do you stop a user generating a sequence of atomic values? The answer to me is that you don't and don't have to. In fact, I believe that the signatures of <xsl:text> and <xsl:value-of> can be practically identical (optional select/content, separator, d-o-e) with the only difference being that the former generates a typed text node and the latter generated a typed value.

I believe that this approach might allow us to explore the possibilities in 1) a backwardly compatible way that 2) allows users to leverage their XSLT knowledge and 3) helps to formalize XSLT as a whole in the new "strong-typed" world. I think the possibility of seeing familiar XSLT as a bone-fide query language in some vendor's DBMS makes this option worth pursuing.

BTW. I concur that the two language model has some interesting benefits, especially in the way that it provides a host for XPath, and that there is a division of responsibilities that should be maintained. I would phrase this as saying that the XSLT instructions are more generally responsible for *item* generation and flow-of-control while the XPath expressions are good for computing values. I would differ though in the view that there is a fundamental difference or asymmetry between an XSLT instruction and an XPath expression. In fact it is my design experience that while XSLT instructions generally "push" items and XPath expressions return sequences, the roles can be inverted; an XSLT operation can return a sequence and an XPath expression can push it's items. There is also an interesting transition stage in XSLT where a sequence-constructor(oops - content-constructor!) behaves more like an expression returning a sequence (which might need to be atomized e.g. xsl:value-of). One way to look at this (I used to be a physicist) is to say that the symmetry has been broken by our conventional notions of what an XSLT instruction is versus what an XPath expression is. In fact both are manifestations of a single executable entity that can both push and pull sequences. I might be wrong but I am betting that this is the XQuery expression model and that in the end our differences may be purely syntactic!

Apologies if I seemed disrespectful on the deadline issue. I understand the need to get off the fence and I hope that this proposal offers a conceptual simplification and strategy that might bring us more quickly to a more powerful solution. Besides, why should the XQuery folks have all the fun?

Best Wishes

David Holmes

----- Original Message -----
From: Kay, Michael
To: David Holmes ; public-qt-comments@w3.org
Sent: Tuesday, November 26, 2002 2:07 PM
Subject: RE: Resolution of XSLT issue 99: Constructing Sequences in XSLT

Thanks for the comment.

This wasn't an easy decision to make, we revisited it several times and there were good arguments on both sides.

From my point of view, the compelling arguments in favour of the resolution that we decided on were:

* we have a two language model, and it is desirable to avoid duplication between the two languages. That means there should be a clear separation of responsibilities between the two languages. There are various possible boundaries that one could draw, but to my mind the most natural is that XSLT (with its XML-based syntax) is good for creating nodes, while XPath (with its algebraic syntax) is good for computing values; the distinction between nodes and values is a very strong one in the data model. A separation of concerns based on what objects each language manipulates is more likely to work than a split based on functionality.

* it is important to have absolute clarity about whether two adjacent instructions in the stylesheet are creating two sibling nodes (which might be merged, if they are text nodes) or two values in a sequence (which will not be merged). XQuery achieves this by allowing the node-construction expressions to be composed with the sequence-construction syntax. Being a two-language system, XSLT doesn't have this luxury. I don't think any of the proposals for doing sequence construction in XSLT created a clear enough distinction between sequence construction and tree construction.

* backwards compatibility is not an optional extra, nor is it something we should begrudge. We have inherited the stewardship of a successful language, and we need to respect the essential characteristics of the language. Changing the type system is radical enough, but changing the very clear processing model - in which the stylesheet tree acts very directly as a template for the result tree - would in my view be a step too far.

The asymmetry between XSLT and XPath is deliberate. We regard XSLT as a language that is closed with respect to trees, in the same way as SQL is closed with respect to relations. Within that we have a sublanguage that is closed over sequences of the 19 primitive types. This is a similar model to SQL, which does arithmetic and boolean operations on working data, but always delivers tables as its final result.

As far as xsl:value-of is concerned (and AVTs equally), we have adopted the same solution as XQuery (generally, in the model for constructing typed trees, we decided to let the XQuery WG beat the path through the jungle). Like XQuery, we found it difficult to come up with a model for putting typed values on trees directly, rather than going through the two-stage process (at least conceptually) of taking the string value and then validating it using a schema processor. This was a long and difficult debate.

Thanks for your comments, anyway. It's good to force us to stand back from the detail and ask ourselves why the big decisions were made they were. Deadlines didn't come into it, except from the viewpoint that you can't sit on the fence for ever.

Michael Kay

Received on Wednesday, 27 November 2002 04:14:09 UTC