RE: changing final output while writing temporary tree from Kay, Michael on 2003-01-28 (public-qt-comments@w3.org from January 2003)

From: Kay, Michael <Michael.Kay@softwareag.com>
Date: Tue, 28 Jan 2003 14:30:47 +0100
To: Andre Cusson <ac@hyperbase.com>, "Kay, Michael" <Michael.Kay@softwareag.com>, public-qt-comments@w3.org
Cc: michael.h.kay@ntlworld.com, saxon-help@lists.sourceforge.net
Message-ID: <DFF2AC9E3583D511A21F0008C7E621060453DFC9@daemsg02.software-ag.de>
Thanks.

The reason we produce public working drafts and the reason that I ship an
early implementation of the drafts is precisely to get this kind of
feedback.

I think we need to take a look at the algorithms you are using, and come to
a position on whether we regard this as being an appropriate use case for
the language or not. It looks as if we have some studying to do.

Michael Kay


> -----Original Message-----
> From: Andre Cusson [mailto:ac@hyperbase.com] 
> Sent: 28 January 2003 05:57
> To: Kay, Michael; public-qt-comments@w3.org
> Cc: michael.h.kay@ntlworld.com; saxon-help@lists.sourceforge.net
> Subject: RE: changing final output while writing temporary tree
> 
> 
> Hi,
> 
> Thank you again for your time and interest.
> 
> I would like to sit down with you and show you what we are 
> doing but since 
> that is not really possible now and since the code base is 
> relatively large 
> (and complex), I will try to describe some typical problems 
> we are trying 
> to solve, using result-document.  In essence, it is quite simple:
> 
> 1. The principle : a possibly large and complex source xml 
> document that 
> needs to be traversed recursively and from which pages (ex: 
> web pages) are 
> to be published.
> 
> 2.. A publishing use case : I have a source XML document of , 
> lets say 50 
> to 100K lines, describing the structured content of a scientific 
> information web site, organized in volumes, sections, 
> chapters, and pages 
> (which are of course also organized into sections, sub-sections, 
> paragraphs, etc).  Further, the page contents, gathered from 
> the source 
> document have to be merged with XML defined layouts that 
> describe the page 
> layout.  Every time a page is to be published, the 
> page-publish template is 
> called and passed the corresponding contents.  It then merges 
> the contents 
> into specified layout(s) and then serializes the page to 
> html.  Note: this 
> is again a simplified model of a section of the application, 
> the publishing 
> module.
> 
> As the publishing module is traversing the source document, 
> it figures out 
> what needs to be done and calls the page-publish template as 
> required.  Many cases happen.  A typical one could be, for 
> example, a test 
> page, a page that offers a test to the user:  so we start 
> building the page 
> tree and doing all kinds of calculations for the page,  on 
> this page we 
> also find information to build the answer page which uses a 
> lot of the same 
> data so we build the answer page, publish it and go on with 
> the publishing 
> of the test page, building hyperlinks and navigation between 
> the pages, and 
> so on.  The only issue is that since we are traversing down a 
> tree, the 
> leaves get output first and the root, last.
> 
> 3. O-R mapping use case: Another example could come from our 
> O-R mapping 
> tool where we take a tree, any tree and map its contents to a 
> series of 
> hyperlinked editing forms, each with a table with the 
> children (or a group 
> of them) of a node and where each child that has children 
> will require 
> additional sub-pages, and so on recursively. So, as I am 
> building page 1, I 
> find that I also need to do pages 1a, 1b and 1c (all with 
> bidirectional 
> links (xpointers) between them).  Since everything is 
> recursive, as I start 
> building page 1a, but now I find that I also have to build 
> sub-pages 1a1, 
> 1a2 and 1a3 and so on.
> 
> Every time that I have a page ready, I publish it but, because of 
> recursion, bottom pages (tree leaves) get publish first and 
> the root node 
> is still being build at that time so the result-document 
> instruction bombs out.
> 
> 
> The way I see this, we are not using a side-effect, we are 
> doing plain 
> recursive processing, with no tricks or anything twisted.  
> The fact that 
> there is a tree (final) in a tree (temporary) can hardly (or maybe 
> academically) be called a side-effect.  In my view and for 
> the uses we have 
> of this, a tree in a tree is recursion.  Is recursion a side effect ?
> 
> Now, I don't doubt that some people may want to abuse 
> recursion or the 
> result-document instruction and I believe that we should do 
> everything 
> reasonable to prevent these abuses, short of preventing the 
> development of 
> useful applications.  Personally, the price to pay for 
> side-effects is to 
> high for me to toy with.  On the other hand, blocking 
> recursion kills my 
> application.
> 
> I am surprised that something so natural to (streaming) XSLT like 
> traversing a tree structure and building web pages as you go has not 
> brought up this issue before and I start to wonder, what are 
> others doing 
> with XSLT?
> 
> Theoretically, I could process the structure iteratively but 
> I would loose 
> performance, maintainability and I would get sick looking at the code 
> because of its ugliness and illogical structure.  Comparatively, 
> (academical) side-effects are almost nice.
> 
> Ever since we found out about the new restriction, a few 
> weeks ago, I have 
> been trying to find a viable alternative but I have not yet.
> If you can think or suggest me one, I will be happy to try 
> it. If we can't, XSLT 2.0 might run into a problem when 
> people start using it 
> to generate web pages as opposed to simply rendering them or 
> building them 
> from flat structures.
> 
> For now, and until a viable solution is found, the only way 
> we can advance 
> and deliver a product to our users is to stick to Saxon 
> v6.5.2 and forget 
> about XSLT 2.0.
> It is a very high price to pay in the long run but it may be 
> better than 
> loosing everything.
> 
> I hope that the case is getting clearer and I will be happy 
> to provide you 
> with additional information, try things out, talk on the 
> phone or anything 
> that will help resolve this.
> I also hope that the mix of my passion and my frustration 
> does not blur the 
> issues.
> 
> I thank you for everything and especially for your interest. 
> Please take care.
> 
> Regards,
> Andre Cusson
> ac@hyperbase.com
> 01 Communications Inc.
> 
> PS: Whether I invoke result-document while building a 
> with-param tree or I 
> do it in a variable and then pass the variable as a parameter 
> does not 
> improve the problem because now I have additional variables 
> (which are 
> temporary trees also) and more "non streamable" steps, but I 
> still need to 
> recursively generate pages (from temporary trees).
> 
> Recursion is designed to handle problems that are cannot 
> (easily) be solved 
> otherwise and it seems to me that generating html pages from 
> traversing a 
> hierarchical structure is one of them.
> 
> I am very sorry that the committee might already have spent serious 
> resources debating (complex) output tree issues and that I am 
> trying to 
> stir things up again but the way I see it, either I am wrong 
> and there is a 
> better way to do this and I will be happy and I will grow as 
> a programmer 
> if I can find/learn it, or I am right and it is in XSLT's 
> best interest to 
> revisit the issue.  Consequently, If I have not been clear 
> enough in my 
> description of the problem, or anything else, please advise 
> me and I will 
> quickly try to fix up or provide more info.
> 
> 
> PSS: we have a small test portal with some static html pages 
> generated like 
> described above, currently on the web.  One of the sites 
> there is a complex 
> document of close to 1000 web pages, dedicated to a 
> generative approach to 
> music theory.  This site can be accessed through 
> www.musicnovatory.com.  Please keep in mind that both the 
> sites and the 
> tools to generate them are still in construction.  Please 
> also note that 
> everything, including all portals, sites, pages, navigation, 
> sitemaps, etc 
> are automatically generated from pure content XML source 
> documents (and XML 
> layouts), all with XSLT.
> 
> 
> 
> 
> At 02:42 AM 1/28/2003 +0100, Kay, Michael wrote:
> >I think that the mental model we have used for multiple 
> result trees is 
> >that there is conceptually one combined tree, and that the actual 
> >result trees all have to be subtrees of this combined tree. The 
> >stylesheet logic for writing multiple result documents is therefore 
> >exactly the same as you would write if you were generating 
> the combined 
> >tree. If you can write the logic to produce the combined 
> tree, then it 
> >is trivial to modify it to produce multiple trees. What the facility 
> >for multiple result trees does not do is to free you from 
> the need to 
> >produce output in "result tree order" - that is, the order of 
> >instructions in the stylesheet reflects the order of nodes 
> written to 
> >the result.
> >
> >Its true that the restriction on when xsl:result-document 
> can be used 
> >appears somewhat draconian, and the problem it is solving 
> seems rather 
> >small in comparison. Arguably, we are applying the "no-side-effects" 
> >philosophy of the language rather strictly in this case. I'm sure we 
> >could relax the rules by leaving some edge cases implementation 
> >defined, but on the whole we have been reluctant to do that.
> >
> >We did have a different solution to this problem in earlier 
> drafts, but 
> >it required extra information in the data model and was very hard to 
> >understand. It still imposed substantial restrictions. The model was 
> >that a result document was effectively a child node attached to some 
> >other tree, invisible to XPath expressions, but still linked in the 
> >sense that the tree was written out only if and when the parent node 
> >was written to a result tree.
> >
> >We spent some time debating this and I'm reluctant to re-open it. 
> >Rather, I'd like to see whether we can use your problem as a 
> test case 
> >to see whether the existing language is up to solving the 
> problem. In 
> >the example you show, I find it very hard to understand why you are 
> >trying to produce a result document as a side-effect of 
> calculating a 
> >parameter to a named template. It seems a very odd way to do things, 
> >though I'm sure it makes sense to you.
> >
> >Michael Kay
>
Received on Tuesday, 28 January 2003 08:31:07 UTC