- From: Stefano Mazzocchi <stefano@apache.org>
- Date: Tue, 06 Apr 1999 12:32:09 +0200
- To: The Cocoon Project <cocoon@list.working-dogs.com>, xsl-editors@w3.org, Tim Berners-Lee <timbl@w3.org>, Tim Bray <tbray@textuality.com>
Hello,

I first want to apologize for the intrusion and for bringing this to your attention. That said, I do hope you'll find this note valuable enough to pardon me.

Sincerely,

Stefano Mazzocchi <stefano@apache.org>
Java Apache Project coordinator

--------------------------------O-------------------------------------

A Global Web Publishing Framework
---------------------------------

by Stefano Mazzocchi <stefano@apache.org>

Introduction
------------

Since the release of the second XSL working draft, I've been thinking about a way to merge such interesting technologies into current web publishing frameworks, which are based mainly on HTML.

The Cocoon Project (http://java.apache.org/cocoon/) was created with the ambitious goal of changing the way web content is created, distributed and, last but not least, maintained. XSL's ability to separate content and style into different files, and to divide the knowledge they require among different people (or working groups), made the project _very_ successful and brought it a wide audience in a few weeks.

The need for a way to integrate server side technologies, such as dynamic content generation, with the XSL framework was clearly recognized. Following the same model, the Cocoon Project is trying to define an "eXtensible Logicsheet Language" (XLogic) that would add server side dynamism and batch capabilities to the tree construction part of XSL, either by extending the XSL specification or by cloning it.

This note is mainly written to express my personal feelings about the evolution of a "global web publishing framework" that should incorporate not only language guidelines, but also software architectures and implementation suggestions.

I do believe that W3C is doing an _outstanding_ job in evolving the web into a truly knowledgeable distributed information system, but I also believe that implementation guidelines on both the client and server sides are mostly lacking (DOM is the first spec in this direction).

In this paper, I outline my vision for a global web publishing framework, drawing on my knowledge of server side dynamic content generation as well as my experience in real-life XSL deployment. As Cocoon's author and project coordinator, I hope my implementations stand as a proof of concept for this note.

The current XSL model
---------------------

To follow the CSS model and to be able to create the XML equivalent of stylesheets, the XSL specification is currently the repository for three different technologies:

- tree construction
- patterns
- formatting objects

These three technologies would be best merged into a single specification if (and only if) their use were restricted to the operations described by the XSL goals. In this note, I will underline how these technologies are well suited for other uses that are not covered by the XSL picture.

Overlapping goals
-----------------

The first major overlapping region is the one covered by both the "XSL patterns" and the yet-to-be-defined "XML query language". The XSL WG has already stated that XSL should continue to rule on patterns over any XQL specification. This friction shows that patterns should be defined independently, especially because they would be a very valuable resource for programmatic XML handling by DOM processors, as a data query language, and for other uses.

Formatting objects are a specific namespace included in the spec, and they have no particular need to be hosted by the same specification as the tree construction part, especially when very few XSL processors support the FO model and, where they do, the support is very limited. FO appears as a "plus" for XSL processors. This poses a big risk of platform fragmentation: the FO part goes unused because not every processor implements it.
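
The claim that patterns are useful outside of stylesheets, as a query facility for programmatic XML handling, can be illustrated with a minimal sketch. It assumes the limited path-expression support in Python's xml.etree.ElementTree as a stand-in for XSL patterns (it is not the pattern syntax of the working draft), and the <site>, <page>, <title> and <link> vocabulary is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document vocabulary, used only for illustration.
DOC = """<site>
  <page id="home"><title>Home</title><link href="news"/></page>
  <page id="news"><title>News</title></page>
</site>"""

root = ET.fromstring(DOC)

# The same path expressions serve as a query language from program code,
# with no stylesheet or transformation involved.
titles = [t.text for t in root.findall("./page/title")]
news_title = root.find("./page[@id='news']/title").text
links = [link.get("href") for link in root.findall(".//link")]

print(titles, news_title, links)
```

Here the selection syntax is used on its own, which is the kind of reuse that an independently defined pattern specification would make possible.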

A different model
-----------------

I think the solution to the yet-to-become-evident problems of the XSL specification would be to separate the three parts into different specifications, while losing the stylesheet model which is, in XML, very misleading.

This is the picture I propose:

a) The tree construction part is separated into an eXtensible Transformation Language (XTL) which is able to transform any well-formed XML document into a valid SGML document. If the created SGML document is still well-formed XML, further transformation iterations can be applied.

b) XQL + XPoint + XLink are used for any internal reference to the XML document, both by the transformation processor and by any specific XML processor written following the Cocoon/OpenXML model.

c) The Formatting Object Language is used as a page formatting language and is defined in its own specification, clearly separated from the transformation part.

d) A Postscript-in-XML (XPS) language is defined. XTL processors would be able to "process" FO documents into XPS documents which may be directly fed into printers or browsers. This language is intended to be the common language of 2D renderers.

Benefits of the new model
-------------------------

There are many different benefits from my proposed "transformation-based" approach:

1) Knowledge regions are better separated (better learning curve).

2) XTL creates the glue between XML (either standardized namespaces or personally defined DTDs) and any SGML file, focusing on the ability to "transform" one representation into another while adding full programmatic capabilities. This is obviously modeled after the XSL tree construction model.

3) The XTL model adds the ability to include dynamic parameters in the transformation. This would allow user input (either at batch time or at web-request time) to influence the transformation process.
For example, the transformation of a <user-counter/> tag is influenced by the number of times the specific user has requested the page, while that of the <general-counter/> tag is influenced by the total number of requests. Batch processing, for example, would allow the HTML rendering of a complete web site starting from the root page and following links.

4) The creation of a complete 2D description language using the XML syntax would allow a single browser to represent _any_ XML DTD, given the right XML-to-XPS transformation sheet. For example, a FO-to-XPS XTL file should be included in the FO specification and define how FO is mapped into the more general XPS. The human readability of XPS should be of minimal importance, since XPS files will be almost entirely machine generated through the transformation steps.

A browser for everything
------------------------

In such a picture, the ability to browse a world of hyperlinked, distributed, independent documents and document definitions is achieved with specific software tools and semantic "glue" between the different information domains.

For example, today, the ability to understand and correctly render the MathML language comes from the direct implementation of these rendering capabilities. In this particular case, XML is no different from an extended HTML.

Using the transformation model, the MathML language is defined together with a default MathML-to-XPS transformation. An XPS-aware browser would be able to apply the transformation file before passing the result to the XPS rendering engine, or else download the MathML tags "precompiled to XPS".

This model clearly extends the XSL TC-FO pair to a wider level of applicability. In creating a new document definition, one is able to "connect" to the "web of knowledge" by simply creating the XTL file that links one's specific DTD/namespace to an available one.
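
The transformation model described so far, with request-time parameters and chained transformation steps, can be sketched as follows. This is only an illustration under stated assumptions: XTL has no defined syntax, so plain Python functions stand in for transformation sheets, and the names resolve_counters, page_to_html, transform, the parameter keys, and the <page>/<para> vocabulary are all hypothetical.

```python
import xml.etree.ElementTree as ET

def resolve_counters(root, params):
    """Step 1: fill the counter tags from parameters known only at request time."""
    for elem in root.iter():
        if elem.tag == "user-counter":
            elem.tag, elem.text = "span", str(params["user_hits"])
        elif elem.tag == "general-counter":
            elem.tag, elem.text = "span", str(params["total_hits"])
    return root

def page_to_html(root, params):
    """Step 2: map the hypothetical <page>/<para> vocabulary onto HTML-like tags."""
    rename = {"page": "body", "para": "p"}
    for elem in root.iter():
        elem.tag = rename.get(elem.tag, elem.tag)
    return root

def transform(source, steps, params):
    """Apply each step in order; the output of one step is the input of the next."""
    root = ET.fromstring(source)
    for step in steps:
        root = step(root, params)
    return ET.tostring(root, encoding="unicode")

SOURCE = ("<page><para>You visited <user-counter/> times; "
          "total <general-counter/>.</para></page>")

result = transform(SOURCE, [resolve_counters, page_to_html],
                   {"user_hits": 3, "total_hits": 1200})
print(result)
```

The same source yields different output for different parameter values, and each step consumes well-formed XML produced by the previous one, which is what allows steps to be chained or run at different places (server, client, or batch).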

To take a concrete case of linking one DTD to another, my own web DTD would transform into a FO namespaced document (the equivalent of current XSL operation), which is then "compiled" into XPS and sent to the browser. If some user parameters are needed during the transformations, the processing is done at request time, either on the server side, on the client side, or mixed (depending on the software available). If no parameters or only static parameters are needed (static pages), the processing is done at a batch level and the page is compiled into its most useful representation. (Note: since XPS compilation may generate big files, this process would more likely be done on the client side.)

Conclusions
-----------

I strongly believe XSL to be a very important step toward the creation of a usable "web of knowledge", but I'm also worried that the language may not meet the requirements of a global web publishing framework and may impose limitations (especially on server side extensibility) that would be rather hard to overcome.

Even though I do understand how much effort has been put into the XSL model, in this paper I have outlined a possible web publishing framework that would solve many of the things that the current model fails to support. It also shows how the stylesheet model, probably influenced too much by the CSS experience, may lead to misunderstandings of the specification and to friction between different aspects of web technologies in general.

As a final remark, I must specify that this note reflects my own personal and humble opinions. It is meant to be a starting point for an active collaboration between the Apache Project in general and the Web Consortium that would allow standards to be defined and open implementations to support them.

Copyright (c) 1999 by Stefano Mazzocchi (stefano@apache.org). All rights reserved.

--
Stefano Mazzocchi <stefano@apache.org>

"A language that doesn't affect the way you think about programming, is not worth knowing." Alan J. Perlis
---------------------------------------------------------------------
Received on Tuesday, 6 April 1999 06:28:33 UTC