
Re: ISSUE-54 (html5-doctype-vs-xslt): XSLT 1.0 can not generate HTML5 documents [HTML 5 spec]

From: Jirka Kosek <jirka@kosek.cz>
Date: Fri, 04 Jul 2008 23:44:38 +0200
Message-ID: <486E99C6.2060401@kosek.cz>
To: Henri Sivonen <hsivonen@iki.fi>
CC: HTML Issue Tracking WG <public-html@w3.org>

Henri Sivonen wrote:

> I disagree with the simplified framing of the issue, since it gives the 
> wrong idea of how little fixing is needed and where the sensible place 
> for the fix is. The doctype is the least of the problems with XSLT and 
> HTML5.

Hi Henri, there are actually two issues. The first is very simple: how
to produce HTML5-compliant output with the *existing* XSLT language and
its implementations. This issue is very important because XSLT is a very
common way of producing HTML content. Moreover, even the HTML WG charter
explicitly states that "legacy implementation" of "classic HTML" should
be taken into account, and XSLT could be considered such a legacy
application.

Of course, there is a second issue, the one you actually elaborate on in
your email: how to extend some *future version* of the XSLT language and
its implementations to support all bits of HTML5. I almost agree with
your analysis of this issue.

> HTML5 defines HTML elements to go into the 
> "http://www.w3.org/1999/xhtml" namespace in order to abstract away the 
> difference of serialization from programs that operate on a 
> namespace-aware tree representation. HTML5 parsers that expose XML APIs 
> to allow unified application internals regardless of whether the data 
> came in as text/html or application/xhtml+xml put HTML elements in the 
> "http://www.w3.org/1999/xhtml" per spec. Moreover, with support for 
> MathML and SVG, there can also be element nodes in those namespaces. 
> Programs operating on trees shouldn't have to have different code 
> throughout depending on whether the program is targeted at text/html or 
> application/xhtml+xml.

On the other hand, in the past HTML (4 and earlier) did not use anything
like namespaces, while XHTML did. If you have existing XSLT code that
emits HTML and you want to use a few of the new elements introduced in
HTML5, why should you also have to start thinking about namespaces? You
simply want to add those few new tags to your stylesheet and modify the
public identifier to make it clear that you are using the brand new
HTML5 language.

So, your idea sounds perfectly reasonable, and I think that once there
is something like an HTML5 output method in XSLT and HTML5 is widely
deployed, everyone should use that approach. But we are not there yet.
We can propose such an academically clean approach, but at the same time
we should pragmatically solve today's problems.

> throughout the XSLT program and pass-through input. (If you put elements 
> in the wrong namespace throughout your code base of XSLT programs, 
> upgrading to XHTML becomes even harder than it already is for other 
> reasons.)

I don't think so. It is very easy to change the namespace of the output.
For namespaced elements you can use xsl:namespace-alias, and if you have
elements in no namespace (for example HTML), it is sufficient to add
xmlns="http://www.w3.org/1999/xhtml" to the root element of your
stylesheet.
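
For illustration, a minimal sketch of the second case, assuming a
stylesheet whose literal result elements were written without any
namespace. The template content is made up for the example; the only
relevant change is the default namespace declaration on the root
element.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml"/>

  <!-- The literal result elements below carry no prefix, so the
       default namespace declared on xsl:stylesheet applies to them
       and they are output in the XHTML namespace. -->
  <xsl:template match="/">
    <html>
      <head><title>Example</title></head>
      <body><p>Hello</p></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Note that in XSLT 1.0 the default namespace does not apply to element
names in match patterns and XPath expressions, so templates that match
no-namespace input keep working unchanged.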

> I think the right way to deal with this is to define an HTML5 output 
> method for XSLT. 

I agree, and I'm willing to see to it that the next version of XSLT has
such a method. Of course, this requires that the serialization of HTML5
and other related issues be resolved first. Is this part of HTML5
stable, or are any changes expected?

> In the interim, the right way is to take DOM or SAX 
> output from the XSLT processor and to run a DOM-to-HTML5 or SAX-to-HTML5 
> serializer outside the XSLT processor. (I intend to ship a foreign 
> content-enhanced serializer with the next release of XSLT4HTML5.)

Please, come back down to earth. If HTML5 is to be successful, it should
provide the lowest possible entry barrier for existing content and
existing ways of producing content.

If a public identifier is allowed after <!DOCTYPE HTML in HTML5, then
you can very easily upgrade existing XSLT stylesheets that produce HTML4
to produce HTML5. You can simply change

<xsl:output method="html"/>

or

<xsl:output method="html" doctype-public="-//W3C//DTD HTML 4.0 Transitional//EN"/>

to

<xsl:output method="html" doctype-public="-//W3C//DTD HTML 5//EN"/>
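
As a sketch of what the upgraded stylesheet could look like in full: the
public identifier "-//W3C//DTD HTML 5//EN" is only the one proposed in
this message and is not defined by any specification, and the template
content is made up for the example.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical public identifier, as proposed in this message. -->
  <xsl:output method="html" doctype-public="-//W3C//DTD HTML 5//EN"/>

  <!-- Elements stay in no namespace, as in a typical HTML4 stylesheet. -->
  <xsl:template match="/">
    <html>
      <head><title>Example</title></head>
      <body><p>Hello</p></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Because doctype-public is specified, an XSLT 1.0 processor using the
html output method should emit a document type declaration with that
public identifier immediately before the first element.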

At the same time, your proposal requires changes to the XSLT
specification, to XSLT implementations (and there are more XSLT engines
than web browsers), and more radical changes to existing XSLT code --
adding elements into the XHTML namespace. I simply can't see the
advantage of your proposal here.

Up to now, no one has provided a single argument against allowing this
optional "PUBLIC ..." part of !DOCTYPE. So what's the problem?

			Jirka


-- 
------------------------------------------------------------------
   Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
------------------------------------------------------------------
        Professional XML consulting and training services
   DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
  OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------


Received on Friday, 4 July 2008 21:45:18 UTC
