Re: HTML/XML Task Force Minutes 18 January 2011 from Sam Ruby on 2011-01-21 (public-html-xml@w3.org from January 2011)

From: Sam Ruby <rubys@intertwingly.net>
Date: Fri, 21 Jan 2011 09:17:03 -0500
To: Michael Kay <mike@saxonica.com>
CC: Kurt Cagle <kurt.cagle@gmail.com>, Norman Walsh <ndw@nwalsh.com>, public-html-xml@w3.org
Message-ID: <4D39955F.4000008@intertwingly.net>

On 01/21/2011 03:26 AM, Michael Kay wrote:
> On 21/01/2011 00:27, Kurt Cagle wrote:
>> Good discussion and some interesting points. It occurred to me that
>> there may be yet another use case:
>>
>> Are there applications that should only be viewed as being workable
>> within XHTML and not HTML? Or, to put it another way, is there an
>> upper level of complexity beyond which the benefit of trying to fit an
>> XML vocabulary into HTML is simply not worth the effort? I see this as
>> a limiting case to determine where the boundaries are between the two
>> versions of the language (for instance, it may very well be that
>> XForms is simply not a viable proposition for HTML).
>>
>> Kurt Cagle
>> XML Architect
>> /Lockheed / US National Archives ERA Project/
>
> I've been wondering the same kind of thing. In fact I've been wondering
> - why exactly would I choose to serialize my content as HTML5 rather
> than XHTML(5?), given that it's not hand-authored?

While I differ with your terminology, I've come to a similar conclusion,
at least for situations where the content pipeline includes a DOM(*).
This reply focuses on the differences, even though I largely agree with
you.

The difference between HTML5 and XHTML5 is the MIME type used.  Stripped
of a MIME type, you can't look at the content of my weblog and determine
whether it is HTML5 or XHTML5.  All that can be determined is that the
page is valid under either interpretation, and that the differences in
the DOM produced don't have any visible effect.

So... the real questions are:

(1) Why would I NOT serve my content as application/xhtml+xml?  Possible
answers include the ability to produce useful results with IE8, and
robustness in the face of well-formedness errors.

(2) Why would I NOT quote all of my attributes and explicitly include 
such optional tags as <html> and </html>?  Possible answers include 
savings of bytes, and not looking "cool" to the new kids.
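
To make the robustness point in (1) concrete, here is a minimal sketch in
Python.  It is only an illustration: the sample fragment and the helper
names are made up, and Python's html.parser is a lenient tokenizer, not a
conforming HTML5 parser, so it merely stands in for "a parser that keeps
going".  The same sloppy input is rejected outright by an XML parser.

```python
from html.parser import HTMLParser
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical fragment: unquoted attribute, unclosed <br>, no </p>.
SLOPPY = "<p class=note>Hello<br>world"
# The same content written so that it is also well-formed XML.
TIDY = '<p class="note">Hello<br/>world</p>'


class TagCollector(HTMLParser):
    """Records the name of every start tag the lenient parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parses_as_xml(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


def html_start_tags(text):
    """Return the start tags recovered by the lenient HTML tokenizer."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(parses_as_xml(SLOPPY), html_start_tags(SLOPPY))
    print(parses_as_xml(TIDY), html_start_tags(TIDY))
```

The XML side fails on the first well-formedness error, which is the
behaviour a reader would hit with application/xhtml+xml; the lenient side
recovers the same two elements from either spelling.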

Those are the big ones.  Depending on the answers, there are secondary
issues to work out.  Handling of script tags is an example.  Even empty
ones need an explicit </script> end tag (not <script/>) when content is
served as text/html.  If non-empty, scripts that make use of the
less-than sign (commonly used in the idiom for iterating over an array)
are problematic for an XML parser.  Choices include making such scripts
external, or making use of (frankly ugly) workarounds such as:

<script>//<![CDATA[
... your script here...
//]]></script>
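
A rough way to check the XML half of that workaround, sketched in Python
with the standard library's minidom.  The loop and the helper names are
invented for illustration; only the wrapper itself comes from the snippet
above.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical script body using the usual array-iteration idiom.
LOOP = "for (var i = 0; i < a.length; i++) { total += a[i]; }"

# The same script, bare and wrapped in the commented-out CDATA section.
BARE = "<script>" + LOOP + "</script>"
WRAPPED = "<script>//<![CDATA[\n" + LOOP + "\n//]]></script>"


def well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        minidom.parseString(markup)
        return True
    except ExpatError:
        return False


def script_text(markup):
    """Return the script's character data as an XML DOM exposes it."""
    root = minidom.parseString(markup).documentElement
    # Text and CDATASection nodes both carry their content in .data.
    return "".join(node.data for node in root.childNodes)


if __name__ == "__main__":
    print(well_formed(BARE))     # the bare "<" is rejected
    print(well_formed(WRAPPED))  # the CDATA section protects it
    print(script_text(WRAPPED))
```

Under XML parsing the CDATA markers disappear and the script text starts
and ends with a "//" comment line.  Under text/html the script element's
content is not parsed for markup, so the markers stay in the text, where
the leading "//" turns each one into a JavaScript comment.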

> Michael Kay
> Saxonica

- Sam Ruby

(*) Templates are a common scenario where the conclusion is radically
different.  My weblog is formed using templates, and served as
application/xhtml+xml.  This is not something that I recommend others
emulate unless they wish to make a SIGNIFICANT investment of time in
making it work.

Received on Friday, 21 January 2011 14:17:37 UTC