Re: Use cases from Sam Ruby on 2011-01-05 (public-html-xml@w3.org from January 2011)

From: Sam Ruby <rubys@intertwingly.net>
Date: Wed, 05 Jan 2011 10:38:37 -0500
To: Anne van Kesteren <annevk@opera.com>
CC: public-html-xml@w3.org, Henri Sivonen <hsivonen@iki.fi>
Message-ID: <4D24907D.3060609@intertwingly.net>
On 01/05/2011 09:40 AM, Anne van Kesteren wrote:
> On Wed, 05 Jan 2011 15:26:24 +0100, Sam Ruby <rubys@intertwingly.net>
> wrote:
>> On 01/05/2011 08:31 AM, Anne van Kesteren wrote:
>>> On Wed, 05 Jan 2011 13:15:20 +0100, Sam Ruby <rubys@intertwingly.net>
>>> wrote:
>>>> Meanwhile, there is clear value in degrading gracefully by serving the
>>>> same content as text/html to clients that don't support
>>>> application/xhtml+xml, even if such clients don't get the benefit of
>>>> the full functionality.
>>>
>>> But soon all browsers will support XHTML.
>>
>> Define "soon". Define "all". Heck, define "browser" as the use cases I
>> proposed[1] deal with feed readers. Many of which attack HTML with
>> regular expressions. Including, embarrassingly enough, one of the very
>> libraries that Planet Venus depends on[2].
>>
>>> Internet Explorer not supporting XHTML was a problem for people wishing
>>> to use XHTML. But that is being solved. What other problem is there? (I
>>> would say, that XHTML is too hard, but that is not being debated.)
>>
>> I don't recommend to people that they serve XHTML unless they have a
>> compelling reason to do so; but for those that do, I recommend
>> constructing the XHTML in such a way that it can be parsed correctly
>> as HTML unless there is a compelling reason not to.
>>
>> I've given my reason in the form of a use case. One that I will point
>> out is not atypical or hypothetical. Can either you or Henri give any
>> rationale for your pushing back on this?
>
> I think that resources need to be processed unambiguously. Having
> resources processed sometimes as XML and sometimes as HTML depending on
> the user agent is very fragile and does not lead to interoperability. In
> fact, I think it will lead to divergence (localized perhaps). I.e. authors
> not aware of this happening will optimize for their user agent of choice
> (likely the market leader) and e.g. use features exclusive to HTML or
> XML. (This has happened before. The difference was that instead of XML
> and HTML you had IE-flavored HTML and Netscape-flavored HTML.)

I note that you don't define "soon", "all", or "browser".

Additionally, there are feed formats, such as RSS 2.0, which do not
provide a means to clearly identify the MIME type of descriptions.
Until all user agents are rewritten "properly" and all legacy document
formats are retired, I don't believe that use cases should be rejected
based on what an ideal world would look like, any more than I believe
that HTML5 should ignore supporting the vast corpus of existing content
as a requirement.

There are many scenarios where consumers will have to sniff or guess
the content type. Hopefully, over time, more will converge on or default
to HTML5 as the way to parse unknown or mislabeled content. Ideally, most
content will degrade gracefully with this choice, even if the original
author's intent was XHTML 1.0 or XHTML5. In fact, this generally is the
case.
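
To make the fallback concrete, here is a minimal Python sketch of what
such a consumer might do. It is an illustration only, not code from
Planet Venus or any real feed reader: the names extract_text and
TextCollector are hypothetical, and the standard library's tolerant
html.parser stands in for a real HTML5 parser, which it is not.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers character data with the tolerant
    stdlib HTML parser (a stand-in, not a conforming HTML5 parser)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(fragment, declared_type=None):
    """Return (parser_used, text) for a feed description.

    Assumption: the XML path is tried only when the content is
    explicitly labeled application/xhtml+xml. Unlabeled content
    (e.g. an RSS 2.0 description) or ill-formed XML is parsed as HTML.
    """
    if declared_type == "application/xhtml+xml":
        try:
            # Wrap in a single root so a fragment can be parsed as XML.
            root = ET.fromstring("<div>" + fragment + "</div>")
            return "xml", "".join(root.itertext())
        except ET.ParseError:
            pass  # mislabeled or ill-formed: fall through to HTML
    collector = TextCollector()
    collector.feed(fragment)
    collector.close()
    return "html", "".join(collector.chunks)


# Markup written to work both ways yields the same text on either path.
print(extract_text("<p>a<br/>b</p>", "application/xhtml+xml"))
print(extract_text("<p>a<br/>b</p>"))
# Mislabeled tag soup still produces something usable.
print(extract_text("<p>Hello<br>world", "application/xhtml+xml"))
```

The point of the sketch is the first two calls: when the XHTML is
constructed so that it also parses as HTML, the consumer's guess about
the type stops mattering.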

Meanwhile, many users of Planet Venus are serving their content as
text/html. I continue to have no way to prevent that, nor do I have
any desire to do so.

>> [1] http://lists.w3.org/Archives/Public/public-html-xml/2011Jan/0025.html
>> [2] http://intertwingly.net/blog/2010/12/30/Dealing-with-HTML-in-Feeds

- Sam Ruby
Received on Wednesday, 5 January 2011 15:40:11 UTC