Re: ISSUE-4 - versioning/DOCTYPEs from Sam Ruby on 2010-05-17 (public-html@w3.org from May 2010)

From: Sam Ruby <rubys@intertwingly.net>
Date: Mon, 17 May 2010 09:01:21 -0400
To: Henri Sivonen <hsivonen@iki.fi>
CC: public-html@w3.org, Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, Boris Zbarsky <bzbarsky@MIT.EDU>, Daniel Glazman <daniel.glazman@disruptive-innovations.com>
Message-ID: <4BF13E21.2000504@intertwingly.net>
On 05/17/2010 04:57 AM, Henri Sivonen wrote:
> "Boris Zbarsky"<bzbarsky@MIT.EDU>  wrote:
>
>> On 5/16/10 10:22 AM, Daniel Glazman wrote:
>>>> Hold on. We were just talking about wysiwyg HTML/XHTML editors,
>>>> no? Those are very much NOT text editors.
>>>
>>> Guys, since you mentioned BlueGriffon, Nvu and Kompozer and since
>>> I am the original guilty one for these three editors, let me say
>>> a word. Leif, what precisely do you miss? A dialog for polyglot
>>> documents that lets you select the editing mode when a document is
>>> loaded? A way to save a document with a given MIME type?
>>
>> I think what Leif would like is some way to indicate in-document
>> that the document should be edited in polyglot mode so that all
>> editors would automatically do that.
>
> It's unclear to me what the use case is.
>
> I'm aware of three use cases for polyglot documents:
>
> 1) Serving XHTML+SVG or XHTML+MathML or XHTML+SVG+MathML content as
> application/xhtml+xml to Gecko, WebKit, Presto and Trident+MathPlayer
> but serving the same bytes as text/html to Trident (sans MathPlayer)
> in order to be able to use SVG and/or MathML inline where supported
> while still allowing users of unextended IE to read the (X)HTML
> content of the document.
>
> 2) Serving application/xhtml+xml that doesn't use any non-HTML
> features to Gecko, WebKit and Presto as a matter of pro-XML principle
> but serving the same bytes to Trident as text/html because the
> author's pro-XML principle doesn't go far enough to exclude IE users
> from his/her audience.
>
> 3) Serving content as text/html but using an XML parser to process
> the content in a non-browser scenario where the party operating the
> XML parser has the power to make the publisher supply the content in
> a form that is safe for XML parsers.
>
> Leif, are there additional use cases that I'm missing?

As someone who serves content as application/xhtml+xml to browsers that 
support it, and the same content as text/html to browsers that don't, 
none of the descriptions above resonate with me.  Perhaps that is because
of the manner in which you chose to express these cases.
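
For illustration only, here is a minimal Python sketch of that kind of
negotiation: the same bytes go out as application/xhtml+xml when the
client's Accept header lists that type, and as text/html otherwise.  The
function names choose_media_type and response_headers are hypothetical,
and this is an assumption about how such a setup might look, not a
description of any particular site.

```python
def choose_media_type(accept_header: str) -> str:
    """Pick the media type for a polyglot document from an Accept header.

    Returns application/xhtml+xml only when the client lists it explicitly
    with a non-zero q-value; a bare */* is not treated as XHTML support.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"


def response_headers(accept_header: str) -> dict:
    """Build response headers for the negotiated representation."""
    return {
        "Content-Type": choose_media_type(accept_header) + "; charset=utf-8",
        # The response depends on the request's Accept header, so caches
        # need to be told to key on it.
        "Vary": "Accept",
    }
```

A client that sends only "image/gif, image/jpeg, */*" gets text/html
from this sketch, because the wildcard alone is deliberately not taken
as a claim of XHTML support.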

> Use case #3 is already obsolete. HTML parsers that expose
> XML-parser-compatible APIs are already available, so the content
> consumer should use an HTML parser instead of an XML parser. Since
> use case #3 is already obsolete, it's not useful to cater to the use
> case.

As someone who codes in Ruby on Rails, I can assure you that #3 is not 
obsolete.

> Use case #2 is harmful. When the document is well-formed, serving
> content as application/xhtml+xml to browsers deprives the users of
> optimizations that browsers have only for text/html. The well-known
> Gecko example used to be that the XML code path didn't support
> incremental rendering. That has been fixed, but currently in Gecko
> the XML code path doesn't benefit from speculative resource fetching.
> At least at one point in WebKit, the XML code path involved an
> additional UTF-16 to UTF-8 conversion and an additional UTF-8 back to
> UTF-16 conversion of the content compared to the HTML code path.
> (I'm not sure if this is still the case in WebKit.) Worse, when the
> document isn't well-formed, users of application/xhtml+xml-capable
> browsers get an error message while IE users get to read the content.
> Since use case #2 is harmful, I think it is not useful to cater to
> the use case.

I'm confident that all that will be addressed in the fullness of time, 
particularly if the marketplace for browsers remains competitive.

> Use case #1 is on the way to obsolescence and while it's not obsolete
> yet, it is a specialist use case that affects very few people. This
> use case becomes obsolete either when IE versions prior to IE9 sink
> to low enough market share that authors no longer care (enabling the
> use of application/xhtml+xml unconditionally) or when versions of
> Gecko, WebKit and Presto that don't support SVG and MathML in
> text/html sink to low enough market share that authors no longer care
> (enabling the use of text/html unconditionally). My expectation is
> that WebKit and Presto implement the HTML5 parsing algorithm and the
> relatively fast upgrade cycle of Firefox, Opera, Chrome and Safari
> takes care of the old versions becoming irrelevant sooner than IE6
> through IE8 become irrelevant to authors. As for the use case not
> being obsolete quite yet, currently this use case mainly applies to
> very few specialist blogs and the Venus aggregator. Even if (X)HTML
> editors had support for this use case, the required server
> configuration tweaks would keep deployment limited to specialists.
> While I think this use case is legitimate e.g. in the context of
> Jacques Distler's blog (http://golem.ph.utexas.edu/~distler/blog/), I
> think the WG shouldn't treat this use as something that J. Random Web
> Author is going to need support for or as something that J. Random
> Web Author should attempt.
>
> In general, I get a feeling that polyglot documents have more
> intellectual appeal as a spec lawyering puzzle than they have
> practical usefulness. I think the WG shouldn't fall into the trap of
> chasing puzzle appeal instead of Solving Real Problems.

Again, as an author of polyglot documents, that description does not 
resonate with me.

As for me, I simply want to be conservative in what I send.  This is the
first half of the robustness principle.  This enables people who have
off-the-shelf XML parsers to process my pages.  Not because they hold
any special power over me, but simply because I enabled it.
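
As a rough sketch of what that enables, the following Python uses only
the standard library's xml.etree.ElementTree to read a small polyglot
page.  The sample markup in POLYGLOT_PAGE and the helper name
extract_links are made up for illustration; the point is only that a
well-formed page parses with a stock XML parser, while ordinary tag
soup does not.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page: valid as text/html and well-formed as XML.
POLYGLOT_PAGE = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta charset="utf-8"/><title>Example</title></head>
<body><p>First<br/>second</p><p><a href="/next">Next</a></p></body>
</html>"""


def extract_links(markup: str) -> list:
    """Return the href of every XHTML <a> element in the document."""
    root = ET.fromstring(markup)  # raises ET.ParseError if not well-formed
    return [a.get("href") for a in root.iter("{%s}a" % XHTML_NS)]


if __name__ == "__main__":
    print(extract_links(POLYGLOT_PAGE))
    try:
        # An unclosed <br> is fine for an HTML parser but not for XML.
        extract_links("<p>First<br>second</p>")
    except ET.ParseError as err:
        print("not well-formed:", err)
```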

> P.S. What does all this have to do with "versioning"? And "DOCTYPEs"
> in this context looks to me like a (bad) solution in search of a
> problem...

- Sam Ruby
Received on Monday, 17 May 2010 13:02:29 UTC