Re: text/html for html and xhtml from William F Hammond on 2008-04-21 (public-html@w3.org from April 2008)

From: William F Hammond <hammond@csc.albany.edu>
Date: Mon, 21 Apr 2008 11:22:50 -0400
To: public-html@w3.org, www-math@w3.org, www-svg@w3.org
Message-ID: <i7y777kzat.fsf@hilbert.math.albany.edu>

(Narrowed to w3.org lists)

Boris Zbarsky <bzbarsky@MIT.EDU> writes:

>> 1.  Many search engines appear not to look at application/xhtml+xml.
>
> That seems like a much simpler thing to fix in search engines than in
> the specification and UAs, to be honest.

Technically yes, but politically no.  Search engines seem to be market
driven.  "application/xhtml+xml" appears so far to have an insignificant
share of the market.  Of course, that very fact is compounded by a
vicious cycle: searches do not point to the content that is there.
So why should an author post such content if it cannot be found?
Even when xhtml and pdf are on the web side by side, most search
engines seem to extract text from the pdf and pass by the xhtml even
though, for web browsing, the xhtml is far superior.

>                                            I don't see this as a
> compelling reason to add complexity to the parsing model.

Not all that complex.  Even if, arguendo, it is, the issue is between
one-time complexity for half a dozen user agent authors and many-time
complexity for tens of thousands of content providers, who presently
find it necessary to make burdensome arrangements for http content
negotiation.
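
To illustrate the kind of arrangement meant here, below is a minimal
sketch of server-side content negotiation, not any particular server's
implementation.  The function names parse_accept and choose_media_type
are hypothetical, and the Accept parsing is deliberately simplified
(it ignores parameters other than q).

```python
def parse_accept(header):
    """Parse an HTTP Accept header into {media_type: q} (simplified)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_media_type(accept_header):
    """Pick the Content-Type under which to serve one XHTML document.

    Assumption: application/xhtml+xml is used only when the client
    lists it explicitly; a bare wildcard falls back to text/html.
    """
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html",
                       prefs.get("text/*", prefs.get("*/*", 0.0)))
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"
```

Every content provider who wants to serve XHTML to capable browsers
while remaining visible to everything else needs some equivalent of
this logic (or server configuration that amounts to it), which is the
many-time cost referred to above.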

>> 2.  Many content providers have reported that they are stranded,
>>     i.e., their contractors who receive the content by "upload" for
>>     subsequent placement under the eye of an http server do not
>>     support application/xhtml+xml.
>
> This is the argument for any type of content-type sniffing, no?

It's not.  It's merely saying that the boundary between text/html
and application/xhtml+xml is (i) artificial and (ii) not well understood
by content providers.

>> (And, of course, "text/xml" and "application/xml" are non-specific
>> mimetypes for which there is no base namespace.  They are sane content
>> channels for web browsers only when display is entirely controlled
>> with something like CSS.)
>
> Uh...  Have you tested this? ...

I hope you are not disagreeing with my characterization of the two
umbrella XML mimetypes from a standards perspective.  Long term those
mimetypes might better be handled by XML triage agents than by web
browsers.

> If you're talking about UAs other than those three that support
> application/xhtml+xml,  ...

Mozilla [Gecko], Opera, Safari, as you say, but also Amaya and
IE-with-MathPlayer, possibly others that do not come to mind.

See David Carlisle's message related to this:
http://lists.w3.org/Archives/Public/www-math/2008Apr/0190.html

                                    -- Bill

Received on Monday, 21 April 2008 15:23:25 UTC