Re: new TAG issue TagSoupIntegration-54 from Mark Baker on 2006-10-26 (www-tag@w3.org from October 2006)

From: Mark Baker <distobj@acm.org>
Date: Thu, 26 Oct 2006 00:42:03 -0400
To: "Dan Connolly" <connolly@w3.org>
Cc: "Henry S. Thompson" <ht@inf.ed.ac.uk>, www-tag@w3.org
Message-ID: <c70bc85d0610252142v3d161df7m59194511ebee7882@mail.gmail.com>

Hixie writes:
> > This isn't hypothetical; it is the situation we are in today with XHTML
> > documents sent as text/html. UAs cannot use XML parsers to parse these
> > XHTML-sent-as-text/html documents, even if they could find a way to detect
> > them, because a huge fraction of such documents are ill-formed and would
> > thus render *worse* in new UAs than in legacy UAs.

I don't understand why this is an issue.  The TAG's Authoritative
Metadata finding makes it clear that the media type determines how the
document is to be interpreted, and nothing in RFC 2854 (text/html) or
the HTML family of specifications suggests that running a text/html
document through an XML parser would yield anything that meaningfully
represents what the sender was trying to convey.
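
To make that concrete, here is a rough sketch of a UA choosing its
parser from the declared media type alone, with no sniffing of the
body.  The function names and the "html"/"xml"/"unknown" labels are
made up for illustration; this is not how any particular browser is
implemented.

```python
from email.message import Message


def media_type(content_type_header: str) -> str:
    """Return the lowercased type/subtype from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters such as charset and lowercases.
    return msg.get_content_type()


def choose_parser(content_type_header: str) -> str:
    """Pick a parser label (hypothetical names) from the media type only."""
    mt = media_type(content_type_header)
    if mt == "text/html":
        # Tag-soup HTML code path, even if the body looks like XHTML.
        return "html"
    if mt in ("application/xml", "text/xml") or mt.endswith("+xml"):
        # Covers application/xhtml+xml and other XML-based types.
        return "xml"
    return "unknown"


print(choose_parser("text/html; charset=utf-8"))
print(choose_parser("application/xhtml+xml"))
```

Under these assumptions an XHTML document labelled text/html never
reaches the XML parser, which is the behaviour the finding implies.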

FWIW, I spearheaded the creation of an XHTML media type in large part
because I was concerned that without one, HTML UAs would find
themselves having to deal with XML/XHTML "isms" in their HTML code
path.  If Ian's correct about how browsers work today - which I assume
he is - it seems that they decided to tackle those issues anyhow.
Ouch.  Chalk that up to another tax on content sniffing, I reckon.

Mark.

Received on Thursday, 26 October 2006 04:42:13 UTC