Re: Detecting XHTML

> Henri Sivonen scripsit:
> > What about an that's 100% convergent with XML
> > and has a mode switch for opting into? It turns out that we
> > already have that! It's called XHTML5 and the mode switch is the
> > Content-Type: application/xhtml+xml HTTP header. Even better than some
> > yet-to-be-defined mode, it's already supported by the latest
> > versions of the top browsers (if you count IE9 as the latest version
> > of IE).
> This troubles me, because it means that in order for XHTML5 to be viewed
> in a browser as the author intended, it must be:
> 1) served from an HTTP server
> 2) on which the author can control the Content-Type: settings.
> If either of these conditions is violated, the XHTML will be processed
> as HTML.  That's bad, and there should be a document-internal flag that
> forces the HTML parser to use the XHTML parser instead.  The obvious
> candidate is an XML declaration, but I suppose you will tell me that
> there are N tag-soup documents with XML declarations on them.

Most modern operating systems make use of file extensions of some sort,
which also map to MIME types, and any software that serves up a stream
of content will usually be able to recognize those MIME types as well.
Frankly, the simplest solution to the problem is to treat ALL HTML as
XHTML up to the point where it fails to parse, and at that stage to
reparse it with a more lenient parser. The delay (likely no more than a small
fraction of a second) will tend to encourage the use of well-formed XML
content by those who absolutely have to optimize.
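
To make the strict-first idea concrete, here is a rough Python sketch. It is
only an illustration under my own assumptions: parse_document and
TagCollector are hypothetical names, the standard-library XML parser stands
in for a browser's XHTML parser, and html.parser stands in for the lenient
tag-soup parser (it is not the HTML5 parsing algorithm).

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient fallback: records start tags and tolerates malformed markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_document(text):
    """Return (mode, tags): try the strict XML parser first, then fall back."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not well-formed XML: reparse the same text with the lenient parser.
        collector = TagCollector()
        collector.feed(text)
        collector.close()
        return "html", collector.tags
    # Strip any "{namespace}" prefix ElementTree adds to element names.
    return "xml", [el.tag.rsplit("}", 1)[-1] for el in root.iter()]
```

The cost of the fallback is the second pass over the input, which is the
small delay mentioned above; well-formed documents never pay it.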

Kurt Cagle

> --
> John Cowan
> Original line from The Warrior's Apprentice by Lois McMaster Bujold:
> "Only on Barrayar would pulling a loaded needler start a stampede toward
> one."
> English-to-Russian-to-English mangling thereof: "Only on Barrayar you risk to
> lose support instead of finding it when you threat with the charged weapon."

Received on Tuesday, 4 January 2011 20:22:37 UTC