Unexpected behaviour converting XHTML -> HTML

On my Apache server I use content negotiation to send out the XHTML pages, so 
if the browser recognises it (anything except M$IE) it gets 
"application/xhtml+xml"; otherwise the pages are served as "text/html". I do 
this by having a bunch of files named like foo.xhtml, with symlinks to them 
like foo.html -> foo.xhtml.
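
For illustration, a minimal sketch of how symlinks like these could be created in bulk. The function name link_html_aliases is hypothetical, and it assumes all the .xhtml files sit in a single directory:

```shell
# Hypothetical helper: for every foo.xhtml in a directory, create a
# foo.html symlink pointing at it (foo.html -> foo.xhtml).
link_html_aliases() {
    dir=${1:-.}
    for src in "$dir"/*.xhtml; do
        [ -e "$src" ] || continue            # glob matched nothing
        base=$(basename "$src" .xhtml)
        # Relative target, so the directory can be moved as a unit.
        ln -sf "$base.xhtml" "$dir/$base.html"
    done
}
```

Calling `link_html_aliases /path/to/docroot` would leave the .xhtml files untouched and only add (or refresh) the .html links next to them.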

Now I'm wondering if serving XHTML with the MIME type "text/html" is such a 
good idea, so I want to change foo.html to be in HTML 4.01 format, rather 
than a symlink to an XHTML page.

So I run:
$ tidy -ashtml -qi foo.xhtml > foo.html

But the file created is NOT really HTML. For a start, it has the XML headers:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">
<html lang="en-GB">
<head>

Which I don't think should be there for HTML 4.01.
It is also missing a character encoding, while I think it should have used 
UTF-8, since that is the default for XML documents.
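
To make the complaint concrete, here is a rough sketch of a check for the leftovers described above. The function name check_html_leftovers is made up for this example, and the plain substring matching is an assumption about what counts as a problem, not a real HTML 4.01 validation:

```shell
# Hypothetical check: report XHTML leftovers in a file that is meant
# to be HTML 4.01. Returns 0 if none were found, 1 otherwise.
check_html_leftovers() {
    file=$1
    status=0
    # An XML declaration has no place in an HTML 4.01 document.
    if grep -qF '<?xml' "$file"; then
        echo "XML declaration present"
        status=1
    fi
    # The doctype should no longer name an XHTML DTD.
    if grep -qF 'DTD XHTML' "$file"; then
        echo "XHTML doctype present"
        status=1
    fi
    # Look for any charset declaration (e.g. in a meta element).
    if ! grep -qi 'charset=' "$file"; then
        echo "no charset declaration found"
        status=1
    fi
    return $status
}
```

Run against the foo.html produced above, `check_html_leftovers foo.html` would flag all three points for the output quoted in this message.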

Received on Saturday, 14 August 2004 11:42:37 UTC