- From: Lachlan Hunt <lachlan.hunt@iinet.net.au>
- Date: Tue, 09 Nov 2004 09:23:35 +1100
- To: trejkaz@xaoza.net
- CC: James Cerra <jfcst24_public@yahoo.com>, www-html@w3.org
Trejkaz Xaoza wrote:
> On Sun, 7 Nov 2004 09:48, James Cerra wrote:
>> What are the recommendations for
>> identifying the document's type when MIME or HTTP is
>> not available?
>
> If it starts with "<?xml" it's an XML document.  If it is then in the XHTML1
> namespace, it's XHTML1.  If it's in the XHTML2 namespace, it's XHTML2.

That is not always reliable.  Hixie has explained in detail [1] the cases
where it will not work.  Although, technically, the description quoted
below is about sniffing documents that were sent as text/html, similar
rules should apply where no MIME information is available from elsewhere.
I'd recommend you do as Anne already mentioned and use the file extension,
as Mozilla does.

----
+ You can't sniff for the five characters "<?xml" because:

   - The <?xml ... ?> header is optional per Appendix C, and it is
     recommended not to include it as it causes IE6 to trigger quirks
     mode.

   - SGML can also contain PIs (see the example below).

...

  e.g. what language is this text/html document in?:

   <?xml this is not?>
   <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [
     <!-- SYSTEM "not XHTML" -->
   ]>
   <!-- -- -->
    This is a comment. This document is not XHTML.
    <html xmlns="http://www.w3.org/1999/xhtml"/>
    Ok, I'm done now.
   -->
   <html>
    <title> Need a title in HTML4! </title>
    <p> This is a valid HTML4 document.
   </html>

...

* The HTML working group said that UAs should not do this:
     http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html
----

[1] http://hixie.ch/advocacy/xhtml

-- 
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/     Rediscover the Web
http://SpreadFirefox.com/  Igniting the Web
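
A rough Python sketch of the point made in this message, in case it helps: it contrasts the "<?xml" prefix check with a lookup keyed on the file extension. The function names and the extension table are illustrative assumptions, not taken from Mozilla or any specification, and the sample string holds only the first two lines of the HTML4 example quoted in the message.

```python
import os

# Hypothetical extension table, for illustration only; real user agents
# maintain their own mappings.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".xhtml": "application/xhtml+xml",
    ".xht": "application/xhtml+xml",
    ".xml": "application/xml",
}


def naive_sniff(text):
    """The unreliable heuristic: anything starting with '<?xml' is XML."""
    if text.lstrip().startswith("<?xml"):
        return "application/xml"
    return "text/html"


def type_from_extension(filename, default="text/html"):
    """Guess a media type from the file extension alone (case-insensitive)."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(ext, default)


# Opening lines of the HTML4 document quoted in the message. SGML allows
# processing instructions, so the "<?xml" prefix proves nothing.
HTML4_SAMPLE = (
    "<?xml this is not?>\n"
    '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" [\n'
)

if __name__ == "__main__":
    print(naive_sniff(HTML4_SAMPLE))          # misclassified as XML
    print(type_from_extension("page.html"))   # decided by the name instead
```

The extension lookup never reads the content, so it cannot be fooled by an SGML processing instruction; the trade-off is that it trusts whoever named the file.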
Received on Monday, 8 November 2004 22:24:16 UTC