- From: Trejkaz Xaoza <trejkaz@xaoza.net>
- Date: Tue, 9 Nov 2004 08:05:01 +1100
- To: James Cerra <jfcst24_public@yahoo.com>
- Cc: www-html@w3.org
- Message-Id: <200411090807.32309.trejkaz@xaoza.net>
On Sun, 7 Nov 2004 09:48, James Cerra wrote:
> Now HTML was originally designed for transport over
> the web via HTTP and identification via MIME types.
> However, there are cases where (X)HTML may be
> transmitted with no MIME type information available,
> e.g. reading a file from a FAT disk or through standard
> io. I'm writing a program where this type of
> situation may come up. The specs are silent on the
> issue, so: What are the recommendations for
> identifying the document's type when MIME or HTTP is
> not available?

Easy enough. If it starts with "<?xml" it's an XML document. If its root
element is then in the XHTML1 namespace, it's XHTML1. If it's in the XHTML2
namespace, it's XHTML2.

You can see that Microsoft already do this to some extent with their WordML
format (it shows a different icon to other XML files, even when it's named
file.xml). The file(1) command on *nix also tries to distinguish between
different XML formats to determine the MIME type from the content.

If it doesn't start with "<?xml" but has a DOCTYPE near the top, then it's
SGML, and you apply similar rules based on what you see after it.

TX

--
Email: Trejkaz Xaoza <trejkaz@xaoza.net>
Web site: http://xaoza.net/
Jabber ID: trejkaz@jabber.xaoza.net
GPG Fingerprint: 9EEB 97D7 8F7B 7977 F39F A62C B8C7 BC8B 037E EA73
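
The rules described in the reply above could be sketched roughly as follows. This is only an illustration, not anything taken from a spec: the function names (sniff, root_namespace), the returned labels, and the 1024-byte window used for "near the top" are all made up for the example, and the XHTML2 namespace URI is an assumption based on the XHTML 2.0 working drafts.

```python
import re
import xml.etree.ElementTree as ET

XHTML1_NS = "http://www.w3.org/1999/xhtml"
# Assumption: namespace URI as used in the XHTML 2.0 working drafts.
XHTML2_NS = "http://www.w3.org/2002/06/xhtml2"


def root_namespace(data: bytes):
    """Return the namespace URI of the root element, or None."""
    parser = ET.XMLPullParser(events=("start",))
    parser.feed(data)  # parse errors are queued, not raised here
    try:
        for _event, elem in parser.read_events():
            tag = elem.tag  # "{uri}local" when the element is namespaced
            if tag.startswith("{"):
                return tag[1:].split("}", 1)[0]
            return None
    except ET.ParseError:
        pass  # malformed before the root element was seen
    return None


def sniff(data: bytes) -> str:
    """Guess a document type from content alone (hypothetical labels)."""
    # Skip a UTF-8 byte order mark if one is present.
    head = data[3:] if data.startswith(b"\xef\xbb\xbf") else data

    if head.startswith(b"<?xml"):
        ns = (root_namespace(head) or "").rstrip("/")
        if ns == XHTML1_NS:
            return "xhtml1"
        if ns == XHTML2_NS:
            return "xhtml2"
        return "xml"

    # No XML declaration: look for a DOCTYPE "near the top"
    # (the 1024-byte window is an arbitrary choice for this sketch).
    match = re.search(rb"<!DOCTYPE\s+([^\s>\[]+)", head[:1024], re.IGNORECASE)
    if match:
        return "html" if match.group(1).lower() == b"html" else "sgml"
    return "unknown"
```

Something like sniff(open(path, "rb").read()) would then stand in for the missing MIME type. Note that an XML document is allowed to omit the "<?xml" declaration, so a stricter sniffer would want a fallback for that case.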
Received on Monday, 8 November 2004 21:07:14 UTC