- From: Herr Christian Wolfgang Hujer <Christian.Hujer@itcqis.com>
- Date: Sat, 8 Mar 2003 20:18:39 +0100
- To: Jim Dabell <jim-www-html@jimdabell.com>, www-html@w3.org, "Jesper Tverskov" <jesper.tverskov@mail.tele.dk>, "basil crow" <basilcrow@cox.net>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

On Saturday, 8th of March, Jesper Tverskov, basil crow and Jim Dabell wrote:
> [discussion about current HTML versions and MIME types]

I want to confirm that it is incorrect to send XHTML 1.1 as MIME Type text/html.

The Internet and WWW are based on RFCs and Recommendations. If you don't follow all of the currently valid ones, why follow any of them at all?

The MIME Type text/html is for SGML based HTML. XHTML 1.0 is an exception to that rule. This exception was made only to enable a smoother transition from old SGML based HTML / tag soup HTML to well-formed, valid, XML based XHTML.

The correct MIME Type for sending XHTML 1.1 is application/xhtml+xml, not text/html, as Jim already said.

But Internet Explorer doesn't accept application/xhtml+xml. Internet Explorer knows nothing about XHTML and even less about the MIME Type application/xhtml+xml. If you correctly serve XHTML 1.1 only, without an HTML 4.01 alternative, Internet Explorer users won't be able to see your content and will instead be presented with a download dialog.

I want to add that you could use the following scenario to solve this problem and deliver your XHTML 1.1 content correctly.

Use an XSLT near-identity transformation which converts XHTML 1.1 to HTML 4.01. Of course, this is not possible if you're using the Ruby Module, but I assume you aren't.

Use .xhtml as file extension for XHTML 1.1.
Use .html as file extension for HTML 4.01.

Now configure your webserver so that it delivers the .xhtml files as MIME Type application/xhtml+xml to those browsers that accept XHTML (these browsers announce this to the server by including application/xhtml+xml in the list value of the HTTP Accept: header field). Other browsers will be delivered the HTML 4.01 version.

To do so safely, the HTML version currently needs a higher priority, because the reload function of MS Internet Explorer is broken and sends Accept: */* instead of a qualified list.
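The selection described above can be sketched in Python. This is only a simplified illustration, not the algorithm any particular webserver implements: the function names (parse_accept, client_quality, choose_variant) and the VARIANTS table are hypothetical, and the server-side priorities 1.0 and 0.999 are assumed values chosen so that the HTML version wins when the browser sends only Accept: */*.

```python
def parse_accept(header):
    """Parse an Accept header value into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when the client gives none
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def client_quality(media_type, ranges):
    """Return the q of the most specific media range matching media_type."""
    main_type = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == main_type + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


# Assumed server-side priorities: HTML slightly preferred over XHTML.
VARIANTS = {
    "text/html": 1.0,
    "application/xhtml+xml": 0.999,
}


def choose_variant(accept_header):
    """Pick the variant with the highest (client q * server priority)."""
    ranges = parse_accept(accept_header or "*/*")
    scored = [(client_quality(t, ranges) * qs, t) for t, qs in VARIANTS.items()]
    score, media_type = max(scored)
    return media_type if score > 0 else None


print(choose_variant("*/*"))                                    # text/html
print(choose_variant("application/xhtml+xml,text/html;q=0.9"))  # application/xhtml+xml
```

A browser that lists application/xhtml+xml with a higher q than text/html gets the XHTML 1.1 file, while a bare Accept: */* falls back to HTML 4.01 because of the slightly lower server-side priority of the XHTML variant.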
For Apache, the following lines in your .htaccess file set up this negotiation:

    Options +MultiViews
    AddType text/html;charset=US-ASCII .html
    AddType application/xhtml+xml;charset=UTF-8;qs=0.999 .xhtml

This example assumes that text/html is delivered in US-ASCII and application/xhtml+xml is delivered in UTF-8, which is recommended for the following reasons.

Older HTML versions stated ISO-8859-1 as their default charset. Newer HTML versions state ISO-10646 / Unicode as their default charset, similar to XML. XHTML is XML, so the same default charset rules apply to both XML and XHTML. In XML, UTF-8 is the default.

Of course, using US-ASCII will work in any case, but it will increase the file size if you use characters outside the US-ASCII range. UTF-8 keeps the file size a bit smaller in that case. For instance, the German umlaut ü (u diaeresis) doesn't exist in US-ASCII, so an entity or character reference must be used: &uuml;, &#252; or &#xFC;. In UTF-8 it is the byte sequence with the binary values 1100 0011 1011 1100 (if I calculated it correctly), so it only takes two bytes, while the references in this example all need 6 bytes.

If you can follow the above charset example, you will also be aware of the problems that can arise when serving no default charset or the wrong one. My configuration example for Apache requires certain charsets to be used and will then cause no character problems in any current browser, even if the default settings of the browser are not the same as those of the documents.

I myself run a Linux system configured to use UTF-8 wherever possible, so UTF-8 is also the default charset in my web browsers. Pages that have no charset declaration in the HTTP Content-Type header, in the XML declaration or in a meta element, and that use characters instead of entities, usually show messed up characters in place of those outside the US-ASCII range until I select the charset manually (usually ISO-8859-15).
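The byte counts above can be checked with a few lines of Python. This is just a small sketch using the standard library; the variable names are mine, and the last step assumes a page stored as ISO-8859-15 without a charset declaration being read by a browser that defaults to UTF-8.

```python
import html

char = "\u00fc"  # the German umlaut ü

# UTF-8 encoding of the character, as hex and as bit groups
utf8_bytes = char.encode("utf-8")
utf8_bits = " ".join(f"{byte:08b}" for byte in utf8_bytes)

# The three ASCII-only ways of writing the same character in HTML
references = ["&uuml;", "&#252;", "&#xFC;"]
reference_sizes = {ref: len(ref.encode("ascii")) for ref in references}

# Sanity check: every reference really decodes to the same character
assert all(html.unescape(ref) == char for ref in references)

# An undeclared ISO-8859-15 page read as UTF-8: the single byte 0xFC is
# not valid UTF-8, so it is replaced by U+FFFD REPLACEMENT CHARACTER.
latin9_bytes = char.encode("iso-8859-15")
misread = latin9_bytes.decode("utf-8", errors="replace")

print(utf8_bytes.hex(), utf8_bits)  # c3bc 11000011 10111100
print(reference_sizes)
print(misread == "\ufffd")          # True
```

So the UTF-8 form takes two bytes, each of the three references takes six, and a wrongly guessed charset turns the character into a replacement mark.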
Bye
- --
ITCQIS GmbH
Christian Wolfgang Hujer
Managing Partner
Phone: +49 (0)89 27 37 04 37
Fax: +49 (0)89 27 37 04 39
E-Mail: Christian.Hujer@itcqis.com
WWW: http://www.itcqis.com/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+akIRzu6h7O/MKZkRAv1yAKCar1XZTki8fQVoqpf+hVltNDxDTACfWbAk
NMDgMOqmjp8gt3Qi6coiBmo=
=UgeJ
-----END PGP SIGNATURE-----
Received on Saturday, 8 March 2003 14:19:18 UTC