- From: David Dorward <david@dorward.me.uk>
- Date: Thu, 23 Mar 2006 08:27:12 +0000
- To: "tadeusz szewczyk I rebel:art" <rebelart@onreact.com>
- Cc: www-validator@w3.org
On Wed, Mar 22, 2006 at 08:31:39PM +0100, tadeusz szewczyk I rebel:art wrote:
> While cleaning up my XHTML strict on my website http://onreact.com I
> wanted to be very accurate. So I tested it with several
> validators. I was delighted to find out the W3C Markup Validator did
> not return any errors after I was done. Then I wanted to make sure
> and retested with others like the one at Validome. It says "The
> Document is not valid XHTML 1.0 Strict" and that I have the
> following error: "Unexpected char in row 55 and column 111; this
> char is not allowed within charset (utf-8) that you use."
>
> Which one is right?

Both - since what you have is not a validation issue, and it hits some
somewhat contradictory parts of various specs.

Your webserver fails to send a character encoding in your HTTP
headers.

According to the rules for XML documents, if you do not have an XML
prolog declaring a different character encoding then you must use
UTF-8 (or UTF-16 with a byte order mark).

Additionally, section 5.1 of the XHTML 1.0 spec doesn't allow you to
serve XHTML documents as text/html unless you "follow the guidelines
set forth in Appendix C". However, Appendix C is informative, not
normative, so it is questionable whether you have to follow it or not,
and C.9 is couched in language which says "you may not want to". If
you take the wording to mean that you must follow the advice, then such
an XML prolog is effectively forbidden and you are restricted to using
UTF-8 for XHTML documents served as text/html under Appendix C.

However, the HTTP specification states that if you don't specify a
character encoding, and you are using a text/something content type,
then it defaults to ISO-8859-1.

Also, you have:

<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />

This is supposed to be somewhere HTTP servers can look to get extra
HTTP headers ... but none (that I know of) do ... and some clients pay
attention to such elements.
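
To see where the disagreement comes from, it can help to list what each
source actually declares. The following is only a rough sketch in
Python, not how either validator works: the function names are made up
for illustration, the regular expressions are deliberately simplistic,
and the sample header and markup are assumed from the description above
rather than fetched from the site.

```python
import re


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    if not value:
        return None
    match = re.search(r'charset\s*=\s*"?([A-Za-z0-9._:-]+)"?', value, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_xml_declaration(body):
    """Return the encoding named in a leading XML declaration, or None."""
    match = re.match(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body)
    return match.group(1).decode("ascii").lower() if match else None


def charset_from_meta(body):
    """Return the charset from a <meta http-equiv="content-type"> element, or None."""
    tag = re.search(
        rb'<meta[^>]+http-equiv\s*=\s*["\']content-type["\'][^>]*>',
        body,
        re.IGNORECASE,
    )
    if not tag:
        return None
    content = re.search(rb'content\s*=\s*["\']([^"\']*)["\']', tag.group(0), re.IGNORECASE)
    if not content:
        return None
    return charset_from_content_type(content.group(1).decode("ascii", "replace"))


def declared_encodings(content_type_header, body):
    """Collect the encoding declared by each of the three sources."""
    return {
        "http_header": charset_from_content_type(content_type_header),
        "xml_declaration": charset_from_xml_declaration(body),
        "meta_element": charset_from_meta(body),
    }


# Assumed inputs: a Content-Type header without a charset, no XML
# declaration, and the meta element quoted in this message.
example = declared_encodings(
    "text/html",
    b'<html><head><meta http-equiv="content-type" '
    b'content="text/html; charset=ISO-8859-1" /></head></html>',
)
print(example)
```

With inputs like these, only the meta element names an encoding. A
client applying the XML rules falls back to UTF-8, while one applying
the HTTP default for text/* assumes ISO-8859-1, so two tools can
disagree about the same bytes.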

If you want to stick to XHTML and take the advice in Appendix C, then
you should convert your document to UTF-8 AND modify your server so it
outputs an HTTP Content-Type header that also states you are using
UTF-8.

Personally, I'd switch to the better supported and less weird HTML
4.01 - and then modify the server so it claims I was using the
character encoding I was already using.

--
David Dorward                                      http://dorward.me.uk
Received on Thursday, 23 March 2006 08:27:19 UTC