- From: Lachlan Hunt <lachlan.hunt@iinet.net.au>
- Date: Wed, 09 Feb 2005 16:57:35 +1100
- To: www-validator@w3.org
Everett, Alex wrote:
> Is there something wrong with the validator or the source code?

It is a character encoding problem.

> Also, sometimes the reported error changes to question marks instead of blocks.

That is because the characters are invalid, and the validator is indicating the position of those erroneous characters by replacing them with U+FFFD (REPLACEMENT CHARACTER).

> Website:
> https://security.okstate.edu/sso/index.php

The HTTP headers for this site indicate the character encoding as ISO-8859-1:

  Content-Type: text/html; charset=ISO-8859-1

The character being complained about has the code position 146 (0x92), which is a control character within the ISO-8859-1 character repertoire. Although popular user agents interpret it as U+2019 RIGHT SINGLE QUOTATION MARK, that character does not exist in ISO-8859-1. It does, however, exist in Windows-1252, and it is one of the differences between the two encodings [1].

Solutions:

1. Declare the character encoding as Windows-1252. Note, however, that the use of proprietary character encodings is not recommended on the WWW.
2. Replace the characters with numeric character references: &#8217; or &#x2019; for that quotation mark. This is the easiest recommended solution.
3. Convert the documents to UTF-8. It's not the easiest solution, but it is the most strongly recommended. There are several references to help you do this, including my own three-part guide to Unicode [2] and Jukka Korpela's excellent character-related material [3].

[1] http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
[2] http://lachy.id.au/blogs/log/2004/12/guide-to-unicode-part-1
[3] http://www.cs.tut.fi/~jkorpela/chars/index.html

-- 
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://SpreadFirefox.com/ Igniting the Web
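
For illustration only, here is a minimal Python sketch of the encoding difference described in the message. It is not part of the original diagnosis, and the helper name to_ncr is made up for this example. It shows how the same byte (146) decodes under each encoding, and what the numeric character reference and UTF-8 forms of the quotation mark look like.

```python
# Byte 146 (0x92), the byte the validator complained about.
raw = bytes([146])

# Under ISO-8859-1 the byte maps to the C1 control character U+0092.
as_latin1 = raw.decode("iso-8859-1")

# Under Windows-1252 the same byte is U+2019 RIGHT SINGLE QUOTATION MARK.
as_cp1252 = raw.decode("windows-1252")


def to_ncr(text):
    """Hypothetical helper: replace each non-ASCII character with a
    decimal numeric character reference, leaving ASCII untouched."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(hex(ord(as_latin1)))        # code point when read as ISO-8859-1
print(hex(ord(as_cp1252)))        # code point when read as Windows-1252
print(to_ncr(as_cp1252))          # numeric character reference form
print(as_cp1252.encode("utf-8"))  # bytes to store if converting to UTF-8
```

Running to_ncr over text that was decoded as Windows-1252 corresponds to the second solution; re-encoding the decoded text as UTF-8 (and declaring charset=UTF-8) corresponds to the third.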
Received on Wednesday, 9 February 2005 05:57:41 UTC