- From: Andreas Prilop <prilop2008@trashmail.net>
- Date: Fri, 2 May 2008 16:09:31 +0200 (MEST)
- To: www-validator@w3.org
On Thu, 1 May 2008, Jukka K. Korpela wrote:

> This is a good reason not to assume ISO-8859-1 in a validator,
> because it leads to pointless error messages about data characters.

In theory - yes. But not in practice for the W3C validator!
That's the reason I have started this thread. Is this still unclear?

With UTF-8 or Windows-1252 assumed, the W3C validator simply gives up
and does nothing
  "Sorry! This document can not be checked."
when it finds some byte (or byte sequence) that it cannot interpret
as Windows-1252 or UTF-8.

http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm
http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm;charset=windows-1252

With ISO-8859-1 assumed, it does check and it does give a helpful
error report.

http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm;charset=iso-8859-1
  "This page is not Valid HTML 4.01 Strict!"
  "Result: Failed validation, 2 Errors"

The W3C validator just reports "non SGML character number ...",
which is still better than to sit there and to do nothing.

http://www.unics.uni-hannover.de/nhtcapri/test.htm
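
The difference between the three assumed encodings can be illustrated with a small sketch. This is not the validator's code: it uses Python's standard codecs as a stand-in, and both the helper name `try_decode` and the sample bytes are made up for illustration (they are not the contents of test.htm). It only shows that a strict UTF-8 or Windows-1252 decoder can reject a byte sequence outright, while ISO-8859-1 assigns a code point to every byte, so processing can continue and problems can be reported per character.

```python
# Hypothetical helper for illustration; not part of the W3C validator.
def try_decode(data: bytes, encoding: str) -> str:
    """Return "ok" if data decodes strictly, else where decoding fails."""
    try:
        data.decode(encoding)
        return "ok"
    except UnicodeDecodeError as exc:
        return f"fails at byte offset {exc.start}"


# Made-up sample: 0xE9 is not a valid UTF-8 sequence on its own here,
# and 0x81 is unassigned in Python's Windows-1252 (cp1252) table.
sample = b"<p>caf\xe9 \x81</p>"

results = {
    enc: try_decode(sample, enc)
    for enc in ("utf-8", "windows-1252", "iso-8859-1")
}

for enc, outcome in results.items():
    print(f"{enc}: {outcome}")
```

With ISO-8859-1, Python maps byte 0x81 to the control code point U+0081 instead of failing. That is presumably the kind of character the validator then flags as a "non SGML character number ...", rather than refusing to check the document at all.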
Received on Friday, 2 May 2008 14:10:18 UTC