On Thu, 1 May 2008, Jukka K. Korpela wrote:

> This is a good reason not to assume ISO-8859-1 in a validator,
> because it leads to pointless error messages about data characters.

In theory - yes. But not in practice for the W3C validator!
That's the reason I have started this thread. Is this still unclear?

With UTF-8 or Windows-1252 assumed, the W3C validator simply gives up
and does nothing
  "Sorry! This document can not be checked."
when it finds some byte (or byte sequence) that it cannot interpret
as Windows-1252 or UTF-8.

  http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm
  http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm;charset=windows-1252

With ISO-8859-1 assumed, it does check and it does give a helpful
error report.

  http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm;charset=iso-8859-1
  "This page is not Valid HTML 4.01 Strict!"
  "Result: Failed validation, 2 Errors"

The W3C validator just reports "non SGML character number ...",
which is still better than to sit there and to do nothing.

  http://www.unics.uni-hannover.de/nhtcapri/test.htm

Received on Friday, 2 May 2008 14:10:18 UTC
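
The difference described in the message can be reproduced outside the validator. The following is only a rough sketch in Python, not the validator's own code. The sample bytes are an assumption (the actual content of test.htm is not shown here), and the helper names try_decode and c1_controls are hypothetical. It illustrates that a strict UTF-8 or Windows-1252 decoder can reject a byte outright, while ISO-8859-1 maps every byte to some character, so a later stage can still report the offending character number.

```python
# Assumed sample: 0xE9 followed by a space is not valid UTF-8, and 0x81 is
# undefined in Python's Windows-1252 (cp1252) codec.
SAMPLE = b"<p>caf\xe9 \x81</p>"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


def c1_controls(text):
    """List (index, code point) pairs for characters in the range 128-159.

    Simplified check: HTML 4.01's SGML declaration marks this range as
    unused, which is what a "non SGML character number" report refers to.
    """
    return [(i, ord(ch)) for i, ch in enumerate(text) if 0x80 <= ord(ch) <= 0x9F]


for name in ("utf-8", "cp1252", "iso-8859-1"):
    text = try_decode(SAMPLE, name)
    if text is None:
        print(name, "-> cannot decode, nothing left to check")
    else:
        print(name, "-> decoded, suspicious characters:", c1_controls(text))
```

With these assumed bytes, only the ISO-8859-1 pass yields text that can be inspected further, which corresponds to the "does check and gives an error report" behaviour described above.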