Re: Fallback to UTF-8

From: olivier Thereaux <ot@w3.org>
Date: Mon, 28 Apr 2008 12:47:34 +0900
Message-Id: <598D94B2-4F13-45F4-B197-E0F03D6EFA93@w3.org>
To: W3C Validator Community <www-validator@w3.org>

On 24-Apr-08, at 11:38 PM, Andreas Prilop wrote:
> Which kind of patch do you mean?
> I just ask to change the default from UTF-8 to ISO-8859-1.

In a few years of developing various software projects, I have learned to
be very wary of the word "just". Any occurrence of a suggestion that
software "just has to do this or that" usually means a lot of
complexity and difficulty for whoever actually has to implement it. I
suggest banning this term from your RFEs or bug reports.

That said, as the long thread has shown, there are a number of  
candidates for default:
* utf-8, because it is the forward-looking encoding and is appropriate
for most international content. It is also what authors are strongly
encouraged to use today, and as such, the validator is a tool that
should favor this practice.
* windows-1252, which appears to be a safe default for a lot of  
content on the web today, and which the HTML5 specification suggests  
as a fallback for UAs trying to parse legacy content
* iso-8859-1, not because it's a proper encoding for most languages,  
but because it has (unfortunately) been set as default in a number of  

We can either argue forever about which default is the right one (as
parts of this thread, and many a sterile discussion before it, have
shown, alas) or have implementations try all three. The latter is
obviously not very efficient, but it should hopefully be helpful for
document authors.
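
To make the "try the three" idea concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions, not the
validator's actual patch: the function name decode_with_fallback, the
FALLBACK_ENCODINGS tuple and the candidate order are all hypothetical.

```python
from typing import Optional, Tuple

# Assumption: candidates are tried in the order they are listed in the
# message. iso-8859-1 comes last because it maps every byte to a
# character, so it can never fail and would mask the other two.
FALLBACK_ENCODINGS = ("utf-8", "windows-1252", "iso-8859-1")


def decode_with_fallback(data: bytes,
                         declared: Optional[str] = None) -> Tuple[str, str]:
    """Return (encoding_used, decoded_text) for a document body.

    A declared encoding, if any, is tried first; the fallback list is
    only consulted when it is missing, unknown, or fails to decode.
    """
    candidates = ([declared] if declared else []) + list(FALLBACK_ENCODINGS)
    for name in candidates:
        try:
            return name, data.decode(name)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: bytes are invalid in this encoding.
            # LookupError: the encoding name itself is unknown.
            continue
    raise ValueError("no candidate encoding could decode the document")


if __name__ == "__main__":
    used, text = decode_with_fallback(b"caf\xe9")
    print(used, text)
```

The cost mentioned above is visible here: in the worst case the same
bytes are decoded three times before a usable result is found. A real
tool would presumably also tell the author which fallback was used.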

I think (as I already stated in the past) that the latter may be a  
fair solution. I have therefore tried implementing it in the  
development validator.

The patch is at:

And can be tested, as usual, at:

Any constructive feedback would be welcome.

Received on Monday, 28 April 2008 03:48:07 UTC
