- From: olivier Thereaux <ot@w3.org>
- Date: Mon, 28 Apr 2008 10:43:01 +0900
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: W3C Validator Community <www-validator@w3.org>
Hi Henri, all.

Thanks for all your thoughts on this thread. I am disappointed by some of the name calling, but overall I believe this has been an interesting and informative discussion.

On 24-Apr-08, at 5:10 PM, Henri Sivonen wrote:
> More precisely for text/html:
> http://www.w3.org/html/wg/html5/#determining
>
> Step 7. defines Windows-1252 as the general default which can be
> different in non-Western browser installations. Global online apps
> like validators should probably stick to Windows-1252.

Henri, this is an interesting and important statement in the HTML5 spec. How does the group feel about the inconsistency this created between the spec and defaults stated by other specifications, such as http://www.ietf.org/rfc/rfc2854.txt

“Section 3.7.1, defines that "media subtypes of the 'text' type are defined to have a default charset value of 'ISO-8859-1'".”
(ditto RFC 2616)

This is the inconsistency at the core of the issue, isn't it. I heard that the group working on HTTPbis had considered changing the default, but had not managed to reach consensus yet. Is the HTML WG considering updating rfc2854?

> (The mention of UTF-8 there is a token gesture; the Web is a legacy
> system, so UTF-8 for non-legacy does not apply.)

This sounds rather like a subjective statement, which I would be wary of. Of course, the HTML5 spec is here to fix things in a backward-compatible way, but specifications are forward looking, not just back - and checkers are here in part to help move the landscape futureward. Or, at least, so am I told all the time by the likes of timbl :).

I also note in the HTML5 specification:
“Authors are encouraged to use UTF-8. Conformance checkers may advise against authors using legacy encodings.”

So is this a question of a future-looking default (utf8) versus conservative default (win1252)? If so, I would argue that a checker should favor utf8 first, and fallback to win1252 second, no?

Thanks.
-- 
olivier
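
For illustration only, the "utf8 first, win1252 second" order suggested above might look like the following minimal Python sketch. It assumes the checker has found no charset declaration at all (no HTTP header, no BOM, no meta element); the function name `decode_with_fallback` is hypothetical and does not come from any existing validator.

```python
from typing import Tuple


def decode_with_fallback(data: bytes) -> Tuple[str, str]:
    """Decode undeclared bytes, trying UTF-8 before Windows-1252.

    Hypothetical helper: returns the decoded text together with the
    name of the encoding that was actually used.
    """
    try:
        # Strict UTF-8 decoding rejects most legacy single-byte content,
        # so a successful decode is a reasonably strong signal.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Conservative fallback. A few byte values are unassigned in
        # Python's Windows-1252 codec, so replace them instead of failing.
        return data.decode("windows-1252", errors="replace"), "windows-1252"


if __name__ == "__main__":
    print(decode_with_fallback("café".encode("utf-8")))
    print(decode_with_fallback(b"caf\xe9"))
```

A checker following this order could also warn whenever the second branch is taken, in line with the spec text allowing conformance checkers to advise against legacy encodings.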
Received on Monday, 28 April 2008 01:43:38 UTC