- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Mon, 5 Dec 2011 19:55:43 +0100
>> (And HTML5 defines it the same.)
>
> No. As far as I understand, HTML5 defines US-ASCII to be the default and
> requires that any other encoding is explicitly declared. I do like this
> approach.

We are here discussing the default *user agent behaviour* - we are not specifically discussing how web pages should be authored. For user agents, please be aware that HTML5 maintains a table of 'Suggested default encoding':

http://dev.w3.org/html5/spec/parsing.html#determining-the-character-encoding

When you say 'requires': Of course, HTML5 recommends that you declare the encoding (via HTTP/higher protocol, via the BOM 'sideshow' or via <meta charset=UTF-8>). I just now also discovered that Validator.nu issues an error message if it does not find any of those *and* the document contains non-ASCII. (I don't know, however, whether this error message is just something Henri added at his own discretion - it would be nice to have it literally in the spec too.)

(The problem is of course that many English pages expect the whole "Unicode alphabet" even if they only contain US-ASCII from the start.)

HTML5 says that validators *may* issue a warning if UTF-8 is *not* the encoding. But so far, validator.nu has not picked that up.

> We should also lobby for authoring tools (as recommended by HTML5) to
> default their output to UTF-8 and make sure the encoding is declared.

HTML5 already says: "Authoring tools should default to using UTF-8 for newly-created documents. [RFC3629]"

http://dev.w3.org/html5/spec/semantics.html#charset

> As
> so many pages, supposedly (I have not researched this), use the incorrect
> encoding, it makes no sense to try to clean this mess by messing with
> existing defaults. It may fix some pages and break others. Browsers have
> the ability to override an incorrect encoding and this a reasonable
> workaround.

Do you use an English locale computer? If you do, without being a native English speaker, then you are some kind of geek ... Why can't you work around the troubles, as you are used to anyway?

Starting a switch to UTF-8 as the default UA encoding for English locale users should *only* affect how English locale users experience pages in languages which *both* need non-ASCII *and* historically have been using Windows-1252 as the default encoding *and* which additionally do not include any encoding declaration.

--
Leif Halvard Silli
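
To make the scope of the last paragraph concrete, here is a minimal Python sketch of which pages a changed UA default could touch. It is not the HTML5 encoding-sniffing algorithm: the helper names (declared_encoding, affected_by_default_switch, decode_with_default), the 1024-byte scan window and the order in which declarations are checked are all assumptions made for illustration.

```python
import codecs
import re

# Hypothetical, simplified <meta> scan; real UAs use a dedicated prescan.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE
)


def declared_encoding(body, http_charset=None):
    """Return the encoding a page declares, or None if it declares nothing.

    Checks, in an assumed order: HTTP charset, UTF-8 BOM, <meta charset>.
    """
    if http_charset:
        return http_charset.lower()
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    match = META_CHARSET.search(body[:1024])  # assumed scan window
    if match:
        return match.group(1).decode("ascii").lower()
    return None


def affected_by_default_switch(body, http_charset=None):
    """True only for undeclared pages that contain non-ASCII bytes."""
    if declared_encoding(body, http_charset) is not None:
        return False  # a declaration overrides any UA default
    return any(byte > 0x7F for byte in body)


def decode_with_default(body, default):
    """Decode an undeclared page with the UA default, replacing bad bytes."""
    return body.decode(default, errors="replace")


if __name__ == "__main__":
    legacy = b"<p>caf\xe9</p>"  # Windows-1252 bytes, no declaration
    print(affected_by_default_switch(legacy))
    print(decode_with_default(legacy, "windows-1252"))
    print(decode_with_default(legacy, "utf-8"))
```

Under these assumptions, an undeclared pure-ASCII page and any page with a declaration come out the same under either default. Only the undeclared legacy page with non-ASCII bytes changes.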
Received on Monday, 5 December 2011 10:55:43 UTC