- From: Leif Halvard Silli <lhs@malform.no>
- Date: Fri, 12 Jun 2009 04:48:16 +0200
- To: Ian Hickson <ian@hixie.ch>
- CC: public-html@w3.org
Ian Hickson On 09-06-12 01.16:
> On Wed, 3 Jun 2009, Henri Sivonen wrote:
>> *Of course* authoring tools
>> should use UTF-8 *and declare it* for any new documents.
>>
>> HTML5 already says: "Authors are encouraged to use UTF-8."
>> http://www.whatwg.org/specs/web-apps/current-work/#charset
>
> I could make this stronger if people think that would be helpful.

That would be a good thing. It should say that conforming authoring
tools (please do not only say 'authors' in this case) MUST _default_
to using UTF-8.

The developers of the (partly) W3-sponsored Amaya editor claim that
there are reasons for having ISO-8859-1 as the default charset.[1]
(The concern seems to be that some web servers set ISO-8859-1 as the
default for documents with the .html extension.)

I have also previously (2007) mentioned on this list that authoring
tools, including Web browsers, should support character encoding
suffixes (file.html.utf8), as e.g. Apache does. I think that, when
reading file:/// URLs, user agents could use these charset
extensions to mimic the charset header of web servers. Authors and
authoring tools would then get a simple way to experience how the
HTTP headers take precedence over what the META element specifies.

So, if a file has the name "file.html.utf8", then UAs should, when
reading that file via the file URL protocol, give precedence to the
encoding expressed by the file suffix.

Thus, I would suggest that HTML 5

a) specifies the file suffixes for all the encodings that it
   endorses (building on those that Apache uses by default),
b) recommends that Web browsers recognize these suffixes when
   reading files via file://

[1] http://lists.w3.org/Archives/Public/www-amaya/2007OctDec/0169
-- 
leif halvard silli
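
For illustration only, the precedence suggested above (file suffix first, then META) could be sketched roughly as follows. This is a minimal sketch under assumptions, not anything a browser or spec defines: the function names, the regular expression, and the suffix table are hypothetical, and only the ".utf8" suffix is taken from the message itself.

```python
import re
from pathlib import PurePosixPath

# Hypothetical suffix table. Only ".utf8" is named in the message;
# ".latin1" is added purely as a second illustrative entry.
SUFFIX_CHARSETS = {
    ".utf8": "utf-8",
    ".latin1": "iso-8859-1",
}

# Rough pattern for a charset declared in a META element (an assumption,
# far simpler than a real HTML prescan).
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def charset_from_suffix(filename):
    """Return the charset implied by the last file suffix, if any."""
    suffix = PurePosixPath(filename).suffix.lower()
    return SUFFIX_CHARSETS.get(suffix)


def charset_from_meta(head_bytes):
    """Return the charset declared in a META element, if any."""
    match = META_CHARSET.search(head_bytes)
    return match.group(1).decode("ascii").lower() if match else None


def pick_charset(filename, head_bytes):
    """For file:// reads, let the suffix win over META,
    the way an HTTP charset header would."""
    return charset_from_suffix(filename) or charset_from_meta(head_bytes)
```

With this sketch, a file named "file.html.utf8" whose META element declares ISO-8859-1 would still be read as UTF-8, while plain "file.html" would fall back to whatever META says.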
Received on Friday, 12 June 2009 02:48:55 UTC