- From: Robert Burns <rob@robburns.com>
- Date: Tue, 10 Jul 2007 16:55:42 -0500
- To: HTML Working Group <public-html@w3.org>
On Jul 10, 2007, at 7:21 AM, Simon Pieters wrote:

>>> Perhaps, but it isn't compatible with existing UAs.
>>
>> Do we already have some tests on this?
>
> We do now... ;-)
>
> http://simon.html5.org/test/html/parsing/encoding/001.htm

I'm not sure what that was supposed to test. It would be helpful if it said something like: "There should be a smiley face below if your browser is using <meta charset=""> to determine encoding over <html charset="">." I tried a slight variation on your test:

<!doctype html>
<html charset=utf-8 >
<head>
<meta charset=iso-8859-1 >
</head>
<body>
<p>There should be a white smiley face below if your browser supports @charset in the root html element:
<p>☺
</body>
</html>

When saved as UTF-8 with no BOM, Safari displays it as UTF-8. My default for Safari is Latin1. So that's one browser that is compatible with this approach. That's only one browser tested, but that would just mean we already have one forward-looking, HTML5-friendly UA. In any event, the test needs to be done with non-UTF encodings. Otherwise I think browsers might be too smart in detecting UTF encodings.

> Even if it didn't complicate implementation, it still isn't
> compatible with current UAs, which is the main drawback.

I'm here because I'm mostly interested in the forward-looking portion of HTML5. If others are not so interested in that, then I understand. However, there are portions of this draft that also are not "compatible with current UAs". So pointing out that drawback is simply pointing out the obvious. Some portions of HTML5 are compatible with existing UAs. Some portions of HTML5 are not. This proposal falls in the latter category. That is also its advantage as I see it. It still follows the criteria of other portions of the draft in that it does not break things.
For some period of time, authors would need to make sure their character encodings were consistent on the <meta> and the <html> attributes, but there would come a time in the future (a time when it was interoperable to use <video>, <audio>, <canvas>, etc.) when an author could simply place a charset attribute on the root element and be done with it. Ideally, we should tell authors to use a BOM-compatible encoding (I'm curious how well that's supported now) and to use a different encoding only if the BOM-compatible ones don't meet their needs (not sure what those needs would be). However, as long as authors feel the need to use other encodings, we should probably try to make it as simple as possible.

Take care,
Rob
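
P.S. A rough sketch of how the non-UTF variation of the test might be generated, in Python. Everything here is an assumption for illustration: the function name build_charset_test is hypothetical, windows-1251 and the Cyrillic letter ZHE (U+0416) are just one convenient pairing, and the charset attribute on <html> is the proposal under discussion, not something UAs are known to support. The byte 0xC6 is ZHE in windows-1251 and "Æ" in ISO-8859-1, so the character that shows up would indicate which declaration the browser honoured.

```python
def build_charset_test(root_charset, meta_charset, probe, file_codec):
    """Return the bytes of a test page whose <html> and <meta> charsets disagree.

    root_charset / meta_charset are the labels written into the markup;
    file_codec is the Python codec actually used to encode the file.
    """
    page = (
        "<!doctype html>\n"
        f"<html charset={root_charset}>\n"
        "<head>\n"
        f"<meta charset={meta_charset}>\n"
        "</head>\n"
        "<body>\n"
        "<p>The character below shows which declaration the browser used:\n"
        f"<p>{probe}\n"
        "</body>\n"
        "</html>\n"
    )
    # Encoding with a legacy single-byte codec means no BOM is emitted,
    # so the browser cannot fall back on UTF detection.
    return page.encode(file_codec)


# File is really windows-1251; the root element says so, the <meta> disagrees.
data = build_charset_test("windows-1251", "iso-8859-1", "\u0416", "cp1251")

# What a UA would render under each interpretation of the same bytes.
as_root = data.decode("cp1251")    # root-element charset honoured
as_meta = data.decode("latin-1")   # <meta> charset honoured
```

Writing data to a file and opening it in each browser would then show either the Cyrillic letter (root element wins) or "Æ" (<meta> wins). A browser's own default encoding could still muddy the result, so the default should be set to something other than either declared encoding.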
Received on Tuesday, 10 July 2007 21:55:53 UTC