- From: <bugzilla@jessica.w3.org>
- Date: Fri, 22 Feb 2013 13:18:38 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=21088

Bug ID: 21088
Summary: Spec repeats potential Gecko bugs about encoding defaults as the truth
Classification: Unclassified
Product: HTML WG
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: CR HTML5 spec
Assignee: robin@w3.org
Reporter: hsivonen@iki.fi
QA Contact: public-html-bugzilla@w3.org
CC: mike@w3.org
Depends on: 21087

See http://www.w3.org/html/wg/drafts/html/CR/syntax.html#determining-the-character-encoding

+++ This bug was initially created as a clone of Bug #21087 +++

The spec includes a table of locales and encoding defaults for those locales. The data for that table has been taken from Gecko 1.9.1 source code. It appears that the data hasn't been properly compared with the behavior of IE, which might have more significant market share in some of the locales involved.

In particular, it looks suspicious that the Simplified Chinese default is GB18030 rather than GBK, and every entry that suggests UTF-8 as the encoding looks suspicious. For example, chances are that users of the Welsh UI will be exposed to the same legacy content as users of the UK English UI. Also, Windows has a legacy code page specifically for Vietnamese, so it seems incredible that legacy content encountered by users of the Vietnamese locale would more often be UTF-8 than that code page.

In order to avoid spreading bugs, please remove all the entries that haven't been cross-checked to agree with the defaults of a version of Internet Explorer that predates the inclusion of the table in the spec. If such cross-checking cannot be performed in a timely manner, please at least remove all the entries that claim that the default should be UTF-8 or GB18030 for the time being.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
Received on Friday, 22 February 2013 13:18:44 UTC