- From: <bugzilla@jessica.w3.org>
- Date: Fri, 22 Feb 2013 13:18:38 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=21088
Bug ID: 21088
Summary: Spec repeats potential Gecko bugs about encoding
defaults as the truth
Classification: Unclassified
Product: HTML WG
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: CR HTML5 spec
Assignee: robin@w3.org
Reporter: hsivonen@iki.fi
QA Contact: public-html-bugzilla@w3.org
CC: mike@w3.org
Depends on: 21087
See
http://www.w3.org/html/wg/drafts/html/CR/syntax.html#determining-the-character-encoding
+++ This bug was initially created as a clone of Bug #21087 +++
The spec includes a table of locales and encoding defaults for those locales.
The data for that table has been taken from Gecko 1.9.1 source code. It appears
that the data hasn't been properly compared with the behavior of IE, which
might have more significant market share in some of the locales involved. In
particular, the Simplified Chinese default of GB18030 rather than GBK looks
suspicious, as does every entry that suggests UTF-8 as the encoding.
For example, chances are that users of the Welsh UI will be exposed to the same
legacy content as users of the UK English UI. Also, Windows has a legacy code
page specifically for Vietnamese, so it seems incredible that legacy content
encountered by users of the Vietnamese locale would more often be UTF-8 than
that code page.
In order to avoid spreading bugs, please remove all the entries that haven't
been cross-checked to agree with the defaults of a version of Internet Explorer
that predates the inclusion of the table in the spec. If such cross-checking
cannot be performed in a timely manner, please at least remove, for the time
being, all the entries that claim that the default should be UTF-8 or GB18030.
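
As a rough sketch of the interim change requested above, the following Python
snippet filters a locale-to-default-encoding table so that entries defaulting
to UTF-8 or GB18030 are dropped. The table contents are an illustrative subset
inferred from this report, not the spec's actual data; the "en-GB" entry, the
windows-1252 catch-all, and the function names are assumptions made for the
example.

```python
# Hypothetical subset of a locale -> default-encoding table.
# Entries are inferred from the bug report, not copied from the spec.
SPEC_DEFAULTS = {
    "zh-CN": "gb18030",       # report suggests GBK may match legacy content better
    "cy": "utf-8",            # Welsh; report doubts this default
    "vi": "utf-8",            # Vietnamese; Windows has a dedicated legacy code page
    "en-GB": "windows-1252",  # assumed entry, for comparison only
}

# Defaults the report asks to remove until they have been cross-checked.
SUSPECT_ENCODINGS = {"utf-8", "gb18030"}


def drop_suspect_entries(table, suspect=SUSPECT_ENCODINGS):
    """Return a copy of the table without entries whose default is suspect."""
    return {
        locale: encoding
        for locale, encoding in table.items()
        if encoding.lower() not in suspect
    }


def fallback_encoding(locale, table, catch_all="windows-1252"):
    """Look up a locale's default; unlisted locales get the assumed catch-all."""
    return table.get(locale, catch_all)


cleaned = drop_suspect_entries(SPEC_DEFAULTS)
```

With the suspect entries removed, a lookup for "cy" or "vi" falls through to
the catch-all, which is the behavior the report expects for Welsh users who
see the same legacy content as UK English users.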
Received on Friday, 22 February 2013 13:18:44 UTC