[whatwg] A comment to character encoding declaration

Hi,

This is a comment on the "character encoding declaration"
section of the HTML 5 spec:

http://www.w3.org/html/wg/html5/#character1

As CJK information processing developed, many text encodings
ended up being strict subsets of later ones. For example,
GB2312 is a subset of GBK, and GBK is a subset of GB18030.
For compatibility, a lot of web pages use a character
encoding declaration like this:

<meta http-equiv="Content-Type" content="text/html; charset=gb2312">

in their header, yet they may contain characters that are in
GBK but not in GB2312. So I think the spec could suggest that
clients simply treat encodings like these as their largest
superset, for instance, treat GB2312 as GB18030.
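
To illustrate the idea, here is a minimal sketch in Python. It is not
taken from any browser; the SUPERSET table and the function names
effective_encoding and decode_page are hypothetical, and only the two
mappings mentioned above are included. It relies on Python's built-in
gb2312, gbk and gb18030 codecs.

```python
# Hypothetical mapping from a declared encoding to its largest superset.
# Only the GB family from this message is listed; it is an assumption,
# not a table from the HTML 5 spec.
SUPERSET = {
    "gb2312": "gb18030",
    "gbk": "gb18030",
}


def effective_encoding(declared: str) -> str:
    """Return the encoding a client would actually decode with."""
    label = declared.strip().lower()
    return SUPERSET.get(label, label)


def decode_page(raw: bytes, declared: str) -> str:
    """Decode page bytes, widening the declared encoding to its superset."""
    return raw.decode(effective_encoding(declared))


# A traditional character that exists in GBK but not in GB2312.
raw = "龍".encode("gbk")

# Decoding strictly as GB2312 fails for such a page...
try:
    raw.decode("gb2312")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...while widening the declared "gb2312" to GB18030 succeeds.
text = decode_page(raw, "GB2312")
```

With this approach a page declared as gb2312 still decodes correctly
when it contains GBK-only characters, and declarations outside the
table are passed through unchanged.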

BTW, browsers like Firefox already seem to handle such cases
well, but Safari/WebKit does not seem to.

Regards,
Jiang

Received on Monday, 3 March 2008 07:11:02 UTC