[Bug 11423] Character sets not registered with IANA

From: <bugzilla@jessica.w3.org>
Date: Mon, 29 Nov 2010 16:22:31 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1PN6Up-0005AI-QV@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11423

--- Comment #7 from Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com> 2010-11-29 16:22:31 UTC ---
(In reply to comment #5)
> If I need to know about a character set, I look there first,
> and so will pretty much every implementer.

Anne's provided significant evidence to the contrary.

> > If the spec simply defined the preferred name of Windows-949 as
> > (case-insensitive) "Windows-949", could we close this bug?
> 
> Nope.  I would be happy with (a) windows-949 being registered with IANA, or (b)
> windows-949 not being mentioned at all.  An additional alternative, which is
> not at all preferable, is a note in the text to the effect of "The HTML5
> Working Group has deliberately chosen to refer to and favor over other,
> better-specified alternatives (e.g. EUC-KR) the character set 'windows-949',
> even though it is not registered properly with IANA."

This sounds greatly preferable to me, as we need windows-949 for legacy
content, key implementers like Opera don't care about its registration, and
nobody who does care about its registration is keen to register it.

As far as I can tell, we've already got a note in the text to the effect of
your note:

"The requirement to treat certain encodings as other encodings according to the
table above is a willful violation of the W3C Character Model specification,
motivated by a desire for compatibility with legacy content. [CHARMOD]"

http://dev.w3.org/html5/spec/parsing.html#determining-the-character-encoding
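
To illustrate, the override requirement quoted above amounts to a lookup applied to the declared label before a decoder is chosen. The following is a minimal sketch, not the spec's algorithm: the names `ENCODING_OVERRIDES` and `effective_encoding` are hypothetical, and the two entries are an illustrative subset assumed for this example (the table in the spec is authoritative).

```python
# Illustrative subset only (assumed entries); consult the spec's table
# for the full, authoritative list of overrides.
ENCODING_OVERRIDES = {
    "euc-kr": "windows-949",
    "iso-8859-1": "windows-1252",
}


def effective_encoding(declared: str) -> str:
    """Return the encoding to actually decode with for a declared label.

    Labels are compared case-insensitively; labels without an override
    are returned unchanged (lowercased).
    """
    label = declared.strip().lower()
    return ENCODING_OVERRIDES.get(label, label)


if __name__ == "__main__":
    for declared in ("EUC-KR", "ISO-8859-1", "UTF-8"):
        print(declared, "->", effective_encoding(declared))
```

An implementation following the override decodes content declared as EUC-KR with the windows-949 decoder regardless of whether the latter name is registered with IANA, which is why the question of registration has little effect on what implementers do.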

The referenced document:

http://www.w3.org/TR/charmod/

describes how content should be interpreted according to its declared IANA
character set.

What's the practical difference between your suggested additional alternative
and the text we already have?

By which I mean: what would your text cause implementers to do differently, and
why?

Received on Monday, 29 November 2010 16:22:33 GMT
