[Bug 15142] Define "UNICODE" as a defacto alias for "UTF-16"

From: <bugzilla@jessica.w3.org>
Date: Sun, 11 Dec 2011 22:06:24 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1RZrXM-00084B-UD@jessica.w3.org>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15142

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xn--mlform-iua@xn--mlform-i
                   |                            |ua.no

--- Comment #8 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 2011-12-11 22:06:23 UTC ---
(In reply to comment #2)
> (In reply to comment #1)
> > this proposal should be rejected for a variety of reasons:
> 
> I agree with this conclusion, but...

Regarding "this proposal": this Bugzilla report could be said to include 3
proposals:

(1) The main proposal is to require the HTML5 parser, when it sees
charset="UNICODE" (upper- or lowercase), to replace it with charset="UTF-16"
(which in turn gets replaced with "UTF-8" if it occurs inside an HTML
document). This is in order to a) be compatible with "the Web", and b) support
the shift to Unicode in general and UTF-8 in particular, by c) making sure that
content that is intended to be Unicode is treated as Unicode by all HTML5 user
agents.

(2) Secondly, it suggests that charset="UNICODE" should be non-conforming in
HTML5 documents - authors should not be allowed to use it. This in fact goes
without saying, as it is even forbidden, per HTML5, to use <meta
charset="UTF-16"> in an HTML document.

(3) Finally, I raised the question of whether the alias should be formally
registered. However, I suppose that even if it became formally registered, the
recommended name of this encoding would remain "UTF-16". For instance,
Validator.nu whines if you use <meta charset="ANSI_X3.4-1968"> instead of
<meta charset="US-ASCII">, as only the latter is a recommended encoding name. I
would expect the same behaviour for <meta charset="UNICODE">, regardless of
whether it became registered.
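
To make the label handling in (1) concrete, here is a minimal sketch in Python.
It is only an illustration of the proposed behaviour, not text from the spec or
from any parser: the table PROPOSED_ALIASES and the function
resolve_meta_charset are hypothetical names, and treating "utf-16le" and
"utf-16be" the same way as "utf-16" is my assumption.

```python
# Hypothetical alias table. Only the "unicode" entry reflects proposal (1);
# it is not a registered alias of UTF-16.
PROPOSED_ALIASES = {"unicode": "utf-16"}

# Assumption: the LE/BE variants are handled like plain "utf-16".
UTF16_LABELS = ("utf-16", "utf-16le", "utf-16be")


def resolve_meta_charset(label: str) -> str:
    """Return the encoding name a parser would use for <meta charset=label>."""
    # Labels are matched case-insensitively ("UNICODE" and "unicode" alike).
    name = label.strip().lower()

    # Step 1 (proposed): map the de facto alias to the name it stands for.
    name = PROPOSED_ALIASES.get(name, name)

    # Step 2 (as described above): a UTF-16 label found inside the HTML
    # document itself gets replaced with UTF-8.
    if name in UTF16_LABELS:
        name = "utf-8"

    return name


if __name__ == "__main__":
    for label in ("UNICODE", "unicode", "UTF-16", "ISO-8859-1"):
        print(label, "->", resolve_meta_charset(label))
```

Under these assumptions "UNICODE" and "UTF-16" both end up as "utf-8", while
unrelated labels pass through unchanged. Nothing here says anything about
conformance: per (2), the label would still be an error for authors to use.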

QUESTIONS: Which of these 3 proposals are you disagreeing with? And what are
the pros and cons of registering? Julian, is it only that this is "the wrong
place" that is the problem for you? Glenn, why has the Unicode Consortium
been, quietly, looking at the spread of "UNICODE" as an unofficial alias for
"UTF-16"?

W.r.t. registering, here are some thoughts: One reason to *not* register
"UNICODE" is the fact that it isn't supposed to be conforming anyway. However,
this doesn't seem a particularly strong reason, as it would most certainly
remain non-conforming to use it even if it were registered.

I am very willing to send an e-mail to the right authority to ask them to
consider whether "UNICODE" should become an alias (not recommended, but still
an alias) for "UTF-16". It is so far unclear to me whom to contact, though.

Received on Sunday, 11 December 2011 22:06:26 UTC
