[Bug 15142] Define "UNICODE" as a defacto alias for "UTF-16" from bugzilla@jessica.w3.org on 2011-12-11 (public-html-bugzilla@w3.org from December 2011)

From: <bugzilla@jessica.w3.org>
Date: Sun, 11 Dec 2011 16:18:28 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1RZm6e-0007aR-RP@jessica.w3.org>

https://www.w3.org/Bugs/Public/show_bug.cgi?id=15142

--- Comment #4 from Julian Reschke <julian.reschke@gmx.de> 2011-12-11 16:18:28 UTC ---
(In reply to comment #3)
> a historical circumstance... in the 1992-93 time frame, ISO SC2/WG2 first
> proposed UTF-1 as a transformation encoding of ISO/IEC 10646 UCS-4;  although
> UTF-1 never caught on, the more efficient alternative, UTF-8, came out of work
> started at X/Open and concluded at Bell Labs in Plan 9;
> 
> later, the Unicode Standard incorporated the normative definition of UTF-8 into
> The Unicode Standard;
> the current IETF RFC 3629 (STD 63) [1] refers to the Unicode Standard for the
> formal definition of UTF-8:
> 
> 3.  UTF-8 definition
> 
>    UTF-8 is defined by the Unicode Standard [UNICODE].  Descriptions and
>    formulae can also be found in Annex D of ISO/IEC 10646-1 [ISO.10646]
> 
> [1] http://tools.ietf.org/html/rfc3629#section-3
> 
> glenn

Point taken, but I'm not convinced. For all practical purposes, UTF-8 is defined
by RFC 3629. That's where people look. Also, RFC 3629 doesn't even link to
another definition. So where is the definition by the Unicode Consortium, and
why isn't it referenced?

Also, a more general point: I would hope that all future definitions of
character encoding schemes in the IANA registry are based on Unicode code
points, even those which cannot represent all code points. The procedure for
IANA charset registrations is in IETF BCP 19, which doesn't even mention
Unicode, as far as I can tell.

Received on Sunday, 11 December 2011 16:18:31 UTC