- From: <bugzilla@jessica.w3.org>
- Date: Sun, 11 Dec 2011 16:53:18 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=15142

--- Comment #7 from Julian Reschke <julian.reschke@gmx.de> 2011-12-11 16:53:18 UTC ---
(In reply to comment #5)
> > Point taken, but not convinced. For all practical purposes, UTF-8 is defined by
> > RFC 3629. That's where people look. Also, RFC 3629 doesn't even link to another
> > definition. So where is the definition by the Unicode consortium, and why isn't
> > it referenced?
>
> Did you read the first paragraph in RFC 3629 Section 3 [1] (which I quoted
> above)?

Yes, "Unicode" is mentioned, but there's no reference that takes me to the
actual definition.

In the meantime I noticed that UTF-8 is indeed defined in
<http://unicode.org/versions/Unicode5.2.0/ch03.pdf>, and I believe it would be
good to add an erratum to RFC 3629 pointing out that a revision should
actually *reference* the Unicode definition.

> > Also, a more general point: I would hope that all future definitions of
> > character encoding schemes in the IANA registry are based on the Unicode code
> > points, even those which can not represent all code points. The procedure for
> > IANA charset registrations is in IETF BCP 19, which doesn't even mention
> > Unicode, as far as I can tell.
>
> Different national administrations have different priorities. There will always
> remain character encodings not based on the Unicode Character Set, for legacy
> reasons if no others.
>
> The Unicode Consortium does not maintain a character encoding scheme registry.
> IANA does. However, the Unicode Consortium does own the term "UNICODE", so if
> someone wishes to register this term as a charset value, they need to take it
> up with the Unicode Consortium, and not with the HTML WG. But I would suggest
> they would be wasting their time, since it is extremely unlikely the Unicode
> Consortium would choose to enter such registration (for some of the reasons I
> have cited as well as others).

I agree that HTML is the wrong place to start.
The registry is maintained by IANA, and how to get values into the registries
is defined by an IETF BCP. I don't see a requirement to go through the Unicode
Consortium.

That being said, I do agree that using the string "Unicode" as a character
encoding scheme name is a bad idea. I'm not sure about "ownership" of names,
though: if IANA had to reject any registration for a "charset" name where
somebody claims to "own" the name, the whole process might get very
complicated :-).
Received on Sunday, 11 December 2011 16:53:21 UTC