- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 05 Jun 2002 14:08:47 +0900
- To: jim.melton@acm.org (Jim Melton), www-i18n-comments@w3.org
- Cc: w3c-i18n-ig@w3.org, w3c-xml-query-wg@w3.org
Hello Jim, dear XML Query WG,

We discussed this comment of yours at our teleconference yesterday,
and I was actioned to convey our decision to you.

At 18:39 02/05/31 +0900, Jim Melton wrote:
>This is a last call comment from Jim Melton (jim.melton@acm.org) on
>the Character Model for the World Wide Web 1.0
>(http://www.w3.org/TR/2002/WD-charmod-20020430/).
>
>Semi-structured version of the comment:
>
>Submitted by: Jim Melton (jim.melton@acm.org)
>Submitted on behalf of (maybe empty): W3C XML Query Working Group
>Comment type: editorial
>Chapter/section the comment applies to: 3.2 Digital Encoding of Characters
>The comment will be visible to: public
>Comment title: proprietary charset identifiers
>Comment:
>Section 3.2, "Digital Encoding of Characters", list element 4, contains
>the phrase "... is identified by an IANA charset identifier."
>
>In fact, there are a great many CESes that are identified by charset
>identifiers that are not assigned by IANA at all, but that are "created"
>by proprietary means (e.g., corporations). The Character Model
>specification must not prohibit the use of CESes identified by charset
>identifiers assigned through other means.
>
>To correct this, simply change "...is identified by an IANA charset
>identifier." to "...is identified by a unique identifier, such as an IANA
>charset identifier."

However, working on the details today, I discovered that it may be
better to request a clarification from you first.

You request that Section 3.2 mention identifiers for character
encodings other than those registered by IANA. But Section 3.2 just
mentions the labels as part of the overall model. Details of what
encodings to use or not to use, and what labels to use for them,
are given in Section 3.6.2
(http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-EncodingIdent).
Section 3.6.2 also has a very strong emphasis on IANA labels, because
using labels from a single registry is the only way to avoid
conflicts, and the IANA registry is the registry used on the Internet
(and the Web is part of the Internet).

Given this, can you please clarify whether the Query WG meant that:

a) Changing "...is identified by an IANA charset identifier." to
   "...is identified by a unique identifier, such as an IANA charset
   identifier." is appropriate in Section 3.2 because this is a
   general discussion, and any set of unique identifiers could do,
   and specifics are discussed in 3.6.2.

b) The change was intended to make sure that encoding identifiers
   other than those registered with IANA would conform to the
   character model; Section 3.6.2 would have to be changed, too.

Yesterday, we forgot about 3.6.2, but assumed the intent of b).
If b) is your intent, please find our answer below. If your
intent was a), or something else, we will have to reconsider
your comment.

<assumption value='b)'>

First, please note that your classification of this comment was
'editorial', but we have decided to reclassify it as 'substantial'.

Second, we have decided to reject this comment, based on the
following reasons:

- IANA charset identifiers (except for those starting with x-) are
  guaranteed to be unique. Adding any other set(s) of identifiers
  to the IANA identifiers very quickly removes this guarantee.
  Because of that, your proposed change can either be seen as
  an unnecessary addition, putting in more words but, under
  careful analysis, not saying anything different, or it can be
  misunderstood by readers to guarantee some uniqueness when
  in fact such a guarantee is not possible.
  [If you know about some trick to guarantee uniqueness among
  different sets of identifiers, then we sure would like to know.]

- IANA does not 'assign' identifiers, it just registers them.
  Anybody can apply for registration. A few years ago, there was
  a tendency to restrict registration to widely used/usable
  encodings, but this led to the de facto use of many unregistered
  encodings with an x- prefix. Registration practice has changed
  to be very liberal now, while making sure that each registration
  duly notes whether the encoding is, in practice, suitable for
  use on the Internet at large. If any corporation represented in
  the XML Query WG or elsewhere uses encodings that are not
  registered with IANA, we strongly recommend registering them.

- The IANA registry already contains registrations for many (some
  even say too many) proprietary encodings. Indeed, the majority
  of encodings registered are proprietary encodings rather than
  encodings defined by standards organizations. There is a good
  chance that your encoding is already registered. Please check.

- The IANA registry already contains many (some even say too many)
  aliases for most encodings. There is a good chance that the
  identifier used inside your corporation is already an alias.

Please tell us, at your earliest convenience, whether you are
satisfied with our decision or not. If not, please provide
additional rationale.

</assumption>

If there are any questions or comments, please don't hesitate
to contact us again.

Regards,    Martin.
Received on Wednesday, 5 June 2002 01:09:10 UTC