Re: Your comments on the Character Model [C150, C151, C152, C153, C154, C156, C157, C160] from C. M. Sperberg-McQueen on 2003-02-13 (www-i18n-comments@w3.org from February 2003)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Wed, 12 Feb 2003 18:01:30 -0700
To: Richard Ishida <ishida@w3.org>
Cc: "'Martin Duerst'" <duerst@w3.org>, www-i18n-comments@w3.org
Message-Id: <5.1.0.14.1.20030212164214.03740da0@localhost>
At 2003-02-12 08:05, Richard Ishida wrote:
 > Dear Michael,

 > Many thanks for your comments on the 2nd Last Call version of the
 > Character Model for the World Wide Web v1.0 [1].  We appreciate the
 > interest you have taken in this specification.

 > You can see the comments you submitted, grouped together, at
 > http://www.w3.org/International/Group/2002/charmod-lc/SortByOriginator.html#C150
 > (You can jump to a specific comment in the table by adding its ID
 > to the end of the URI.)

 > The following comments were accepted and implemented as you
 > suggested: C152, C153, C154, C156, C157, C160

Many thanks.

 >  Please review the decisions for the following additional comments
 > and reply to us within the next two weeks at
 > www-i18n-comments@w3.org to say whether you are satisfied with the
 > decision taken.  C150, C151

 > Information relating to these comments is included below.
...
 > *****C150 E R C C. M. Sperberg-McQueen

 >  P RI Various [914]Go to Index The term "UCS" vs. the term "Unicode"

 >    * Comment (received 2002-07-12) -- [915]The term "UCS" vs. the
 >      term "Unicode" Sec. 1.1 says, inter alia, "In this document,
 >      Unicode is used as a synonym for the Universal Character Set."
 >      I believe the term "UCS" would be better, because it is clearer
 >      and less subject to misconstruction.
 >
 >      It is clearer because the term "Unicode" may reasonably be used
 >      to denote (a) the consortium of that name, (b) the Universal
 >      Character Set defined by ISO/IEC 10646 and by the Unicode
 >      Standard, (c) the UCS taken together with the additional rules
 >      defined by the Unicode Standard, which Unicode does NOT share
 >      with ISO/IEC 10646, and (d) the Unicode Standard
 >      itself. Despite the explicit statement that in the character
 >      model spec the term "Unicode" is used in sense (b), I suspect
 >      the common use, elsewhere, of the term in senses (a), (d), and
 >      especially (c), will necessarily color readers' perceptions of
 >      the meaning of the text.
 >
 >      The term "UCS" is also less likely to convey to casual readers
 >      that it is really the Unicode Standard, not ISO/IEC 10646,
 >      which counts. It is true, as you have pointed out from time to
 >      time, that the Unicode Consortium and the responsible ISO/IEC
 >      technical committee have worked well for some time now in
 >      keeping the two standards aligned. I applaud that fact and the
 >      role some of you have individually played in making it
 >      happen. But I remember too the years in which the two
 >      organizations threatened to burden the world with two different
 >      and incompatible universal character sets, and the roles some
 >      of you played then, and I am unwilling that any W3C
 >      specification should risk conveying the idea that if the two
 >      standards should diverge, the Web or the W3C would naturally
 >      side with one or the other party.
 >
 >      It would not be appropriate to use the term "ISO/IEC 10646" (or
 >      just "10646" for short) to refer to the UCS. It is also not
 >      appropriate to use the term "Unicode".
 >
 >      Please reconsider and use the neutral and unambiguous term "UCS".
 >
 >    * Decision: Rejected.
 >
 >    * Rationale for "Rejected": The word "Unicode" is almost
 >      universally used in this sense, including by Production 2 of
 >      the XML specification.
 >
 >    * Decision: Review all instances of the word "Unicode", to ensure
 >      they are used consistently.
 >
 >   [915] http://lists.w3.org/Archives/Public/www-i18n-comments/2002Jul/0010.html
 >   [916] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-Transcoding

I thank you for considering this comment, and for pointing out the
error in production 2 of the XML specification; I will file a CR
comment with the XML Core WG asking that the reference (which is
clearly an editorial error on my part) be corrected.

I regret to report that I find your rationale unconvincing.  If, as
you say in the response to C151, references to international standards
should invariably be preferred to references to corresponding national
standards, then how much more strongly ought specifications defined by
private industry consortia to be deprecated in favor of international
standards which define exactly the same technical content?  (And since
you say that the character model spec uses the term "Unicode" only to
refer to the UCS, the technical content must necessarily be exactly
the same.  I am a little surprised, since I thought that any
discussion of Unicode normalization must necessarily go beyond the
definition of the UCS, precisely into material specified by Unicode
but not by ISO 10646, and that use of the term 'Unicode' is
appropriate in that context.  But if you say Unicode normalization is
part of the definition of the UCS, I am not in a position to prove
otherwise.)

Please record my formal dissent from your decision.


 > *****C151 E P C C. M. Sperberg-McQueen
 >  P RI [918]A.2 [919]Go to Index ANSI X3.4 is missing
 >    * Comment (received 2002-07-12) -- [920]ANSI X3.4 is missing
 >      The spec refers several times to ASCII. In the context of a
 >      specification defining a character model, I assume that this term
 >      is used in its proper and narrow sense to denote the coded
 >      character set defined by American national standard ANSI X3.4.
 >      That American national standard should be included among the
 >      non-normative references.
 >    * Decision: Partially accepted.
 >    * Decision: Cite ISO 646 (International Reference Version), rather
 >      than ANSI X3.4, and link to it from the text.
 >    * Rationale for "Partially accepted": Where a national and an
 >      international standard define the same matter, use of the latter
 >      is preferable.

 >   [918] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-OtherReferences
 >   [920] http://lists.w3.org/Archives/Public/www-i18n-comments/2002Jul/0011.html

This would be an acceptable response if you had also removed the
references to "ASCII characters" and the like from the text of the
specification and replaced them with references to "ISO 646 IRV"
characters.  But the point of my comment was not solely that
references to specifications in the body of the text ought to be
accompanied by corresponding bibliographic information in the back
matter.  The term 'ASCII' is an acronym for the name of a specific
standard, the American Standard Code for Information Interchange,
published under that and several other titles from time to time
beginning in 1962.  If it is necessary to use the term 'ASCII', then
it seems to me that it would be a courtesy to your readers to explain
the acronym (this is a rule some authorities strongly recommend for
all acronyms), and a courtesy to those who developed the standard (as
well as to your readers) to provide a reference to the standard
itself.

As noted above, I believe that if you take your rationale for C151
seriously, it ought to compel a different decision on C150 (and,
indeed, the suppression of any mention of the Unicode Consortium).
This observation leads me to suspect that you do not, in fact, take it
very seriously.

The only way I can interpret your response, and the note attached to
the bibliographic reference to ISO 646, is that the mention of the
American National Standards Institute has for reasons I do not
understand become taboo.  One gets the impression that you believe it
somehow indelicate to refer, in a specification concerned with
internationalization, to a national standard, and embarrassing that
common usage should refer to a particular national standard, when it
ought, really, to refer to the corresponding international one.  I
think you should overcome your fear of indelicacy and break the taboo.

I don't object to your mentioning that ASCII is, formally (not, as far
as I can tell, historically) simply a national version of ISO 646.
But I do think it a discourtesy to your readers to use the term
'ASCII' without giving a coherent and historically accurate account of
the meaning and origin of the acronym.

[I notice now that it would similarly be useful to provide references
to the two DIN standards which specify sorting of German for names and
for other applications, and for standards which specify the various
other behaviors described in the examples in 3.1.5, where such
standards exist. I should have mentioned that before, sorry.]

Please record my formal dissent from your decision.

................

I have reviewed the changes made in connection with my comments C152
through C158 and found nothing problematic except a typo: in 3.1.5,
for 'accomodate' read 'accommodate'.

I thank you for your review of my comments, for accepting so many of
them, and for the useful display of the comments and their resolutions
in your admirable table
(http://www.w3.org/International/Group/2002/charmod-lc/).  I am sorry
not to be able to agree with you on these two remaining items.

I wish you all success with the specification.

-C. M. Sperberg-McQueen
Received on Wednesday, 12 February 2003 22:29:39 UTC