Counting languages from C. M. Sperberg-McQueen on 2002-07-12 (www-i18n-comments@w3.org from July 2002)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Fri, 12 Jul 2002 10:28 +0900
To: www-i18n-comments@w3.org
Cc: cmsmcq@acm.org (C. M. Sperberg-McQueen)
Message-Id: <20020712012805.047D8189C@toro.w3.mag.keio.ac.jp>

This is a last call comment from C. M. Sperberg-McQueen (cmsmcq@acm.org) on
the Character Model for the World Wide Web 1.0
(http://www.w3.org/TR/2002/WD-charmod-20020430/).

Semi-structured version of the comment:

Submitted by: C. M. Sperberg-McQueen (cmsmcq@acm.org)
Submitted on behalf of (maybe empty): 
Comment type: editorial
Chapter/section the comment applies to: 3.1.5 Units of collation
The comment will be visible to: public
Comment title: Counting languages 
Comment:
The paragraph which reads "EXAMPLE: In most languages, the letter 'æ' is sorted as two consecutive collation units: 'a' and 'e'" [I wonder if
that aesc will come through this HTML form ...] might be improved if
the "most languages" were changed to "some languages".  In Old English,
Old Norse, Norwegian, Danish, and Swedish, I believe that aesc is treated
as a single collation unit; I don't know of any language in which
aesc occurs in native words that sorts it in the way you describe.
Are you counting all the other languages in Western Europe as languages
in which aesc is sorted as "ae"?  (Note that it does not matter whether
the languages which sort aesc as "ae" outnumber the others or not: the
point to be made is that they exist.  The term "most" brings in an element
of quantitative comparison which is distracting -- do Flemish and Dutch
count as one language, or two, in this tally? -- and unnecessary. Hence
my suggestion to eliminate "most".) 


Structured version of the comment:

<lc-comment
  visibility="public" status="pending"
  decision="pending" impact="editorial">
  <originator email="cmsmcq@acm.org" represents="-"
      >C. M. Sperberg-McQueen</originator>
  <charmod-section href='http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-CollationUnits'
    >3.1.5</charmod-section>
  <title>Counting languages </title>
  <description>
    <comment>
      <dated-link date="2002-07-12"
        >Counting languages </dated-link>
      <para>The paragraph which reads "EXAMPLE: In most languages, the letter 'æ' is sorted as two consecutive collation units: 'a' and 'e'" [I wonder if
that aesc will come through this HTML form ...] might be improved if
the "most languages" were changed to "some languages".  In Old English,
Old Norse, Norwegian, Danish, and Swedish, I believe that aesc is treated
as a single collation unit; I don't know of any language in which
aesc occurs in native words that sorts it in the way you describe.
Are you counting all the other languages in Western Europe as languages
in which aesc is sorted as "ae"?  (Note that it does not matter whether
the languages which sort aesc as "ae" outnumber the others or not: the
point to be made is that they exist.  The term "most" brings in an element
of quantitative comparison which is distracting -- do Flemish and Dutch
count as one language, or two, in this tally? -- and unnecessary. Hence
my suggestion to eliminate "most".) </para>
    </comment>
  </description>
</lc-comment>

Received on Thursday, 11 July 2002 21:28:09 UTC