- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Fri, 12 Jul 2002 10:19 +0900
- To: www-i18n-comments@w3.org
- Cc: cmsmcq@acm.org (C. M. Sperberg-McQueen)
This is a last call comment from C. M. Sperberg-McQueen (cmsmcq@acm.org) on the Character Model for the World Wide Web 1.0 (http://www.w3.org/TR/2002/WD-charmod-20020430/).

Semi-structured version of the comment:

Submitted by: C. M. Sperberg-McQueen (cmsmcq@acm.org)
Submitted on behalf of (maybe empty):
Comment type: substantive
Chapter/section the comment applies to: 3.1.5 Units of collation
The comment will be visible to: public
Comment title: Spanish 'ch' is not a letter sequence
Comment:

Section 3.1.5 says

"EXAMPLE: In traditional Spanish sorting, the letter sequences 'ch' and 'll' are treated as atomic collation units. Although Spanish sorting, and to some extent Spanish everyday use, treat 'ch' as a single unit, current digital encodings treat it as two letters, and keyboards do the same (the user types 'c', then 'h')."

This is not what I learned in grade school. Sra. Robles was quite clear, and rather strict about it (and so of course I am sure that she is right and your informants must be wrong).

I believe the paragraph would be more accurate and clearer if it read

"EXAMPLE: In traditional Spanish sorting, the character sequences 'ch' and 'll' are treated as single letters and as atomic collation units. Although Spanish sorting, and to some extent Spanish everyday use, treat 'ch' as a single unit, current digital encodings treat it as two characters, and keyboards do the same (the user types 'c', then 'h')."

I don't know of any digital encoding whose specification provides any definition of "letter", and thus I find it surprising and confusing to read that most such encodings treat "ch" as two letters: I don't believe that any character set specifications or encodings can meaningfully be said to treat ANYTHING as ANY number of "letters", since "letter" is a concept foreign to their universe of discourse.
Structured version of the comment:

<lc-comment visibility="public" status="pending" decision="pending" impact="substantive">
  <originator email="cmsmcq@acm.org" represents="-">C. M. Sperberg-McQueen</originator>
  <charmod-section href='http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-CollationUnits'>3.1.5</charmod-section>
  <title>Spanish 'ch' is not a letter sequence</title>
  <description>
    <comment>
      <dated-link date="2002-07-12">Spanish 'ch' is not a letter sequence</dated-link>
      <para>Section 3.1.5 says "EXAMPLE: In traditional Spanish sorting, the letter sequences 'ch' and 'll' are treated as atomic collation units. Although Spanish sorting, and to some extent Spanish everyday use, treat 'ch' as a single unit, current digital encodings treat it as two letters, and keyboards do the same (the user types 'c', then 'h')."
      This is not what I learned in grade school. Sra. Robles was quite clear, and rather strict about it (and so of course I am sure that she is right and your informants must be wrong).
      I believe the paragraph would be more accurate and clearer if it read "EXAMPLE: In traditional Spanish sorting, the character sequences 'ch' and 'll' are treated as single letters and as atomic collation units. Although Spanish sorting, and to some extent Spanish everyday use, treat 'ch' as a single unit, current digital encodings treat it as two characters, and keyboards do the same (the user types 'c', then 'h')."
      I don't know of any digital encoding whose specification provides any definition of "letter", and thus I find it surprising and confusing to read that most such encodings treat "ch" as two letters: I don't believe that any character set specifications or encodings can meaningfully be said to treat ANYTHING as ANY number of "letters", since "letter" is a concept foreign to their universe of discourse.</para>
    </comment>
  </description>
</lc-comment>
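
The distinction the comment draws, between the characters an encoding stores and the letters a traditional Spanish sort works with, might be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and comes neither from the Character Model draft nor from any collation library: the names `TRADITIONAL_ORDER`, `collation_units`, and `traditional_key` are hypothetical, and the input is limited to lowercase, unaccented words.

```python
# Hypothetical illustration: traditional Spanish alphabet order, in which
# 'ch' and 'll' are single letters placed after 'c' and 'l' respectively.
# Assumption: unaccented words only; real collation handles far more.
TRADITIONAL_ORDER = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {unit: position for position, unit in enumerate(TRADITIONAL_ORDER)}


def collation_units(word):
    """Split a word into collation units, keeping 'ch' and 'll' atomic."""
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            units.append(pair)   # two characters, one letter
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def traditional_key(word):
    """Sort key: the rank of each collation unit (KeyError on unknown units)."""
    return [RANK[unit] for unit in collation_units(word.lower())]


words = ["cuna", "chico", "dama", "llama", "luz", "cena"]

# Sorting by encoded characters: 'ch' falls between 'ce' and 'cu'.
by_characters = sorted(words)

# Sorting by traditional letters: 'ch' follows every other 'c' word,
# and 'll' follows every other 'l' word.
by_letters = sorted(words, key=traditional_key)

print(by_characters)
print(by_letters)
print(len("chico"), len(collation_units("chico")))
```

In this sketch "chico" is five characters to the encoding but four units to the sort, which is the sense in which the proposed wording speaks of "two characters" rather than "two letters".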
Received on Thursday, 11 July 2002 21:19:43 UTC