Re: Internationalized CLASS attributes from Gary Adams - Sun Microsystems Labs BOS on 1996-10-18 (www-international@w3.org from October to December 1996)

From: Gary Adams - Sun Microsystems Labs BOS <gra@zeppo.East.Sun.COM>
Date: Fri, 18 Oct 1996 08:59:16 -0400
To: keld@dkuug.dk, mduerst@ifi.unizh.ch, www-international@w3.org, Chris.Lilley@sophia.inria.fr
Cc: rosenne@NetVision.net.il
Message-Id: <199610181259.IAA20851@zeppo.East.Sun.COM>

After a couple of days on this thread, I'm not sure I have
a clear picture of the key issues. I think I heard:

   - ASCII is an unnecessary restriction and not sufficient for
     covering all the intended uses of CLASS names. Few people
     seriously consider this as a viable long term option.
     
   - folding case (toupper or tolower) in CLASS names might
     provide some end-user ease-of-use features, if a consistent
     algorithm can be used on all platforms to get identical
     results. Some people still consider this a worthwhile
     option, but many believe it to be a counterproductive proposal.
     
   - in addition to case, many western languages could also benefit
     from composition/decomposition of accented characters when
     matching CLASS names. Many people believe that there are real
     benefits to a canonical representation that has learned from the
     mistakes of past proprietary or national encoding standards.
     
   - it's clear that the canonical representation of names cannot
     rely on the character input method or the native platform
     character encoding. Section 5.15 in the Unicode 2.0 standard
     provides a good look at the searching/sorting problems that users
     will be faced with.
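
As a rough illustration of the case and composition/decomposition points
above, here is a minimal sketch in Python. It assumes the standard
unicodedata module and the NFC/NFD normalization forms, which were
formalized after this discussion; the function names are hypothetical and
this is not a proposal for what a user agent must do.

```python
import unicodedata


def canonical(name: str) -> str:
    """Canonical composition only (NFC); case is preserved."""
    return unicodedata.normalize("NFC", name)


def canonical_caseless(name: str) -> str:
    """Decompose, fold case, then recompose, so that neither the
    input method nor letter case affects the comparison key."""
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFC", decomposed.casefold())


# The same visible name entered two ways: a precomposed e-acute
# versus a plain "e" followed by a combining acute accent.
precomposed = "caf\u00e9"
combining = "cafe\u0301"

same_raw = precomposed == combining                       # raw code points differ
same_canonical = canonical(precomposed) == canonical(combining)
same_caseless = canonical_caseless("CAFE\u0301") == canonical_caseless(precomposed)
```

Comparing the raw strings fails even though both render identically, which
is why the stored form cannot depend on the input method or platform
encoding. Whether case should also be folded is the separate, more
contested question.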
     
At the start of this discussion, I did not think users would actually
be entering CLASS names manually, but that a selection system would be 
provided by applications to select/create templates. Ultimately a class
is selected/defined because it represents a particular set of behaviours
or it is an anchor point (e.g. a name) to which the behaviours can be 
attached. Going beyond the simple case of I18N/L10N monolingual environments,
a user may be faced with selecting classes and libraries of classes that
have internally consistent names and that require additional mechanisms
(e.g. aliases, translations, transliterations) for non-native speakers
to locate and identify them.

During the authoring process, I can think of many good reasons to be forgiving
in the matching algorithms dealing with case, accents, tones, etc. I'd
even consider synonyms and typographically correctable errors in providing
users with a list of classes that meet particular criteria. But in a
runtime system which could be fetching a style sheet definition from
a remote database of styles, I'd prefer to treat the CLASS name string as
an exact database key for simple string matching purposes; e.g. the user of
a class is presented the string that the author of the class originally typed
or the database encoded as part of its schema processing.
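
The split between forgiving authoring-time matching and exact runtime
lookup might be sketched as follows. This is only an illustration under
assumptions: the helper names (loose_key, suggest, lookup) and the
dictionary standing in for a remote style database are hypothetical, and
stripping combining marks is just one crude way to ignore accents.

```python
import unicodedata


def loose_key(name: str) -> str:
    """Authoring-time key: ignore case and combining accents."""
    decomposed = unicodedata.normalize("NFD", name.casefold())
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def suggest(typed: str, known_names) -> list:
    """Authoring tool: offer every stored name that loosely matches,
    always showing the name exactly as its author stored it."""
    wanted = loose_key(typed)
    return [name for name in known_names if loose_key(name) == wanted]


def lookup(styles: dict, class_name: str):
    """Runtime: the CLASS name is an exact key, with no folding."""
    return styles.get(class_name)


# Hypothetical style database keyed by the author's original string.
styles = {"R\u00e9sum\u00e9": "font-weight: bold"}

candidates = suggest("resume", styles)    # loose match while authoring
chosen = candidates[0]                    # tool inserts the stored spelling
rule = lookup(styles, chosen)             # exact match succeeds
miss = lookup(styles, "resume")           # exact match fails
```

The authoring tool does the forgiving work once and writes the stored
spelling into the document, so the runtime system never needs anything
beyond a byte-for-byte string comparison.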

Received on Friday, 18 October 1996 09:00:44 UTC