Re: I18N issue needs consideration

From: Dave Peterson <davep@acm.org>
Date: Sat, 14 Jun 1997 20:38:28 -0400
Message-Id: <v01540b02afc8ca444400@[]>
To: <w3c-sgml-wg@w3.org>

At 12:32 PM 6/14/97, James Clark wrote:

>I'm not sure I agree with Gavin when he says that all that is needed is a
>String type.  I think you need a Character type as well.

Generally right, for several reasons.

    o   Many non-canonical string representations represent characters
        within a string in a context-dependent manner.  A single character
        has no context.

    o   Many non-canonical string representations represent different
        characters with bit patterns of different sizes.  It is
        customary to represent single characters with bit patterns
        all of the same length.

    o   Some string representations cannot represent every character.
        For example, if you consider 0 (all 0 bits) in ASCII to represent
        a character, then the usual C character-string representation
        cannot represent a string that includes that character.

    o   By comparison:  An array of nonnegative integers, all less than
        two to the n, is canonically represented by a long bit
        combination that is the direct concatenation of the
        representations of each integer in the array.  A string/array of
        characters is canonically represented by a similar long bit
        combination.  Having a separate character class, rather than
        treating a character as a string/array of length one, is the
        same as having a separate nonnegative integer class rather than
        treating a nonnegative integer as an array of length one.  In
        both cases, treating the single item as an array of length one
        is more of a nuisance than having the separate class.
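
The second and third points above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: UTF-8 and
UTF-32 stand in for "different-sized" and "same-length" bit patterns
(neither encoding is named in this message), and the helper names
encoded_widths and c_string_view are hypothetical.

```python
import ctypes


def encoded_widths(text, encoding="utf-8"):
    """Return the number of bytes each character of `text` occupies.

    Assumption: `encoding` is a stateless codec, so encoding one
    character at a time gives the same bytes as encoding the whole string.
    """
    return [len(ch.encode(encoding)) for ch in text]


def c_string_view(data):
    """Return what a reader of a NUL-terminated C string sees of `data`.

    ctypes.create_string_buffer copies the bytes into a C char array;
    its .value attribute stops at the first zero byte, as C code would.
    """
    buf = ctypes.create_string_buffer(data)
    return buf.value


if __name__ == "__main__":
    sample = "a\u00e9\u20ac"  # 'a', e-acute, euro sign
    print(encoded_widths(sample))               # [1, 2, 3]
    print(encoded_widths(sample, "utf-32-le"))  # [4, 4, 4]
    print(c_string_view(b"ab\x00cd"))           # b'ab'
```

The variable-width encoding gives each character a different length
inside the string, the fixed-width one gives every character the same
length, and the C-style view silently loses everything from the zero
byte onward.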

For many reasons, a character class separate from the related
character-string class is useful.
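
A minimal sketch of what such a pair of classes might look like, in
Python. The names Character and String, the from_text helper, and the
code-point range check are all assumptions made for illustration; none
of them comes from this thread or from any proposed API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    """One character, held as a single fixed-size value (hypothetical)."""
    code_point: int

    def __post_init__(self):
        # Assumed range: the Unicode code space, 0 through 0x10FFFF.
        if not 0 <= self.code_point <= 0x10FFFF:
            raise ValueError("code point out of range")


class String:
    """An immutable sequence of Character values (hypothetical)."""

    def __init__(self, characters):
        self._chars = tuple(characters)
        for c in self._chars:
            if not isinstance(c, Character):
                raise TypeError("String elements must be Character values")

    @classmethod
    def from_text(cls, text):
        """Build a String from a built-in Python str."""
        return cls(Character(ord(ch)) for ch in text)

    def __len__(self):
        return len(self._chars)

    def __getitem__(self, index):
        # Integer indexing yields a Character, not a String of length one.
        return self._chars[index]


if __name__ == "__main__":
    s = String.from_text("a\x00b")
    print(len(s))               # 3
    print(s[1])                 # Character(code_point=0)
    print(type("abc"[0]))       # <class 'str'>
```

Here indexing a String returns a Character with no surrounding context,
and the character with value 0 is an ordinary element. Python's built-in
str takes the other route: indexing it returns another str of length one.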

Dave Peterson

Received on Saturday, 14 June 1997 20:38:43 UTC
