- From: Dave Peterson <davep@acm.org>
- Date: Sat, 14 Jun 1997 20:38:28 -0400
- To: <w3c-sgml-wg@w3.org>
At 12:32 PM 6/14/97, James Clark wrote:
>I'm not sure I agree with Gavin when he says that all that is needed is a
>String type. I think you need a Character type as well.
Generally right, for several reasons.
o Many non-canonical string representations represent characters
within a string in a context-dependent manner. A single character
has no context.
o Many non-canonical string representations represent different
characters with bit patterns of different sizes. It is customary
to represent single characters with bit patterns that are all
the same length.
o Some string representations cannot represent every character.
For example, if you consider 0 (all 0 bits) in ASCII to represent
a character, then the usual C character-string representation
cannot represent a string that includes that character.
o By comparison: an array of nonnegative integers, each less than
2^n, is canonically represented by a long bit combination that
is the result of directly concatenating the representations of
each integer in the array. A string/array of characters is
canonically represented by a similar long bit combination. Having
a separate character class, rather than treating a character as
a string/array of length one, is the same as having a separate
nonnegative integer class rather than treating a nonnegative
integer as an array of length one. In both cases, treating the
single item as an array of length one is more of a nuisance than
having the separate class.
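
The second and third points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: UTF-8 is used as an example of a representation with different-sized bit patterns (the message does not name any particular encoding), and c_string_length is a hypothetical helper that imitates the C terminator convention rather than calling any real C routine.

```python
def c_string_length(buffer: bytes) -> int:
    """Length as a NUL-terminated reader would see it (hypothetical helper):
    the number of bytes before the first 0 byte, or the whole buffer if
    no 0 byte is present."""
    end = buffer.find(b"\x00")
    return len(buffer) if end == -1 else end


# Three ASCII characters, the middle one being character 0 (all 0 bits).
with_nul = bytes([0x41, 0x00, 0x42])
visible_length = c_string_length(with_nul)  # reader stops at the 0 byte
actual_length = len(with_nul)               # the data really holds 3 characters

# Inside a UTF-8 string, different characters take different numbers of bytes.
# The sample is "a", e-acute (U+00E9), and the euro sign (U+20AC).
sample = "a\u00e9\u20ac"
encoded_widths = [len(ch.encode("utf-8")) for ch in sample]

# Taken singly, each character is one integer code point, so single
# characters can all be stored in a value of one fixed size.
code_points = [ord(ch) for ch in sample]
```

Here the string representation loses everything after the 0 byte, and the per-character widths differ inside the encoded string, while each single character still maps to exactly one integer.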
For many reasons, a character class separate from the related character-
string class is useful.
Dave Peterson
SGMLWorks!
davep@acm.org
Received on Saturday, 14 June 1997 20:38:43 UTC