A string is a sequence of characters

This is a last call comment from Björn Höhrmann (bjoern@hoehrmann.de) on
the Character Model for the World Wide Web 1.0
(http://www.w3.org/TR/2002/WD-charmod-20020430/).

Semi-structured version of the comment:

Submitted by: Björn Höhrmann (bjoern@hoehrmann.de)
Submitted on behalf of (maybe empty): 
Comment type: other
Chapter/section the comment applies to: 6.1 String concepts
The comment will be visible to: public
Comment title: A string is a sequence of characters
Comment:
I think section 6.1 "String concepts" is flawed. A "string" is a sequence of characters; introducing further notions of a string is confusing and does not help in understanding the issues involved. The actual wording is confusing too, for example:

[...]
  Byte string: A string viewed as a sequence of bytes representing
  characters in a particular character encoding. This corresponds to a
  CES.
[...]

This suggests that a byte string is a character encoding scheme. In fact, I do not quite understand the difference between a code unit string and a byte string; a byte string appears to be an instance of a code unit string.
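
To make the distinction under discussion concrete, here is a minimal Python sketch of one possible reading of the draft's terms. It is not taken from the draft or from the original comment: the helper names (code_points, utf16_code_units, byte_values) and the sample text are assumptions chosen for illustration. It views the same text as a sequence of code points, as UTF-16 code units, and as bytes under a particular encoding scheme.

```python
import struct

def code_points(text):
    """Character view: one integer per Unicode code point."""
    return [ord(ch) for ch in text]

def utf16_code_units(text):
    """Code unit view: 16-bit units of the UTF-16 encoding form."""
    data = text.encode("utf-16-be")
    # Unpack the big-endian byte pairs back into 16-bit integers.
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

def byte_values(text, encoding):
    """Byte view: the serialized bytes under a given encoding scheme."""
    return list(text.encode(encoding))

# Hypothetical sample: 'A', 'ö', and U+1D11E (outside the BMP).
sample = "A\u00f6\U0001D11E"

print([hex(v) for v in code_points(sample)])
print([hex(v) for v in utf16_code_units(sample)])
print([hex(v) for v in byte_values(sample, "utf-16-be")])
print([hex(v) for v in byte_values(sample, "utf-16-le")])
print([hex(v) for v in byte_values(sample, "utf-8")])
```

Under this reading, the UTF-16 code unit sequence is the same regardless of byte order, while the byte sequence differs between UTF-16BE and UTF-16LE. For UTF-8 the code units are themselves bytes, so the two views coincide, which is the case where a byte string does look like an instance of a code unit string.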


Structured version of the comment:

<lc-comment
  visibility="public" status="pending"
  decision="pending" impact="pending" id="LC-">
  <originator email="bjoern@hoehrmann.de"
      >Björn Höhrmann</originator>
  <represents email=""
      >-</represents>
  <charmod-section href='http://www.w3.org/TR/2004/WD-charmod-20040225/#sec-Strings'
    >6.1</charmod-section>
  <title>A string is a sequence of characters</title>
  <description>
    <comment>
      <dated-link date="2004-04-08"
         href="http://www.w3.org/mid/497720050.20040408215331@toro.w3.mag.keio.ac.jp"
        >A string is a sequence of characters</dated-link>
      <para>I think section 6.1 &#x22;String concepts&#x22; is flawed. A &#x22;string&#x22; is a sequence of characters; introducing further notions of a string is confusing and does not help in understanding the issues involved. The actual wording is confusing too, for example:

[...]
  Byte string: A string viewed as a sequence of bytes representing
  characters in a particular character encoding. This corresponds to a
  CES.
[...]

This suggests that a byte string is a character encoding scheme. In fact, I do not quite understand the difference between a code unit string and a byte string; a byte string appears to be an instance of a code unit string.</para>
    </comment>
  </description>
</lc-comment>

Received on Thursday, 8 April 2004 17:53:33 UTC