Re: Your comments on the Character Model [C068-C072, C079] from Martin Duerst on 2004-02-06 (www-tag@w3.org from February 2004)

From: Martin Duerst <duerst@w3.org>
Date: Fri, 06 Feb 2004 18:02:16 -0500
To: Tim Bray <tbray@textuality.com>, "Richard Ishida" <ishida@w3.org>
Cc: www-tag@w3.org <www-tag@w3.org>, w3c-i18n-ig@w3.org, www-i18n-comments@w3.org
Message-Id: <4.2.0.58.J.20040126154038.079bded0@localhost>

At 10:42 04/01/24 -0800, Tim Bray wrote:

>C071:
>Not satisfied; see
>http://www.gbiv.com/protocols/uri/rev-2002/draft-fielding-uri-rfc2396bis-03.html#comparison-string
>
>The point is that the phrase 'bit-for-bit' is misleading.  It's
>code-point-by-code-point; how these are encoded into bits is a red
>herring.

I think we are not too far apart. How codepoints are encoded into
bits is not a red herring, but an important issue. The citation
that you give above gives some examples. For people to understand
why they must compare characters, we need to talk about bits and
bytes in one way or another.

Maybe one solution we could try is to split this into two parts:
- How to do string matching by comparing codepoints.
- How to compare codepoints by comparing bits.
The latter would be marked as 'just one way to do it'.
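
As a rough illustration of that split, here is a minimal Python sketch. The helper names (codepoints, match_by_codepoints, match_by_bytes) and the sample string are hypothetical, chosen only for this example; they are not taken from the Character Model or from the URI draft. It assumes both inputs have already been decoded correctly from their encodings.

```python
def codepoints(s):
    """Return the sequence of Unicode code points in a string (hypothetical helper)."""
    return [ord(ch) for ch in s]


def match_by_codepoints(a, b):
    """Part 1: string matching by comparing code points, independent of encoding."""
    return codepoints(a) == codepoints(b)


def match_by_bytes(a_bytes, b_bytes):
    """Part 2: 'just one way to do it'. Only valid if both sides use the same encoding."""
    return a_bytes == b_bytes


# Sample text with two non-ASCII characters (U+00E9), chosen for illustration.
text = "r\u00e9sum\u00e9"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-be")

# Same encoding on both sides: comparing bytes agrees with comparing code points.
same_encoding_match = match_by_bytes(utf8, text.encode("utf-8"))

# Different encodings: the bytes differ even though the characters are identical.
cross_encoding_match = match_by_bytes(utf8, utf16)

# Decoding first and comparing code points gives the expected answer.
decoded_match = match_by_codepoints(utf8.decode("utf-8"), utf16.decode("utf-16-be"))
```

The byte comparison is a valid shortcut only when both strings are known to be in the same encoding; otherwise the comparison has to happen at the code point level.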

What do you think?

>C072: Semi-satisfied.  Does the charmod contain a discussion of the
>subtle-but-nonzero differences between 10646 and Unicode?   I note that
>this is touched on in the response to C128, and the point that the
>Unicode spec is well-written, useful, available on-line or in an
>excellent book is also worth making.  Clearly this meta-reference stuff
>is material to charmod's readers.

We discussed this on the TAG. I think you agreed that you would
look at this again, and give us some concrete suggestions of
what we are missing. We would need this within a few
days if we want to include it in our next publication.

Regards,    Martin.

>C073: Satisfied
>C074: Pending not-yet-made edit, but it sounds like we're probably OK
>C079: Really a special case of C074, but satisfied.
>
>I think that C071 and 072 might be worth a couple of minutes of the
>TAG's time. -Tim
>

Received on Friday, 6 February 2004 18:02:38 UTC