Re: Posted draft of URI comparison finding

On Mon, 2002-12-02 at 09:14, Chris Lilley wrote:
> On Monday, December 2, 2002, 3:42:56 PM, Dan wrote:
> DC> |In Unicode terminology, this would be properly referred
> DC> | to as codepoint-for-codepoint comparison.
> DC> Well, it's only codepoint-for-codepoint after you map
> DC> the characters to codepoints; character-for-character
> DC> is just as proper, no?
> No.
> Characters are defined as unicode codepoints.

Really? Where? I understood codepoints to *correspond*
to characters, but not to *be* characters.

"Each character in the repertoire is then associated with a
(mathematical, abstract) non-negative integer, the code point (also
known as a character number or code position). The result, a mapping
from the repertoire to the set of non-negative integers, is called a
coded character set (CCS)."

> What byte sequences
> these codepoints become in various encodings is orthogonal, but a
> given character has a unique unicode codepoint.

I don't understand your point; I don't see how byte sequences
are relevant.

If you have two character sequences, I still think
it's proper to speak of comparing them
character-for-character. It's reasonably clear
that this gives the same result as mapping
the character sequence to a codepoint sequence
and then comparing codepoint-for-codepoint,
but to speak of comparing character sequences
codepoint-for-codepoint is a little sloppy, no?
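
To make the equivalence concrete, here is a minimal sketch in Python.
It assumes that a Python 3 str is a sequence of characters and that
ord() is the mapping from a character to its code point. The function
names are made up for illustration and come from no spec. Since Python
itself models strings as code point sequences, this shows the two
descriptions side by side rather than proving anything.

```python
def to_codepoints(text):
    """Map a character sequence to a sequence of code points (integers)."""
    return [ord(ch) for ch in text]


def equal_char_for_char(a, b):
    """Compare two character sequences one character at a time."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def equal_codepoint_for_codepoint(a, b):
    """Map both sequences to code points first, then compare the integers."""
    return to_codepoints(a) == to_codepoints(b)


# Hypothetical sample pairs. The last pair looks alike when rendered but
# differs as a sequence: precomposed e-acute vs. "e" + combining acute.
pairs = [
    ("http://example.org/a", "http://example.org/a"),
    ("http://example.org/a", "http://example.org/A"),
    ("\u00e9", "e\u0301"),
]

# The two ways of comparing agree on every pair.
agree = all(
    equal_char_for_char(a, b) == equal_codepoint_for_codepoint(a, b)
    for a, b in pairs
)
```

Neither comparison does any normalization, so the last pair compares
unequal either way.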

Dan Connolly, W3C

Received on Monday, 2 December 2002 12:43:53 UTC