- From: Martin Duerst <duerst@w3.org>
- Date: Thu, 05 Dec 2002 08:08:54 +0900
- To: Tim Bray <tbray@textuality.com>, Dan Connolly <connolly@w3.org>
- Cc: Chris Lilley <chris@w3.org>, www-tag@w3.org
At 09:51 02/12/02 -0800, Tim Bray wrote:
>Dan Connolly wrote:
>
>>If you have two character sequences, I still think
>>it's proper to speak of comparing them
>>character-for-character. It's reasonably clear
>>that this gives the same result as mapping
>>the character sequence to a codepoint sequence
>>and then comparing codepoint-for-codepoint,
>>but to speak of comparing character sequences
>>codepoint-for-codepoint is a little sloppy, no?
>
>I believe the opposite. A "character" is a complex bundle of visual and
>linguistic semantics. A codepoint is a number. I know how to compare
>numbers. -Tim

As the character model explains
(http://www.w3.org/TR/charmod/#sec-Perceptions), the term
'character' is used in many, many different ways.

To take an example that is extreme for most present-day
computer-literate people: in many ways, what we write as 'f' and 'F'
are one and the same character, the character called 'eff' in English.

Using 'codepoint' makes clear that we are dealing with characters as
encoded, and because 'f' and 'F' are encoded differently, we know that
they must compare as not equal.

Regards, Martin.
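The 'f' versus 'F' point can be illustrated with a minimal sketch, assuming Python 3, where a string is a sequence of Unicode code points. The helper names `codepoints` and `equal_by_codepoint` are hypothetical and chosen only for this illustration; the case-folded comparison stands in for the looser sense in which both letters are "the same character".

```python
def codepoints(text):
    """Map a character sequence to its sequence of code point numbers."""
    return [ord(ch) for ch in text]


def equal_by_codepoint(a, b):
    """Compare two strings codepoint-for-codepoint, i.e. as lists of numbers."""
    return codepoints(a) == codepoints(b)


# 'f' and 'F' are encoded as different numbers, so they compare not equal.
print(codepoints("f"), codepoints("F"))   # [102] [70]
print(equal_by_codepoint("f", "F"))       # False

# In the looser sense (both are the letter 'eff'), a case-insensitive
# comparison treats them as the same.
print("f".casefold() == "F".casefold())   # True
```

Comparing the number lists leaves no room for interpretation, which is the sense in which 'codepoint' is the more precise term.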
Received on Wednesday, 4 December 2002 18:50:34 UTC