- From: Andrey Mikhalev <amikhal@abisoft.spb.ru>
- Date: Wed, 4 Feb 2009 17:47:59 +0300 (MSK)
- To: Robert J Burns <rob@robburns.com>
- cc: public-i18n-core@w3.org, W3C Style List <www-style@w3.org>
On Tue, 3 Feb 2009, Robert J Burns wrote:

> Unicode depends on two canonically equivalent but byte-wise different strings
> matching. We cannot hope to eliminate such strings from the internet, so this
> is something that implementations have to deal with. I think most everyone
> here is on the same page on that, but I want you to understand too.

well, you have convinced me :)
since programming and all modern software are based on abstract data types
and structural equivalence w/o knowledge of the particular data semantics,
your "normalization" is worthless.
there's no way to detect all points in a project where an integer expression
semantically turns into a codepoint and where a vector of codepoints
semantically turns into "unicode text", making it incomparable with its peers
after any mutation.
looks like this topic is just seeking a workaround for keyboard/IME
developer bugs.

>
> Take care,
> Rob
>
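For illustration only, here is a minimal Python sketch of the two positions in this exchange. It is not taken from the thread: the sample strings and the helper name `canonically_equal` are assumptions chosen for the example. The first part shows two canonically equivalent strings that differ byte-wise; the second shows that a plain vector of integers, compared structurally, carries no hint that it is meant as Unicode text.

```python
import unicodedata

# Two spellings of the same user-perceived character "é".
composed = "\u00e9"      # single precomposed code point U+00E9
decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Structural (code point by code point) comparison sees different strings.
raw_equal = composed == decomposed

# The UTF-8 encodings differ as well.
bytes_equal = composed.encode("utf-8") == decomposed.encode("utf-8")

# Comparison after normalization treats them as the same text.
normalized_equal = canonically_equal(composed, decomposed)

# The same data held as plain integer vectors: nothing in the type says
# "these are code points", so generic code compares them structurally.
composed_ints = [ord(c) for c in composed]
decomposed_ints = [ord(c) for c in decomposed]
ints_equal = composed_ints == decomposed_ints

print(raw_equal, bytes_equal, normalized_equal, ints_equal)
```

Whether normalization belongs in the comparison step at all, or at input time as the reply suggests, is exactly the point under dispute; the sketch only makes the mismatch concrete.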
Received on Wednesday, 4 February 2009 14:48:45 UTC