- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Wed, 4 Feb 2009 13:23:04 +0200
- To: Robert J Burns <rob@robburns.com>
- Cc: public-i18n-core@w3.org, W3C Style List <www-style@w3.org>
On Feb 3, 2009, at 20:51, Robert J Burns wrote:

>> Making existing browsers normalize before string equality checks is
>> also too late.
>
> But doing so in the parser as I and others have suggest should work
> fine.

For clarity, I meant that browsers that are out there are out there. You can't make them do a normalization step anywhere in their processing before an identifier comparison happens.

I think we don't have performance data showing that normalization in the parser would be "fine" in terms of performance, and without data it is quite reasonable to assume unfavorable performance characteristics.

> Unicode depends on two canonically equivalent but byte-wise
> different strings matching.

No, it doesn't. Unicode itself doesn't depend on one kind of equality check. Unicode enables a wide variety of equality relations between strings. Different equality relations are appropriate for different purposes.

> We cannot hope to eliminate such strings from the internet, so this
> is something that implementations have to deal with. I think most
> everyone here is on the same page on that, but I want you to
> understand too.

One way of dealing with it is specifying that implementations do their string identity comparisons code point for code point thus making comparisons between strings that differ in normalization evaluate to false uniformly across implementations.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
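As an illustration of the two equality relations discussed above, here is a minimal Python sketch. It is not taken from any browser or specification; the function names `codepoint_equal` and `canonical_equal` are hypothetical, and NFC is only one of the normalization forms that could be chosen for the second relation.

```python
import unicodedata


def codepoint_equal(a: str, b: str) -> bool:
    """Compare code point for code point (Python's built-in str equality)."""
    return a == b


def canonical_equal(a: str, b: str) -> bool:
    """Compare after normalizing both sides to NFC (an assumed choice of form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two canonically equivalent spellings of "e with acute accent".
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

print(codepoint_equal(precomposed, decomposed))   # code point comparison
print(canonical_equal(precomposed, decomposed))   # normalized comparison
```

Under the code point relation the two identifiers never match, and that result is the same in every implementation that compares this way. The normalized relation makes them match, but only in implementations that perform the extra normalization step.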
Received on Wednesday, 4 February 2009 11:23:49 UTC