- From: Robert J Burns <rob@robburns.com>
- Date: Tue, 3 Feb 2009 12:51:06 -0600
- To: public-i18n-core@w3.org, W3C Style List <www-style@w3.org>
Hi Henri,

>> On Feb 3, 2009, at 15:10, Richard Ishida wrote:
>>
>> I didn't have to look hard for a problem. If you install the Tlicho
>> (Tłįchǫ or Dogrib) keyboard on Windows (see a picture at
>> http://rishida.net/scripts/pickers/tlicho/) and type the name of the
>> language itself, it comes out in NFD. It is also possible to
>> incorrectly order multiple diacritics (ie. not even NFD). You could
>> say that the keyboard *ought* to churn out NFC, but it's too late.
>> People using those keyboards will be producing content that may look
>> different to that created by people using other input methods.
>
> Making existing browsers normalize before string equality checks is
> also too late.

But doing so in the parser, as I and others have suggested, should work
fine.

> When considering what software to change in a future version, to me it
> seems more sensible to change the software that is less performance-
> critical, is closer to the problem and doesn't depend on wide
> consistent deployment to address the problem for a given Web author.
> That is, it seems more sensible to make the input methods produce
> consistently ordered output. This should be within the realm of
> possibility; after all, producing pre-composed characters with
> European diacritic dead keys is a solved problem.

Another common misconception is that normalization is only about
combining characters. There are also singletons, which the normalization
algorithm replaces with a different single character. Therefore one
cannot simply require input methods to normalize on the fly. And even
if one could, which normalization form would that be? As I said before,
NFC versus NFD is a rather bikeshed-like disagreement.

Unicode depends on two canonically equivalent but byte-wise different
strings matching. We cannot hope to eliminate such strings from the
internet, so this is something that implementations have to deal with. I
think most everyone here is on the same page on that, but I want you to
understand too.

Take care,
Rob
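
For illustration only (this is not part of the original message): a minimal Python sketch of the cases discussed above, assuming the standard-library `unicodedata` module. The helper name `canonically_equal` and the sample strings are hypothetical choices made for this sketch.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after bringing both to the same normalization form.

    NFD is used here, but NFC would give the same yes/no answer; what
    matters is that both sides use the same form.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The language name typed two ways: precomposed (NFC) and decomposed (NFD).
name_nfc = "T\u0142\u012Fch\u01EB"     # i-ogonek and o-ogonek as single code points
name_nfd = "T\u0142i\u0328cho\u0328"   # base letters followed by U+0328 COMBINING OGONEK

# The code point sequences differ, so a naive comparison fails...
assert name_nfc != name_nfd
# ...but the strings are canonically equivalent.
assert canonically_equal(name_nfc, name_nfd)

# Mis-ordered diacritics: grave (combining class 230) typed before
# ogonek (combining class 202). This is neither NFC nor NFD, yet it is
# still canonically equivalent to the correctly ordered sequence.
misordered = "i\u0300\u0328"
ordered = "i\u0328\u0300"
assert misordered != ordered
assert canonically_equal(misordered, ordered)

# Singletons: no combining marks involved, yet normalization still
# replaces the character. OHM SIGN (U+2126) becomes GREEK CAPITAL
# LETTER OMEGA (U+03A9) in both NFC and NFD.
ohm_sign = "\u2126"
omega = "\u03A9"
assert unicodedata.normalize("NFC", ohm_sign) == omega
assert unicodedata.normalize("NFD", ohm_sign) == omega
assert canonically_equal(ohm_sign, omega)
```

The comparison normalizes at the point of matching rather than relying on every producer of text to emit one particular form, which is the approach argued for above.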
Received on Tuesday, 3 February 2009 18:51:45 UTC