RE: [selectors-api] Selectors API I18N Review... from Richard Ishida on 2009-02-04 (www-style@w3.org from February 2009)

From: Richard Ishida <ishida@w3.org>
Date: Wed, 4 Feb 2009 14:11:03 -0000
To: "'Henri Sivonen'" <hsivonen@iki.fi>
Cc: <public-i18n-core@w3.org>, <www-style@w3.org>
Message-ID: <007901c986d2$6a8c8d20$3fa5a760$@org>

> From: Henri Sivonen [mailto:hsivonen@iki.fi]
> Sent: 03 February 2009 13:45
...
> > You could say that the keyboard *ought* to churn out
> > NFC, but it's too late. People using those keyboards will be
> > producing content that may look different to that created by people
> > using other input methods.
> 
> Making existing browsers normalize before string equality checks is
> also too late.

I'm not sure why.  It seems a lot easier to change the browsers than to change every last input method, past and future. 

> 
> When considering what software to change in a future version, to me it
> seems more sensible to change the software that is less performance-
> critical, is closer to the problem and doesn't depend on wide
> consistent deployment to address the problem for a given Web author.
> That is, it seems more sensible to make the input methods produce
> consistently ordered output. This should be within the realm of
> possibility; after all, producing pre-composed characters with
> European diacritic dead keys is a solved problem.

In an ideal world, I might agree with you, although I think the input methods would need to distinguish between content and code.  I don't think we should prevent people from ever writing non-NFC content; they may have good reasons for doing so.  On the other hand, if you use a text editor like Notepad to write your code, how would you enforce that distinction?  Which leads to my other point: it's not only input methods and editors that we'd need to change so that all code entering the browser is NFC-normalized, it's people.  Forcing conformance on human beings is a lot more problematic, and the results are far less predictable.  But I can't help thinking (still) that all of this would be a non-issue if we could find a way, at an acceptable cost in performance, to treat canonically equivalent strings as equivalent when matching.
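
To make the matching idea concrete, here is a rough sketch in Python of comparing two strings as canonically equivalent. It is only an illustration, not how any browser does it: the function name canonically_equal is made up for this example, and the ASCII shortcut is just one assumed way of keeping the common case cheap.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Return True if a and b are canonically equivalent.

    Hypothetical helper for illustration; the name and the fast paths
    are assumptions, not part of any specification.
    """
    # Fast path 1: identical code point sequences are trivially equivalent.
    if a == b:
        return True
    # Fast path 2: pure-ASCII strings are already in NFC, so if they
    # differ code point by code point they cannot be equivalent.
    if a.isascii() and b.isascii():
        return False
    # Slow path: normalize both sides to NFC and compare.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "caf\u00e9"   # e-acute as the single code point U+00E9
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(precomposed == decomposed)                    # False
print(canonically_equal(precomposed, decomposed))   # True
```

With the two fast paths, normalization only runs when the strings differ and at least one contains non-ASCII characters, which is where the performance question really sits.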

RI

Received on Wednesday, 4 February 2009 14:11:06 UTC