- From: David Clarke <w3@dragonthoughts.co.uk>
- Date: Fri, 06 Feb 2009 10:34:28 +0000
- To: Henri Sivonen <hsivonen@iki.fi>
- CC: public-i18n-core@w3.org, "'W3C Style List'" <www-style@w3.org>
Henri,

In an ideal world, fixing all the IME systems to produce normalised results would be great, but it is highly impractical. Decisions would also need to be made regarding which normalized form is the "correct one", and those decisions would need to be complied with.

I was always taught that in software development you should be very tolerant of external data (e.g. look at how well browsers deal with broken HTML), and strict and consistent on your output. In short, don't rely on other developers doing the right thing; your proposal means that we would need to fix all the IMEs that already exist.

If on the other hand we propose standards where the lack of normalisation is tolerated, but require late normalisation, we can produce a functional result.

As they stand, the normalization algorithms, and checks, are fast to execute if the input is already normalised to their form. With this in mind, the majority of the performance hit would only come when non-normalised data is presented.

I generally prefer to have my software work well and consistently without surprises, and performance has to be secondary to that. Of course I come from the school of defensive coding.

Maybe Moore's law will solve the performance issue, but only tolerant coding and late normalisation can ensure that the software is functional and reliable.

Henri Sivonen wrote:
>
> On Feb 5, 2009, at 19:23, Richard Ishida wrote:
>
>> Well, if you speak and think in excellent English there's no big deal
>> with codepoint for codepoint comparison. But if you speak and think
>> in Vietnamese, Burmese, Khmer, Tamil, Malayalam, Kannada, Telugu,
>> Sinhala, Tlįchǫ Yatìi, Dënesųłįne, Dene Zhatié–Shihgot’ine, Gwich’in,
>> Dɛnɛsųłįnɛ, Igbo, Yoruba, Arabic, Urdu, Azeri, Tibetan, Japanese,
>> Chinese, Russian, Serbian, etc. etc. and especially if your content
>> is in that language, then it wouldn't be so surprising that you would
>> want to write class names and ids in that language too, and I think
>> we need to investigate what is needed to support that.
>
> Using class names or ids made of words in those languages is enabled.
> It's just that inconsistent defects in text input software may lead to
> surprises in some cases. However, to get rid of the surprises, the
> text input methods should be fixed instead of complicating other
> software.
>

---
David Clarke
Received on Friday, 6 February 2009 10:35:18 UTC
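
As an illustration of the late-normalisation idea in the message above, here is a minimal sketch in Python. It is not taken from the thread or from any specification: the helper names `to_form` and `same_identifier` are hypothetical, and the choice of NFC as the default form is an assumption, since the message deliberately leaves open which form is the "correct one". The quick check uses `unicodedata.is_normalized`, which requires Python 3.8 or later.

```python
import unicodedata


def to_form(text: str, form: str = "NFC") -> str:
    """Return text in the given normalisation form (hypothetical helper).

    The quick check comes first, so input that is already normalised
    is returned unchanged and the full algorithm is skipped.
    """
    if unicodedata.is_normalized(form, text):
        return text
    return unicodedata.normalize(form, text)


def same_identifier(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two identifiers, normalising late, at comparison time.

    Assumption: both sides are brought to the same form before the
    codepoint-for-codepoint comparison. NFC is only a default here.
    """
    return to_form(a, form) == to_form(b, form)


if __name__ == "__main__":
    precomposed = "caf\u00e9"    # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    decomposed = "cafe\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT

    print(precomposed == decomposed)                 # raw codepoint comparison
    print(same_identifier(precomposed, decomposed))  # comparison after late normalisation
```

In this sketch the two spellings of the same word differ under a raw codepoint comparison but match once both are normalised, while input that is already in the target form only pays for the quick check.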