- From: Jonathan Kew <jonathan@jfkew.plus.com>
- Date: Tue, 10 Feb 2009 18:37:59 +0000
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: Robert J Burns <rob@robburns.com>, public-i18n-core@w3.org, W3C Style List <www-style@w3.org>
On 10 Feb 2009, at 12:44, Henri Sivonen wrote:

> (It seems that the Vietnamese input mode on Mac OS X normalizes to
> NFC, by the way. In fact, I wouldn't be at all surprised if Mac OS X
> already had solution #1 covered and this was just an issue of other
> systems catching up.)

It's true that the Vietnamese keyboard layout Apple ships is designed to generate precomposed accented letters, using a dead-key approach. Text typed using this layout will therefore be in NFC. However, this does not mean that other keyboard layouts that can generate Vietnamese text -- for example, a general-purpose "Latin and diacritics" layout for linguistic/technical use -- will do the same, whether on Mac OS X or other platforms.

As for other scripts and languages, there are plenty of mainstream shipping keyboard layouts that do not necessarily generate normalized text. For example, staying on Mac OS X, I used the OS's Arabic keyboard layout to type the word مُحَبَّتْ into TextEdit.app. First, I typed it in what most users would consider "natural" or "logical" order, <meem damma hah fatha beh shadda fatha teh sukun>. Then I retyped it with the diacritics in canonical order, <meem damma hah fatha beh fatha shadda teh sukun>.

The result is a file where the two "spellings" are preserved, and so a bytewise comparison will find them unequal, even though they look identical (at least with the Unicode-compliant font I'm using) and are defined by Unicode to be canonically equivalent.

JK
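
For anyone who wants to reproduce the comparison, here is a minimal Python sketch. It is an illustration only: the two strings are rebuilt from the character names given above (the code points are my assumption of what those names map to, not a dump of the actual file), and the helper `canonically_equal` is a hypothetical name, not part of any library. It uses the standard `unicodedata` module to normalize both spellings before comparing.

```python
import unicodedata

# Base letters and diacritics named in the message (code points assumed
# from the character names, not read from the original file).
MEEM, HAH, BEH, TEH = "\u0645", "\u062D", "\u0628", "\u062A"
FATHA, DAMMA, SHADDA, SUKUN = "\u064E", "\u064F", "\u0651", "\u0652"

# "Logical" typing order: shadda before fatha on the beh.
logical = MEEM + DAMMA + HAH + FATHA + BEH + SHADDA + FATHA + TEH + SUKUN
# Canonical order: fatha before shadda on the beh.
canonical = MEEM + DAMMA + HAH + FATHA + BEH + FATHA + SHADDA + TEH + SUKUN


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Bytewise comparison of the UTF-8 encodings sees two different spellings.
bytes_equal = logical.encode("utf-8") == canonical.encode("utf-8")

# Comparison after normalization treats them as the same text.
normalized_equal = canonically_equal(logical, canonical)

print("bytewise equal:   ", bytes_equal)
print("canonically equal:", normalized_equal)
```

The reordering happens because normalization sorts adjacent combining marks by their canonical combining class, and fatha has a lower class than shadda. NFD or NFC would serve equally well here, since none of these letter-plus-mark sequences has a precomposed form.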
Received on Tuesday, 10 February 2009 18:39:04 UTC