- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Thu, 04 Dec 2008 13:13:16 +0900
- To: "Phillips, Addison" <addison@amazon.com>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
+1

At 05:06 08/12/04, Phillips, Addison wrote:
>(proposed response follows to be discussed as an agenda+ today)
>
>Addison Phillips
>Globalization Architect -- Lab126
>
>Internationalization is not a feature.
>It is an architecture.
>
>--
>
>Hello Lofton,
>
>Thanks for the note on WebCGM 2.1.
>
>The I18N Core WG generally recommends using Unicode Normalization Form C
>(NFC) for normalization-sensitive operations such as string comparison.
>While this isn't always the right choice, it appears to us that it makes
>the most sense for font name matching for these reasons:
>
> - Most files and font names will probably already use NFC, so the need to
>actually normalize strings will be reduced. (Checking normalization is
>faster and easier than performing it)
> - Any file that uses ISO 8859-1 (Latin-1) as its encoding, for example, is
>already in NFC.
> - NFC is generally considered a non-destructive normalization, unlike the
>compatibility forms NFKC and NFKD.
>
>Please note that case-insensitive comparison is not addressed by Unicode
>normalization.
>
>For specific information on normalization, you can reference both the
>Unicode Standard Annex and the W3C Character Model, Part 2 (Normalization).
>The latter is still a working draft and is being revised currently.
>
>Best Regards (for I18N Core),
>
>Addison
>
>// etc.
>
>
>-----Original Message-----
>From: public-i18n-core-request@w3.org
>[mailto:public-i18n-core-request@w3.org] On Behalf Of Lofton Henderson
>Sent: Wednesday, December 03, 2008 10:58 AM
>To: ishida@w3.org
>Cc: public-webcgm-wg@w3.org; public-i18n-core@w3.org
>Subject: Re: [WebCGM2.1][LC Review] i18n comment 6: Unicode normalization
>
>
>Hello, and thanks for the helpful I18N comments on the WebCGM 2.1 Last Call
>review.
>
>After some research into the details of Comment #6 -- that WebCGM should
>use a Unicode normalization form for font-name-string comparisons -- we see
>the wisdom of it for reliable matching. But lacking deep expertise on the
>topic, we'd welcome further advice.
>
>Question: Do you have a recommendation for which of the four normalization
>forms would be best?
>
>For background, recall that the subject string comparison is seeking a
>match between: on the one hand, a font-name-string as extracted from a
>WebCGM instance; and on the other hand, a font-name-string from the ACL
>file (a separate XML file) that specifies the font-name to be matched.
>
>We would expect Unicode normalization to potentially make a difference in
>those cases wherein the first string (font-name from WebCGM instance) is
>outside the well-defined core set of thirteen (13) fixed names that are
>required by the WebCGM standard. The character encoding in the WebCGM
>instance will be either ISOLatin1, or Unicode UTF8 or UTF16.
>
>If the answer is not simple enough for efficient email resolution, we would
>welcome your participation in our teleconference of Thursday, 04-dec, 11am
>EST. (Or alternately two weeks later if you can't make tomorrow.) Please
>let me know, and I will send telecon logistics.
>
>Thanks,
>-Lofton Henderson
>(Chair WebCGM WG)
>
>
>At 10:29 AM 11/11/2008 +0000, ishida@w3.org wrote:
>
>>Comment from the i18n review of:
>>http://www.w3.org/TR/2008/WD-webcgm21-20080917/WebCGM21-Config.html#ACI-fontmap
>>
>>Comment 6
>>At http://www.w3.org/International/reviews/0811-webcgm/
>>Editorial/substantive: S
>>Tracked by: RI
>>
>>Location in reviewed document:
>>9.3.2.2
>>[http://www.w3.org/TR/2008/WD-webcgm21-20080917/WebCGM21-Config.html#ACI-maplist]
>>
>>Comment:
>>Normalization for string comparison should include conversion to a Unicode
>>normalization form, to eliminate issues related to precomposed vs.
>>decomposed characters and issues related to ordering of multiple combining
>>characters.
>>
>>


#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
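
For illustration only, the NFC-based font name comparison recommended in the quoted response could be sketched in Python roughly as follows. The function names (nfc, font_names_match) and the sample strings are hypothetical and appear in neither WebCGM nor the thread; unicodedata.is_normalized requires Python 3.8 or later. The sketch checks normalization before performing it, as the response suggests, and deliberately leaves case untouched, since normalization does not address case-insensitive comparison.

```python
import unicodedata


def nfc(name: str) -> str:
    """Return name in NFC, skipping the conversion if it is already NFC."""
    # Checking is cheaper than normalizing, and most names are already NFC.
    if unicodedata.is_normalized("NFC", name):
        return name
    return unicodedata.normalize("NFC", name)


def font_names_match(instance_name: str, acl_name: str) -> bool:
    """Compare two font name strings after NFC normalization.

    Case is NOT folded here: "Helvetica" and "helvetica" stay distinct.
    """
    return nfc(instance_name) == nfc(acl_name)


# Hypothetical sample names, chosen only to exercise the comparison.
precomposed = "Universit\u00e9"    # e-acute as one code point, U+00E9
decomposed = "Universite\u0301"    # "e" followed by combining acute, U+0301

# Same two combining marks (dot below, dot above) in different orders.
marks_a = "a\u0323\u0307"
marks_b = "a\u0307\u0323"

# Every character that ISO 8859-1 can encode, decoded to Unicode.
latin1_text = bytes(range(256)).decode("latin-1")

if __name__ == "__main__":
    print(precomposed == decomposed)                      # raw comparison
    print(font_names_match(precomposed, decomposed))      # after NFC
    print(font_names_match(marks_a, marks_b))             # mark ordering
    print(unicodedata.is_normalized("NFC", latin1_text))  # Latin-1 input
    print(font_names_match("Helvetica", "helvetica"))     # case differs
```

A match that also ignores case would need a separate case-folding step on top of this, which is outside what the quoted recommendation covers.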
Received on Thursday, 4 December 2008 08:02:16 UTC