Yes. Adding a few facts about NFC, for those not fully acquainted with it.

Out of the 100K Unicode characters:

- Other than CJK compatibility ideographs, there are (currently) 118
  characters that are always transformed by NFC into other characters.
  - http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[[:nfcqc=no:]-[:name=/CJK%20COMPATIBILITY%20IDEOGRAPH/:]]
  - The CJK COMPATIBILITY IDEOGRAPHs are a larger set, and will grow over
    time.
- There are a further 102 characters that may or may not be transformed:
  - http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:nfcqc=maybe :]
  - Such transformation may be combining with a previous character, or may
    involve reordering. That is, NFC puts non-spacing characters like
    *combining acute* and *combining ring below* into a canonical order.
- While theoretically NFD and NFC are equally appropriate, in practice NFD
  is only used internally (the one significant exception I know of is the
  Apple file system) -- NFC is the form recommended for interchange.

Mark

On Mon, Feb 2, 2009 at 07:53, Phillips, Addison <addison@amazon.com> wrote:

> > Would it be reasonable to also disallow insertion of combining
> > characters via such escapes?
>
> Absolutely not reasonable. Some scripts *require* the use of combining
> marks. NFC does not guarantee that no combining marks appear in the text.
> Applying NFC only means that any combining marks that can be combined with
> their base characters are, in fact, combined.
>
> Addison
>
> Addison Phillips
> Globalization Architect -- Lab126
>
> Internationalization is not a feature.
> It is an architecture.

Received on Monday, 2 February 2009 18:01:01 GMT
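
The NFC behaviours described in the message above (characters that are always replaced, marks that combine with a preceding base, marks that are only reordered, and marks that survive NFC untouched) can be checked with a small sketch. It uses Python's standard `unicodedata` module, which is an illustration chosen here and not something mentioned in the thread; the helper name `nfc` and the sample characters are likewise assumptions picked for the example.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return text in Normalization Form C (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)


# Always transformed: ANGSTROM SIGN (U+212B) becomes
# LATIN CAPITAL LETTER A WITH RING ABOVE (U+00C5).
angstrom = nfc("\u212B")

# A CJK compatibility ideograph (U+F900) becomes its unified ideograph (U+8C48).
cjk_compat = nfc("\uF900")

# Combining with a previous character: e + COMBINING ACUTE ACCENT (U+0301)
# composes to the single character U+00E9.
composed = nfc("e\u0301")

# Reordering only: q has no precomposed forms, so NFC just sorts the marks.
# COMBINING RING BELOW (U+0325, class 220) moves ahead of
# COMBINING ACUTE ACCENT (U+0301, class 230).
reordered = nfc("q\u0301\u0325")

# NFC does not remove combining marks: q + acute has nothing to compose to,
# so the combining mark stays in the normalized text.
uncombined = nfc("q\u0301")

print([f"U+{ord(c):04X}" for c in angstrom])    # ['U+00C5']
print([f"U+{ord(c):04X}" for c in cjk_compat])  # ['U+8C48']
print([f"U+{ord(c):04X}" for c in composed])    # ['U+00E9']
print([f"U+{ord(c):04X}" for c in reordered])   # ['U+0071', 'U+0325', 'U+0301']
print([f"U+{ord(c):04X}" for c in uncombined])  # ['U+0071', 'U+0301']
```

The last case is the point made in the quoted reply: text in NFC can still contain combining marks, so banning them in escapes would break scripts that need them.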