2009/2/6 Robert J Burns <rob@robburns.com>:
> Another singleton example is:
>
> 1) 慈 (U+2F8A6) [non-normalized]
> 2) 慈 (U+6148) [NFC and NFD]
>
> I note the font HiraKakuProN-W3 on my system presents these with slightly
> different glyphs which as i said before should be considered a bug (but like

I disagree here. The whole point of the U+2Fxxx block of "compatibility
ideographs" is to allow one to specify a particular form when the form
actually matters (e.g., when dealing with ancient texts).

I ran into U+2F999 just a week ago. (I had to look through the charts to
pick out the correct character. It had to be contrasted with U+831D, which
is the normalized form, and the content that I had to mark up actually says
something to the effect of "U+831D is probably an erroneous form of
U+2F999…". This would make no sense if the two glyphs showed up the same.)

Therefore the fonts MUST display the two differently; I would consider it a
bug if U+2F999 looked the same as U+831D.

My personal opinion regarding CJK unification is that it's an inconsistent
mess. But that'd be off-topic here.

> input systems, font makers really have not gotten clear norms about this) At
> least in the case of the name of this character ("CJK COMPATIBILITY
> IDEOGRAPH-2F8A6"), the name provides some indication of discouraged use
> (which may be all an author encounters when using a character input system).
> My feeling is that singletons are an ill-conceived part of NFC and NFD
> normalization (closer to compatibility decompositions than canonical
> decompositions), but that the non-singleton parts of normalization are
> essential to proper text handling (and I don't see how Unicode could have
> avoided or could avoid in the future such non-singleton canonical
> normalization).
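
As a rough illustration of the singleton mappings discussed above, here is a
minimal sketch using Python's standard unicodedata module. The helper name
canonical_singleton is made up for this example, and the results depend on
the Unicode data bundled with the interpreter. It checks whether a character
has a canonical decomposition to exactly one other character, and shows what
NFC and NFD do to the two compatibility ideographs mentioned in this thread.

```python
import unicodedata


def canonical_singleton(ch):
    """Return the single character that ch canonically decomposes to, or None.

    Hypothetical helper for illustration only.
    """
    decomp = unicodedata.decomposition(ch)
    # An empty string means no decomposition; a leading "<tag>" marks a
    # compatibility (not canonical) mapping, which is not a singleton here.
    if not decomp or decomp.startswith("<"):
        return None
    parts = decomp.split()
    if len(parts) != 1:
        return None
    return chr(int(parts[0], 16))


for cp in (0x2F8A6, 0x2F999):
    ch = chr(cp)
    nfc = unicodedata.normalize("NFC", ch)
    nfd = unicodedata.normalize("NFD", ch)
    print(f"U+{cp:04X}: NFC -> U+{ord(nfc):04X}, NFD -> U+{ord(nfd):04X}")
```

Because both forms replace the compatibility ideograph, any text that passes
through NFC or NFD loses the distinction between U+2F999 and U+831D, which is
the practical concern when the specific form matters.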
>
> Take care,
> Rob
>
> [1]: <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:NFC_Quick_Check=No:]>
> [2]: <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:NFC_Quick_Check=Maybe:]>

--
cheers,
-ambrose

The 'net used to be run by smart people; now many sites are run by idiots.
So SAD... (Sites that do spam filtering on mails sent to the abuse contact
need to be cut off the net...)

Received on Friday, 6 February 2009 22:41:20 GMT