The CJK compatibility characters are also variants of the corresponding 'ordinary' character in that either character could appear in either form. As a matter of fact, the glyphic shape of the sources (e.g. JIS) has changed over time.

The Unicode Consortium does recognize that particular glyphic shapes are sometimes important, and has developed a much more comprehensive mechanism to deal with it. See http://unicode.org/reports/tr37/

Mark

On Mon, Feb 9, 2009 at 16:16, fantasai <fantasai.lists@inkedblade.net> wrote:

> Martin Duerst wrote:
>
>> I haven't read everything, but if your claim ("overly-aggressive")
>> is true, then early normalization would be better than late matching,
>> because it would allow those producers that, for whatever reason,
>> insist on that there is a difference to simply not do normalization
>> for these codepoints.
>
> The argument is that certain normalization mappings in NFC/NFD
> are more like the types of mappings that happen in NFKC/NFKD than
> like the compose/decompose/ordering mappings. Therefore early
> normalization would cause dataloss in the content, whereas late
> matching at, e.g. the selectors level, would avoid such dataloss
> while still allowing such strings to match.
>
> See Ambrose Li's and Robert Burns's comments:
> http://lists.w3.org/Archives/Public/www-style/2009Feb/0229.html
>
> ~fantasai

Received on Tuesday, 10 February 2009 21:02:14 GMT
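
For readers following the early-normalization versus late-matching argument in the quoted text, here is a minimal Python sketch of the difference. It is an illustration only, not anything proposed in the thread: the helper names `early_normalize` and `late_match` are hypothetical, and U+F900 is simply one CJK compatibility ideograph chosen as an example, since it has a canonical (singleton) mapping to U+8C48.

```python
import unicodedata

# Example pair: U+F900 is a CJK compatibility ideograph whose canonical
# decomposition is the unified ideograph U+8C48.
compat = "\uF900"
unified = "\u8C48"


def early_normalize(text: str) -> str:
    """Hypothetical 'early normalization': rewrite the content itself to NFC."""
    return unicodedata.normalize("NFC", text)


def late_match(a: str, b: str) -> bool:
    """Hypothetical 'late matching': compare NFC forms, leave the content alone."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Early normalization replaces the compatibility code point in the stored
# text, so the original distinction can no longer be recovered from it.
stored = early_normalize(compat)
print(stored == unified)   # the compatibility code point is gone
print(stored == compat)    # the original code point is not preserved

# Late matching still treats the two strings as equivalent, but the
# original string keeps its compatibility code point.
print(late_match(compat, unified))
print(compat == unified)
```

In this sketch the stored text after early normalization is indistinguishable from text that used the unified ideograph from the start, which is the data loss the quoted message describes, while late matching leaves the source string untouched.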