- From: John Daggett <jdaggett@mozilla.com>
- Date: Wed, 14 Dec 2011 20:57:08 -0800 (PST)
- To: MURATA Makoto <eb2m-mrt@asahi-net.or.jp>
- Cc: www-style@w3.org
Makoto Murata wrote:

> If grapheme clusters, word boundaries, and Unicode normalizations are
> incorporated, the result will be very complicated. Note that Unicode
> regular expressions Level 1 (Unicode Technical Standard #18)
> significantly simplifies grapheme clusters and word boundaries.
>
> The smallest generic solution is one-to-one mapping of UCS code
> values. It would be a small subset of your "convert". I think that it
> would be very appropriate as Level 1 of text transformation.

I think this whole issue is a bit of a red herring. Yes, it would be better if the wording explicitly stated what to do in the presence of combining characters. But that's true in other cases too, such as selectors: there's no description of how identifiers that use combining characters are matched. We also had a nice long discussion of normalization as part of Selectors 3. I think the conclusion was that it's not a problem in practice. I think the same is true here.

I should also point out that this is already an issue with the way CSS3 Text defines the text-transform property itself: there's no description of whether normalization should occur in the presence of combining characters or not. My guess is that all user agents today only transform base characters without doing any normalization, such that <base> + <combining> simply becomes T<base> + <combining>.

I think a very simple version of @text-transform is possible to define in the CSS3 Text timeframe. But we won't know unless we try. A simple one-to-one character mapping is the way to go, as Murata-san suggests. It wouldn't be a terrible thing to simply say that, at this level, transforms defined with @text-transform are only defined for the base characters within clusters and that no normalization is assumed or required, just as user agents already do for the other predefined transforms. We can deal with support for more complex situations at a later time, based partly on whether or not there are real issues in practice.
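
To make the proposed behaviour concrete, here is a rough sketch in Python of a one-to-one mapping that touches only base characters and performs no normalization. It is only an illustration under stated assumptions, not how any user agent is implemented: the function name apply_mapping and the sample mapping are hypothetical, and "combining character" is approximated as any code point whose Unicode general category starts with "M".

```python
import unicodedata


def apply_mapping(text, mapping):
    """Apply a one-to-one code point mapping to base characters only.

    Assumption: a "combining character" is any code point whose general
    category is a Mark (Mn, Mc, Me). No normalization is performed.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch).startswith("M"):
            # Combining mark: passed through untouched, even if the
            # mapping happens to contain an entry for it.
            out.append(ch)
        else:
            # Base character: mapped one-to-one, or kept if unmapped.
            out.append(mapping.get(ch, ch))
    return "".join(out)


# Hypothetical mapping, standing in for an author-defined transform.
mapping = {"e": "E", "a": "A"}

decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # the same letter as a single code point

# <base> + <combining> becomes T<base> + <combining>.
assert apply_mapping(decomposed, mapping) == "E\u0301"

# Without normalization the precomposed form has no entry and is unchanged.
assert apply_mapping(precomposed, mapping) == "\u00e9"
```

The second assertion shows the cost of skipping normalization: an author who wants the precomposed form transformed has to list it in the mapping explicitly.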
Regards,

John Daggett
Received on Thursday, 15 December 2011 04:57:37 UTC