- From: Florian Rivoal <florianr@opera.com>
- Date: Wed, 14 Dec 2011 15:27:27 +0100
- To: www-style@w3.org
On Wed, 14 Dec 2011 14:48:28 +0100, MURATA Makoto <eb2m-mrt@asahi-net.or.jp> wrote:

> If grapheme clusters, word boundaries, and Unicode normalizations are
> incorporated, the result will be very complicated.

The idea of using grapheme clusters is to make it magically do the right
thing for authors. The word itself would scare authors away, but the behavior
would simply make authors' intuitive understanding of what a character is
match what the transform considers a character.

As for word boundaries and Unicode normalizations, I am not quite sure how
you can be convinced they would make the feature hard to use before we have
even decided what they are supposed to do and how they should work.

> Note that Unicode
> regular expressions Level 1 (Unicode Technical Standard #18)
> significantly simplifies grapheme clusters and word boundaries.

Thanks for the link, I'll read up on that.

> The smallest generic solution is one-to-one mapping of UCS code values.
> It would be a small subset of your "convert". I think that it would be
> very appropriate as Level 1 of text transformation.

Operating on single Unicode code points isn't a simpler subset of operating
on grapheme clusters, but rather an incompatible variant (a short
illustration follows below). Maybe using grapheme clusters here is wrong and
we should go for single code points instead, but we should not rush into
using one definition of "character" while suspecting we'll eventually want to
use another one, as that would break content built against the earlier
definition.

- Florian
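As a quick sketch of why the two definitions diverge (this is only an illustration, not the proposed CSS feature): assume a hypothetical author-supplied one-to-one mapping "e" → "a" and the decomposed string U+0065 U+0301 ("é"). Applied per code point, the mapping rewrites the base letter; applied per grapheme cluster, it leaves the text alone, because the unit "e" + combining acute never equals plain "e". The Python below relies on the third-party regex module, whose \X pattern matches an extended grapheme cluster.

```python
import regex  # third-party module; supports \X for grapheme clusters

# Hypothetical author-supplied one-to-one mapping.
mapping = {"e": "a"}

text = "e\u0301"  # "é" as LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# Per code point: the base letter "e" matches, so the result is "a" + accent.
per_code_point = "".join(mapping.get(cp, cp) for cp in text)

# Per grapheme cluster: the unit is "e\u0301", which has no mapping,
# so the text is unchanged.
per_cluster = "".join(mapping.get(g, g) for g in regex.findall(r"\X", text))

print(per_code_point)  # "á" (decomposed)
print(per_cluster)     # "é" (unchanged)
```

The two results differ, so content authored against one definition of "character" would not survive a later switch to the other.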
Received on Wednesday, 14 December 2011 14:28:03 UTC