Re: [css3-text] grapheme clusters across element boundary

On 1/16/12 3:06 PM, Glenn Adams wrote:
> (2) script that naively assumes codepoint = character, and inadvertently
> separates surrogate pair elements;

This is most script.

> regarding (2), my position is that the implementation should be
> conservative and not liberal when allowing script to set certain
> property values whose underlying semantics imply a well-formed UTF-16
> string; so, yes, were I implementing this, I would throw an exception
> when a script attempts to set a DOMString typed property to a JS String
> that contains an isolated surrogate codepoint; or at least I would do
> this by default, and only depart from this default in certain
> circumscribed cases;

And my point is that since pretty much every script handles surrogate 
pairs wrong, throwing would just penalize users who try to use non-BMP 
characters with such scripts.  It would particularly penalize users whose 
languages are written with non-BMP characters.
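
To make that concrete, here is a minimal sketch of the interaction.  The 
names hasLoneSurrogate, strictSet and splitNaively are hypothetical and 
exist only for illustration; strictSet stands in for a DOMString-typed 
property setter that throws as proposed, and is not any real DOM API.

```typescript
// Hypothetical check: true if `s` contains a surrogate code unit
// that is not part of a well-formed high+low pair.
function hasLoneSurrogate(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    const u = s.charCodeAt(i);
    if (u >= 0xd800 && u <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of a well-formed pair
      } else {
        return true;
      }
    } else if (u >= 0xdc00 && u <= 0xdfff) {
      return true; // low surrogate with no preceding high surrogate
    }
  }
  return false;
}

// Hypothetical strict setter (assumption: models the "throw by default"
// behavior under discussion, not an existing API).
function strictSet(value: string): string {
  if (hasLoneSurrogate(value)) {
    throw new Error("isolated surrogate");
  }
  return value;
}

// What typical script does: one piece per code unit, not per character.
function splitNaively(s: string): string[] {
  const out: string[] = [];
  for (let i = 0; i < s.length; i++) {
    out.push(s.charAt(i));
  }
  return out;
}
```

With BMP-only input such as "abc", every piece from splitNaively passes 
strictSet.  With "\uD801\uDC00" (U+10400), the same script produces two 
pieces and strictSet throws on both, so the script only breaks for the 
users who type non-BMP text.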

Maybe you think it's OK to screw such users over.  I don't.  Especially 
in situations in which the "correct" rendering is obvious (e.g. every 
single codepoint wrapped in its own span, but all have the same style: 
you just render the text as a single text string with that style).
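
A rough model of that case, as a sketch only: Span, wrapEachCodeUnit and 
mergeRuns are hypothetical names, and real layout code is of course more 
involved than merging adjacent same-style runs.

```typescript
// Hypothetical model of a styled span: its text and a style key.
interface Span {
  text: string;
  style: string;
}

// What the naive script produces: one span per code unit.
function wrapEachCodeUnit(s: string, style: string): Span[] {
  const spans: Span[] = [];
  for (let i = 0; i < s.length; i++) {
    spans.push({ text: s.charAt(i), style });
  }
  return spans;
}

// Merge adjacent spans that share a style into a single text run.
// The two halves of a split surrogate pair end up adjacent in one run.
function mergeRuns(spans: Span[]): Span[] {
  const runs: Span[] = [];
  for (const span of spans) {
    const last = runs[runs.length - 1];
    if (last !== undefined && last.style === span.style) {
      last.text += span.text;
    } else {
      runs.push({ text: span.text, style: span.style });
    }
  }
  return runs;
}
```

In this model, mergeRuns(wrapEachCodeUnit(s, style)) yields a single run 
whose text is the original string s, surrogate pairs intact, which is the 
string one would render.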

> this is my answer to your question "what the best way to limit damage
> from the lack of understanding on script authors' part is"

I think you and I have different definitions of "damage" here.

-Boris

Received on Monday, 16 January 2012 20:39:57 UTC