On 1/16/12 3:06 PM, Glenn Adams wrote:
> (2) script that naively assumes codepoint = character, and inadvertently
> separates surrogate pair elements;

This is most script.

> regarding (2), my position is that the implementation should be
> conservative and not liberal when allowing script to set certain
> property values whose underlying semantics imply a well-formed UTF-16
> string; so, yes, were I implementing this, I would throw an exception
> when a script attempts to set a DOMString typed property to a JS String
> that contains an isolated surrogate codepoint; or at least I would do
> this by default, and only depart from this default in certain
> circumscribed cases;

And my point is that since pretty much every script handles surrogate pairs wrong, throwing would just penalize users who try to use non-BMP characters with such scripts. It would particularly penalize users whose languages are written with non-BMP characters.

Maybe you think it's OK to screw such users over. I don't. Especially in situations in which the "correct" rendering is obvious (e.g. every single codepoint wrapped in its own span, but all have the same style: you just render the text as a single text string with that style).

> this is my answer to your question "what the best way to limit damage
> from the lack of understanding on script authors' part is"

I think you and I have different definitions of "damage" here.

-Boris

Received on Monday, 16 January 2012 20:39:57 UTC
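
For illustration, the kind of script under discussion might look roughly like the sketch below. The function names (splitNaive, splitByCodePoint, hasLoneSurrogate) are hypothetical and do not come from any spec or from this thread, and the code assumes an engine with ES2015 string iteration. It only shows how a naive per-code-unit split produces isolated surrogates, and that concatenating the pieces gives back the original well-formed string.

```javascript
// Hypothetical helpers, for illustration only.

// Naive split: treats every UTF-16 code unit as a "character", so a
// non-BMP character ends up as two isolated surrogates.
function splitNaive(text) {
  return text.split("");
}

// Code-point-aware split: the string iterator used by Array.from keeps
// well-formed surrogate pairs together.
function splitByCodePoint(text) {
  return Array.from(text);
}

// Returns true if s contains a surrogate code unit that is not part of
// a well-formed high/low pair.
function hasLoneSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      const next = s.charCodeAt(i + 1); // NaN at the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of a well-formed pair
        continue;
      }
      return true; // high surrogate without a following low surrogate
    }
    if (c >= 0xdc00 && c <= 0xdfff) {
      return true; // low surrogate without a preceding high surrogate
    }
  }
  return false;
}

// "a", U+1D11E MUSICAL SYMBOL G CLEF (non-BMP), "b"
const sample = "a\u{1D11E}b";
const naivePieces = splitNaive(sample);
const safePieces = splitByCodePoint(sample);
```

With this sample, the naive split yields four pieces, two of which are isolated surrogates, while the code-point-aware split yields three. If each naive piece were assigned on its own to a DOMString property, an implementation that throws on isolated surrogates would reject two of those assignments, even though naivePieces.join("") is exactly the original string.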