- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Mon, 16 Jan 2012 15:39:28 -0500
- To: Glenn Adams <glenn@skynav.com>
- CC: "Kang-Hao (Kenny) Lu" <kennyluck@csail.mit.edu>, WWW Style <www-style@w3.org>
On 1/16/12 3:06 PM, Glenn Adams wrote:
> (2) script that naively assumes codepoint = character, and inadvertently
> separates surrogate pair elements;

This is most script.

> regarding (2), my position is that the implementation should be
> conservative and not liberal when allowing script to set certain
> property values whose underlying semantics imply a well-formed UTF-16
> string; so, yes, were I implementing this, I would throw an exception
> when a script attempts to set a DOMString typed property to a JS String
> that contains an isolated surrogate codepoint; or at least I would do
> this by default, and only depart from this default in certain
> circumscribed cases;

And my point is that since pretty much every script handles surrogate pairs wrong, throwing would just penalize users who try to use non-BMP characters with such scripts. It would particularly penalize users whose languages are written with non-BMP characters.

Maybe you think it's OK to screw such users over. I don't. Especially in situations in which the "correct" rendering is obvious (e.g. every single codepoint wrapped in its own span, but all have the same style: you just render the text as a single text string with that style).

> this is my answer to your question "what the best way to limit damage
> from the lack of understanding on script authors' part is"

I think you and I have different definitions of "damage" here.

-Boris
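
To make the failure mode under discussion concrete, here is a minimal sketch in modern JavaScript. It was not part of the original message, and it uses syntax that postdates it. The helper names (`splitByCodeUnit`, `splitByCodePoint`, `hasLoneSurrogate`) are hypothetical and not taken from any library or specification. It shows how a naive per-"character" split leaves isolated surrogates, and that concatenating the pieces in order still gives back the original well-formed text.

```javascript
// Hypothetical helpers for illustration only.

// Naive split: works on UTF-16 code units, so a non-BMP character
// is cut into two isolated surrogates.
function splitByCodeUnit(text) {
  return text.split("");
}

// Code-point-aware split: the string iterator keeps surrogate pairs together.
function splitByCodePoint(text) {
  return Array.from(text);
}

// Returns true if the string contains a surrogate that is not part of a pair.
function hasLoneSurrogate(text) {
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = text.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of a valid pair
        continue;
      }
      return true; // high surrogate with no low surrogate after it
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) {
      return true; // low surrogate with no high surrogate before it
    }
  }
  return false;
}

const sample = "a\u{10400}b"; // one non-BMP code point between two ASCII letters
const naive = splitByCodeUnit(sample);
const aware = splitByCodePoint(sample);
const rejoined = naive.join("");
```

Each element of `naive` stands in for the text of one span. Two of those elements are lone surrogates on their own, but `rejoined` is identical to `sample`, which is the case where the intended rendering is obvious.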
Received on Monday, 16 January 2012 20:39:57 UTC