- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Fri, 13 Sep 2013 10:01:52 +0100
- To: Jonathan Kew <jfkthame@googlemail.com>
- Cc: John Daggett <jdaggett@mozilla.com>, Addison Phillips <addison@lab126.com>, Richard Ishida <ishida@w3.org>, W3C Style <www-style@w3.org>, www International <www-international@w3.org>
On Fri, Sep 13, 2013 at 9:46 AM, Jonathan Kew <jfkthame@googlemail.com> wrote:
> Given that it's possible for script to insert lone surrogates into the DOM,
> I think we have to "render" them in some way - though simply rendering a
> hexbox, a "broken character" graphic, or perhaps U+FFFD, would be
> sufficient; there's no need to even attempt font matching as though they
> were actual characters.

So Gecko doesn't do U+FFFD, but I would prefer it if we did. In particular, if the rendering subsystem would only operate on Unicode scalar values and treat lone surrogates as errors, I think that'd be an improvement.

> Unicode-range, OTOH, is expressing a range of Unicode scalar values for
> which the font should be considered in the font matching process. A font
> whose unicode-range is U+0-FFFF, for example, covers the lone-surrogate
> values, but as font selection operates on characters (and clusters), not on
> code units, they'd never actually be used.

It seems weird to say it expresses a range of Unicode scalar values and then include U+D800 to U+DFFF in that range. And let's not use "characters" as that's a confusing term. Saying that the range is in code points but U+D800 to U+DFFF are ignored (rather than treated as an error) could make sense.

--
http://annevankesteren.nl/
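For concreteness, a minimal TypeScript sketch of the two facts the thread turns on (not part of the original message; the element and helper names are illustrative): script can place an unpaired surrogate into the DOM, and a surrogate code point is never a Unicode scalar value, so matching that operates on scalar values can never reach U+D800 to U+DFFF even if a unicode-range nominally covers them.

```ts
// JavaScript strings are sequences of UTF-16 code units, so "\uD800" is a
// valid string value even though U+D800 is not a Unicode scalar value.
const span = document.createElement("span");
span.textContent = "\uD800"; // a lone high surrogate ends up in the DOM
document.body.appendChild(span); // the renderer must now display *something*

// Illustrative helper: a code point is a Unicode scalar value iff it lies
// in U+0000..U+10FFFF and is not a surrogate code point (U+D800..U+DFFF).
function isScalarValue(codePoint: number): boolean {
  return (
    codePoint >= 0x0000 &&
    codePoint <= 0x10ffff &&
    !(codePoint >= 0xd800 && codePoint <= 0xdfff)
  );
}

// Even if a font declares unicode-range: U+0-FFFF, the surrogate portion
// of that range is unreachable by matching that filters on scalar values.
console.log(isScalarValue(0xd800)); // false
console.log(isScalarValue(0x0041)); // true (U+0041, "A")
```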
Received on Friday, 13 September 2013 09:02:22 UTC