- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Tue, 17 Sep 2013 08:20:34 -0400
- To: John Daggett <jdaggett@mozilla.com>
- Cc: Addison Phillips <addison@lab126.com>, Richard Ishida <ishida@w3.org>, W3C Style <www-style@w3.org>, www International <www-international@w3.org>
On Mon, Sep 16, 2013 at 9:55 PM, John Daggett <jdaggett@mozilla.com> wrote:
> In particular, I think Anne's point about surrogate handling [1] is
> completely orthogonal to the behavior of unicode-range:
>
>> It seems weird to say it expresses a range of Unicode scalar values
>> and then include U+D800 to U+DFFF in that range. And let's not use
>> "characters" as that's a confusing term. Saying that the range is in
>> code points but U+D800 to U+DFFF are ignored (rather than treated as
>> an error) could make sense.
>
> Non-Unicode encoding and surrogate handling issues are dealt with in
> levels above the level where font matching occurs. If you look
> carefully at the description of font matching, the range of codepoints
> defined by the 'unicode-range' descriptor is intersected with the
> underlying character map of the font. *That* is what defines the
> exact set of codepoints that are matched as part of the font matching
> algorithm. Given that no font ever includes mappings for surrogate
> codepoints to glyphs and no layout engine ever treats lone surrogates
> as individual codepoints, I don't see the need to adjust the
> definition of 'unicode-range'. Invalid codepoints like this will
> naturally be ignored given the existing definition of font matching.

My point may be orthogonal, or not, but your terminology confusion is
not helping. A surrogate code point is not an invalid code point, it's
perfectly valid. It's just not a Unicode scalar value and not part of
the value space of utf-8 or utf-16.

I'm fine with describing unicode-range in terms of code points and
matching against a font's code points, given that nobody really seems
to know what the value space of the latter is. (We should define that
though, some day. What a font on the platform actually is at an
abstract level. And how the various formats map to it, how CSS uses
it, etc.)

> [1] http://lists.w3.org/Archives/Public/www-style/2013Sep/0318.html

--
http://annevankesteren.nl/
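For readers following the terminology, here is a rough Python sketch of the distinction drawn above between code points and Unicode scalar values, and of the intersection step John describes. All function names and the sample character map are hypothetical, made up for illustration; this is not how any CSS engine or font library actually represents these things.

```python
# Illustrative sketch only. Names here are hypothetical and do not come
# from CSS, any font format, or any real layout engine.

MAX_CODE_POINT = 0x10FFFF
SURROGATES = range(0xD800, 0xE000)  # U+D800..U+DFFF inclusive


def is_code_point(cp: int) -> bool:
    """Any integer in U+0000..U+10FFFF is a code point, surrogates included."""
    return 0 <= cp <= MAX_CODE_POINT


def is_scalar_value(cp: int) -> bool:
    """A Unicode scalar value is a code point that is not a surrogate."""
    return is_code_point(cp) and cp not in SURROGATES


def can_encode(cp: int, encoding: str) -> bool:
    """Check whether a lone code point is in the value space of an encoding."""
    try:
        chr(cp).encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def matched_code_points(unicode_range: range, font_cmap: set[int]) -> set[int]:
    """Intersect a unicode-range (in code points) with a font's character map.

    Assumption: the character map is modelled as a plain set of code points.
    """
    return {cp for cp in font_cmap if cp in unicode_range}


if __name__ == "__main__":
    # U+D800 is a valid code point, but not a scalar value.
    print(is_code_point(0xD800), is_scalar_value(0xD800))

    # A range that spans the surrogate block, and a made-up character map
    # that (like real fonts, per the quoted text) maps no surrogates.
    wide_range = range(0xD000, 0xE100)
    sample_cmap = {0x0041, 0xD7A3, 0xE000}
    print(sorted(hex(cp) for cp in matched_code_points(wide_range, sample_cmap)))
```

Under these assumptions the surrogate code points fall out of the matched set simply because the sample character map has no entries for them, which is the "naturally ignored" behaviour in the quoted text. The range itself is still expressed in code points, not scalar values.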
Received on Tuesday, 17 September 2013 12:21:09 UTC