Re: [css-fonts-3] i18n-ISSUE-296: Usable characters in unicode-range from Jonathan Kew on 2013-09-13 (www-international@w3.org from July to September 2013)

From: Jonathan Kew <jfkthame@googlemail.com>
Date: Fri, 13 Sep 2013 09:46:58 +0100
To: Anne van Kesteren <annevk@annevk.nl>
CC: John Daggett <jdaggett@mozilla.com>, Addison Phillips <addison@lab126.com>, Richard Ishida <ishida@w3.org>, W3C Style <www-style@w3.org>, www International <www-international@w3.org>
Message-ID: <5232D102.2000107@gmail.com>

On 13/9/13 09:05, Anne van Kesteren wrote:
> On Fri, Sep 13, 2013 at 5:46 AM, John Daggett <jdaggett@mozilla.com> wrote:
>> Hmmm.  "Valid Unicode codepoint" seems fine to me, it's talking about the
>> codepoint, not whether there's a character represented by that or not.
>> But I'm not going to quibble, I've updated the spec to remove the term.
>
> Well, the difference matters. Can we render any code point, or do we
> only render Unicode scalar values (code points minus lone surrogates)?
> I'm kinda hoping the latter, but I'm pretty sure in Gecko at least
> it's the former. Whether unicode-range should support lone surrogates
> might be separate from that I suppose.

Given that it's possible for script to insert lone surrogates into the 
DOM, I think we have to "render" them in some way - though simply 
rendering a hexbox, a "broken character" graphic, or perhaps U+FFFD, 
would be sufficient; there's no need to even attempt font matching as 
though they were actual characters.
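
As a rough illustration of the fallback described above (not how Gecko or
any other engine actually does it), here is a minimal Python sketch. It
assumes the text arrives as a list of UTF-16 code units, as DOM strings
are defined, and that U+FFFD is the chosen substitute; the function names
are hypothetical.

```python
REPLACEMENT = 0xFFFD  # assumed substitute; a hexbox would work equally well


def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    return 0xDC00 <= unit <= 0xDFFF


def decode_utf16_units(units):
    """Convert UTF-16 code units to code points.

    Well-formed surrogate pairs are combined into one supplementary
    code point; any lone surrogate is replaced with U+FFFD.
    """
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        if (is_high_surrogate(unit) and i + 1 < len(units)
                and is_low_surrogate(units[i + 1])):
            # Valid pair: combine into a single code point above U+FFFF.
            out.append(0x10000 + ((unit - 0xD800) << 10)
                       + (units[i + 1] - 0xDC00))
            i += 2
        elif is_high_surrogate(unit) or is_low_surrogate(unit):
            # Lone surrogate: no font matching, just a placeholder.
            out.append(REPLACEMENT)
            i += 1
        else:
            out.append(unit)
            i += 1
    return out
```

With this, a lone U+D800 between "A" and "B" comes out as U+FFFD, while a
proper pair such as D83D DE00 becomes the single code point U+1F600.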

Unicode-range, OTOH, expresses a range of Unicode scalar values for
which the font should be considered in the font matching process. A font
whose unicode-range is U+0-FFFF, for example, nominally covers the
surrogate code points (U+D800-DFFF), but as font selection operates on
characters (and clusters), not on code units, those values would never
actually be used.
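
To make that distinction concrete, here is a small Python sketch under
the assumption that a unicode-range is modelled as a (low, high) pair of
code points and that matching only ever sees scalar values. Both helper
names are hypothetical, and this is not the matching algorithm from the
spec, only the range check.

```python
def is_scalar_value(cp):
    """True for any code point except the surrogates U+D800-DFFF."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def font_considered(cp, unicode_range):
    """Would a font with this unicode-range be considered for cp?

    unicode_range is an inclusive (low, high) pair, e.g. (0x0, 0xFFFF)
    for U+0-FFFF. Surrogates are rejected up front, so it makes no
    difference that the range nominally spans them.
    """
    low, high = unicode_range
    return is_scalar_value(cp) and low <= cp <= high
```

For the range U+0-FFFF, font_considered(0x41, (0x0, 0xFFFF)) is true,
while font_considered(0xD800, (0x0, 0xFFFF)) is false even though U+D800
lies numerically inside the range.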

JK

Received on Friday, 13 September 2013 08:47:28 UTC