Re: [css-fonts-3] i18n-ISSUE-295: U+ in unicode-range descriptor from John C Klensin on 2013-09-13 (www-international@w3.org from July to September 2013)

From: John C Klensin <john+w3c@jck.com>
Date: Fri, 13 Sep 2013 06:47:13 -0400
To: John Daggett <jdaggett@mozilla.com>, Richard Ishida <ishida@w3.org>
cc: W3C Style <www-style@w3.org>, www International <www-international@w3.org>
Message-ID: <31A86E347FD35878E4C79581@JcK-HP8200.jck.com>

--On Thursday, September 12, 2013 18:45 -0700 John Daggett
<jdaggett@mozilla.com> wrote:

> 
> Richard Ishida wrote:
> 
>> 4.5. Character range: the unicode-range descriptor
>> http://www.w3.org/TR/2013/WD-css-fonts-3-20130711/#unicode-range-desc
>> 
>> 'Each <urange> value is a UNICODE-RANGE token made up of a
>> "U+" or "u+" prefix followed by a codepoint range'. The U+
>> is not always needed before every codepoint value (eg. in a
>> range).
>> 
>> Why do we need the U+/u+ ?  It would be easier to just use
>> bare hex codepoints, especially for ranges, where U+ is only
>> used at the start anyway.
> 
> As Tab has already pointed out, the unicode range syntax was
> part of CSS 2.1 syntax and the descriptor itself is already
> supported by multiple implementations so it's not appropriate
> to make a change like this at this point.

After thinking about this a bit more, there is another reason.
U+[N[N]]NNNN rather clearly identifies a Unicode code point --
independent of the particular encoding/representation -- in
general practice.  By contrast, "0x...." and its syntactic
equivalents take us back to the question of whether the value
is a Unicode code point or, e.g., UTF-16 or hexified UTF-8.  So
there is also a slight argument for U+.... on grounds of clarity
and precision.
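
To make the ambiguity concrete, here is a minimal Python sketch. The
helper name `describe` is hypothetical and is not part of CSS or of any
library; it only uses Python's built-in codecs. It takes a U+ label and
shows the hex a reader might see for the same character in UTF-16 and
UTF-8, which is what a bare hex value could be mistaken for.

```python
# Illustrative sketch only: "describe" is a made-up helper, not a CSS or
# library API. It shows that one code point has several hex spellings
# once an encoding form is involved.

def describe(label: str) -> dict:
    """Parse a 'U+XXXX' label and return its encoded forms as hex strings."""
    if not label.upper().startswith("U+"):
        raise ValueError("expected a 'U+' or 'u+' prefix")
    code_point = int(label[2:], 16)   # the encoding-independent value
    ch = chr(code_point)
    return {
        "code_point": f"U+{code_point:04X}",
        "utf16_be": ch.encode("utf-16-be").hex().upper(),  # code units
        "utf8": ch.encode("utf-8").hex().upper(),          # bytes
    }


if __name__ == "__main__":
    for label in ("U+00E9", "u+1F600"):
        print(describe(label))
```

For U+00E9 the UTF-16 form happens to look like the code point (00E9)
while the UTF-8 form is C3A9; for U+1F600 neither encoded form
(D83DDE00, F09F9880) resembles the code point, so a bare hex string
would not say which of the three was meant.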

    john

Received on Friday, 13 September 2013 10:47:41 UTC