Re: [css3-fonts] Error handling of unicode-range from John Daggett on 2013-05-23 (www-style@w3.org from May 2013)

From: John Daggett <jdaggett@mozilla.com>
Date: Wed, 22 May 2013 20:41:33 -0700 (PDT)
To: www-style <www-style@w3.org>
Message-ID: <211449704.17297296.1369280493906.JavaMail.root@mozilla.com>

Simon Sapin wrote:

> I like the general direction of today’s edits on unicode-range, but I’m 
> still a bit confused by this paragraph:
> 
> > For interval ranges, the start and end codepoints must be valid
> > Unicode values and the end codepoint must be greater than or equal
> > to the start codepoint. Wildcard ranges specified with ‘?’ that
> > lack an initial digit (e.g. "U+???") are valid and treated as if
> > there was a single 0 before the question marks (thus, "U+???" =
> > "U+0???" = "U+0000-0FFF"). "U+??????" is not a syntax error, even
> > though "U+0??????" would be. Wildcard ranges that extend beyond
> > the end of valid codepoint values are clipped to the range of
> > valid codepoint values. Ranges that do not conform to these
> > restrictions are considered parse errors and the descriptor is
> > omitted.
> 
> In particular, it’s not clear what exactly is the error handling in 
> various cases. As I understand it, there are two possible ways to handle 
> some of the "bad" ranges, and "omitted" could mean either:
> 
> a. Drop the whole declaration. Other specs often say "invalid" for this, 
> sometimes referencing one of these:
> 
> http://www.w3.org/TR/CSS21/syndata.html#illegalvalues
> http://www.w3.org/TR/CSS21/conform.html#ignore
> 
> b. Consider that a given unicode-range token represents an empty range. 
> The overall value of the descriptor being the union of all ranges, the 
> empty range is neutral.
> 
> I think that changing the terminology to "invalid declaration" and 
> "empty range" would help.

I revised this again.  After thinking about it a bit more, I've simply
made it so that any range that is not equivalent to a single codepoint
or an ascending range of valid Unicode codepoints is not valid and the
entire declaration is ignored. By doing this we avoid having to worry
about how to serialize declarations that produce empty ranges of valid
Unicode codepoints (e.g. 'unicode-range: u+11????', which covers only
U+110000-11FFFF, all above the U+10FFFF maximum).

So a 'unicode-range' descriptor declaration will now define either a
non-empty set of valid Unicode codepoints or it will be invalid.
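
As a rough illustration of that rule, here is a minimal Python sketch.
It is not spec text or any browser's code. The names parse_token and
parse_unicode_range are made up for the example. It assumes that
wildcard ranges are still clipped to U+10FFFF, as in the paragraph
quoted above, and that a token left empty after clipping is what makes
the declaration invalid; the revised wording may be read differently.

```python
import re

MAX_CODEPOINT = 0x10FFFF  # highest valid Unicode codepoint

# The three token forms: single codepoint, interval, wildcard.
_SINGLE = re.compile(r"^[uU]\+([0-9a-fA-F]{1,6})$")
_INTERVAL = re.compile(r"^[uU]\+([0-9a-fA-F]{1,6})-([0-9a-fA-F]{1,6})$")
_WILDCARD = re.compile(r"^[uU]\+([0-9a-fA-F]{0,5})(\?{1,6})$")


def parse_token(token):
    """Return (start, end) for one range token, or None if it is not valid."""
    m = _SINGLE.match(token)
    if m:
        cp = int(m.group(1), 16)
        return (cp, cp) if cp <= MAX_CODEPOINT else None

    m = _INTERVAL.match(token)
    if m:
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        # Both ends must be valid and the interval must be ascending.
        return (start, end) if start <= end <= MAX_CODEPOINT else None

    m = _WILDCARD.match(token)
    if m:
        prefix, marks = m.group(1), m.group(2)
        if len(prefix) + len(marks) > 6:
            return None  # e.g. "U+0??????" has too many positions
        start = int(prefix + "0" * len(marks), 16)
        # Assumption: wildcard ranges are clipped to the valid codepoint range.
        end = min(int(prefix + "F" * len(marks), 16), MAX_CODEPOINT)
        # Assumption: a range that is empty after clipping is not valid.
        return (start, end) if start <= end else None

    return None


def parse_unicode_range(value):
    """Return a list of (start, end) ranges, or None if the declaration is invalid."""
    ranges = []
    for token in value.split(","):
        parsed = parse_token(token.strip())
        if parsed is None:
            return None  # one bad token invalidates the whole declaration
        ranges.append(parsed)
    return ranges
```

Under these assumptions, parse_unicode_range("U+0025-00FF, u+4??")
yields two ranges, while "u+11????" and "U+00FF-0025" both make the
whole declaration invalid rather than contributing an empty range.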

Regards,

John Daggett

Received on Thursday, 23 May 2013 03:42:01 UTC