- From: Bert Bos <bert@w3.org>
- Date: Thu, 6 Nov 2008 11:12:14 +0100
- To: www-style@w3.org
On Thursday 25 September 2008 01:24, Zack Weinberg wrote:
> There are a number of ambiguities in the specification of
> unicode-range: descriptors and UNICODE-RANGE tokens. Most are
> relevant only to css3-fonts, but two relate to the core syntax and
> are therefore relevant to css2.1 as well.
>
> The regular expression defining UNICODE-RANGE in CSS2.1 is
>
> U\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?
>
> Core syntax issue 1 (editorial, one hopes): The initial U is in upper
> case. All other core lexical productions are written entirely in
> lower case. 4.1.3 bullet point 1 assures us that CSS is entirely
> case-insensitive; I am assuming this is not a (unique) exception to
> that rule. For consistency, the U should be changed to lower case.
> If it *is* meant to be an exception, there should be explicit wording
> in both css-2.1 and css3-fonts that says so.
The U is uppercase only because that is how it is usually written, e.g.,
U+0048 instead of u+0048; not because the lowercase is invalid. If that
causes confusion, I'm happy to change the "U" to a "u" in the grammar.
It is indeed purely editorial.
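To illustrate the point, here is a minimal sketch in Python. It assumes
the CSS 2.1 pattern quoted above is applied case-insensitively; the helper
name is_unicode_range_token is made up for this example and does not come
from any spec.

```python
import re

# The core pattern quoted above, written with a lowercase "u" and compiled
# case-insensitively (an assumption mirroring the general CSS rule).
UNICODE_RANGE = re.compile(r"u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?", re.IGNORECASE)


def is_unicode_range_token(text: str) -> bool:
    """Return True if the whole string matches the core lexical pattern."""
    return UNICODE_RANGE.fullmatch(text) is not None
```

Under that assumption, "U+0048" and "u+0048" are accepted alike, so the
case of the letter in the grammar makes no difference to what is matched.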
>
> Possible core syntax issue 2: This regular expression will match
> two classes of token which do not conform to any of the three
> basic forms called out in the current ED of css3-fonts:
>
> U+1?10     question marks are not (all) trailing
> U+A?-BF    both trailing question marks and a second endpoint
>
> I believe it is not possible to exclude all tokens in these classes,
> and still express all the existing constraints on UNICODE-RANGE
> tokens, using only Lex-style regular expression productions; in
> particular, it is not simultaneously possible to limit the first
> number to no more than 6 characters and specify that all question
> marks must trail.
>
> So I recommend that the core syntax be left alone here. Instead,
> css3-fonts should say that any UNICODE-RANGE token that does not fit
> one of the three basic forms triggers a parse error (thus, the entire
> descriptor is discarded).
>
> [Aside: css3-fonts is almost entirely lacking in formal grammar
> rules. It would be nice if they got added.]
It didn't seem worth trying to write a pattern that matches only
those UNICODE-RANGE tokens that make sense. It may be possible, but the
pattern would certainly be quite unreadable. So it was left to the text
to explain that certain UNICODE-RANGE tokens are meaningless. That text
was then left out of CSS 2.1, because UNICODE-RANGE is not used there.
How to handle those well-formed but meaningless tokens will indeed have
to be explained in css3-fonts.
So I agree: there is something to do for css3-fonts[1], but nothing for
CSS 2.1.
[1] http://dev.w3.org/csswg/css3-fonts/
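A rough sketch of that two-stage approach, in Python: the permissive core
pattern decides what is a token, and a separate check decides whether the
token is meaningful. This message does not list the three basic forms; the
sketch assumes they are a single code point, an interval, and a range with
trailing question marks. The function name classify and the labels it
returns are invented for illustration.

```python
import re

FLAGS = re.IGNORECASE
# Stage 1: the permissive core pattern from CSS 2.1.
CORE = re.compile(r"u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?", FLAGS)
# Stage 2: the assumed three basic forms.
SINGLE = re.compile(r"u\+[0-9a-f]{1,6}", FLAGS)
INTERVAL = re.compile(r"u\+[0-9a-f]{1,6}-[0-9a-f]{1,6}", FLAGS)
# Length is not limited here; CORE has already capped it at six characters.
WILDCARD = re.compile(r"u\+[0-9a-f]*\?+", FLAGS)


def classify(token: str) -> str:
    """Label a string as one of the basic forms, or say why it is not."""
    if CORE.fullmatch(token) is None:
        return "not a token"
    if SINGLE.fullmatch(token):
        return "single"
    if INTERVAL.fullmatch(token):
        return "interval"
    if WILDCARD.fullmatch(token):
        return "wildcard"
    # Well-formed according to the core syntax, but none of the three forms.
    return "meaningless"
```

With these definitions both U+1?10 and U+A?-BF come out as "meaningless":
they are tokens, but fit none of the assumed forms. What a user agent
should then do with them is the part still to be specified.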
> ----
[description of different cases omitted]
Makes sense. I'll leave it to the editors of the fonts module to
suggest some text.
> ----
>
> There is also a question of what text is produced by a CSSOM query
> for the value of an arbitrary unicode-range: descriptor. I recommend
> that implementations be allowed, but not required, to produce a
> simplified representation of the range instead of the original text.
> Continuing with the example of
>
> unicode-range: U+00??, U+0080-01FF;
>
> an implementation should be allowed to produce (at least) any of
> these:
>
> U+00??, U+0080-01FF;        // exactly the original text
> U+0000-00FF, U+0080-01FF;   // question marks expanded to pairs
> U+00??, U+01??;             // normalized to question mark form
> U+0000-00FF, U+0100-01FF;   // normalized to pair form
> U+0000-01FF;                // optimized
>
> I don't think the spec needs to enumerate possibilities; just mention
> that implementations have license in this area.
>
> I would be happy to come up with wording for any or all of the above
> changes.
I have no preference. There is a section on normalization in the CSSOM
and such a text could probably be added there. See
http://dev.w3.org/csswg/cssom/#parsing
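For concreteness, a small sketch in Python of how the "pairs" and
"optimized" forms could be derived. It is only an illustration, not
proposed CSSOM behaviour: all function names are invented, and the input
is assumed to contain only meaningful ranges.

```python
def parse_range(token):
    """Turn one meaningful range into an inclusive (low, high) pair."""
    body = token[2:]  # drop the "U+" prefix
    if "-" in body:
        low, high = body.split("-")
        return int(low, 16), int(high, 16)
    # "?" stands for any hex digit: 0 gives the low end, F the high end.
    return int(body.replace("?", "0"), 16), int(body.replace("?", "F"), 16)


def merge(ranges):
    """Combine overlapping or adjacent (low, high) pairs."""
    merged = []
    for low, high in sorted(ranges):
        if merged and low <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged


def serialize(ranges):
    """Write pairs back out, using the short form for single code points."""
    parts = []
    for low, high in ranges:
        parts.append(f"U+{low:04X}" if low == high else f"U+{low:04X}-{high:04X}")
    return ", ".join(parts)


def to_pairs(value):
    """Expand question marks to explicit pairs, keeping the original order."""
    return serialize([parse_range(t.strip()) for t in value.split(",")])


def optimize(value):
    """Expand question marks, then merge into the fewest possible ranges."""
    return serialize(merge(parse_range(t.strip()) for t in value.split(",")))
```

Applied to "U+00??, U+0080-01FF", to_pairs gives
"U+0000-00FF, U+0080-01FF" and optimize gives "U+0000-01FF", two of the
representations listed in the quoted text.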
Bert
--
Bert Bos ( W 3 C ) http://www.w3.org/
http://www.w3.org/people/bos W3C/ERCIM
bert@w3.org 2004 Rt des Lucioles / BP 93
+33 (0)4 92 38 76 92 06902 Sophia Antipolis Cedex, France
Received on Thursday, 6 November 2008 10:12:55 UTC