Re: [css21] ident production does not match <identifier>

On Monday 15 February 2010 18:16:09 Anne van Kesteren wrote:
> On Mon, 15 Feb 2010 18:08:35 +0100, Jonathan Kew
> <jonathan@jfkew.plus.com> wrote:
> > On 15 Feb 2010, at 16:24, Anne van Kesteren wrote:
> >> http://www.w3.org/TR/CSS21/syndata.html#value-def-identifier
> >> mentions U+00A1 which has number 161 and seems about right. Yet
> >> the grammar defines nonascii as anything beyond 177 which is
> >> U+00B1 which does not really make sense to me. Thinking about it
> >> some more explicitly excluding 127-160 does not really seem needed
> >> either to me and maybe they should become part of nonascii (would
> >> also make the name somewhat more logical).
> >>
> >> Am I missing something?
> >
> > I take it you're referring to the line
> >
> >  nonascii [^\0-\177]
> >
> > The \177 there is an OCTAL escape, so that means 127 decimal / 0x7f
> > hex, which is correct for the ASCII range.
>
> Ah, thanks, so it's the U+00A1 reference that's weird.

If I remember correctly, the answer is that the two phrases "characters 
above U+007F" and "characters U+00A1 and higher" mean the same thing, 
because A1 is actually the first character above 7F.

In more detail:

Before Unicode 3, the code points between 80 and 9F didn't even have a 
name. Now most of them have been given a name, but they remain 
classified as control codes, not characters. Unicode says that the 
meaning of control codes (the 65 code points 00-1F and 7F-9F) depends 
on "a higher-level protocol," not on Unicode.

CSS is such a higher-level protocol. It assigns meaning to 09, 0A, 0C 
and 0D, but does not use the other control codes.
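
For anyone who wants to check the numbers, here is a rough sketch in
Python. It is only an illustration: the variable and function names are
my own, not anything from the CSS or Unicode specifications, and it
relies on Python's unicodedata module to report the general category
"Cc" for control codes.

```python
import unicodedata

# The \177 in the grammar's [^\0-\177] is an octal escape:
# 0o177 is 127 decimal / 0x7F hex, the last ASCII code point.
LAST_ASCII = 0o177

# Code points whose Unicode general category is Cc (control).
# All of them lie below U+0100, so scanning that range is enough.
control_codes = [cp for cp in range(0x100)
                 if unicodedata.category(chr(cp)) == "Cc"]

# The control codes that CSS gives a meaning to (hypothetical name):
# tab, line feed, form feed and carriage return.
css_used_controls = [0x09, 0x0A, 0x0C, 0x0D]


def is_nonascii(cp):
    """True if cp matches the grammar's nonascii class [^\\0-\\177]."""
    return cp > LAST_ASCII


print("last ASCII code point:", LAST_ASCII, hex(LAST_ASCII))
print("number of control codes:", len(control_codes))
print("CSS uses:", [f"{cp:02X}" for cp in css_used_controls])
print("U+00A1 is nonascii:", is_nonascii(0xA1))
```

The list comprehension should yield exactly the ranges 00-1F and 7F-9F,
which is where the figure of 65 comes from (32 + 33).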



Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

Received on Monday, 15 February 2010 18:41:13 UTC