Re: [css21] ident production does not match <identifier> from Bert Bos on 2010-02-17 (www-style@w3.org from February 2010)

From: Bert Bos <bert@w3.org>
Date: Wed, 17 Feb 2010 14:37:07 +0100
To: W3C Emailing list for WWW Style <www-style@w3.org>
Message-Id: <201002171437.07266.bert@w3.org>

On 15/2/10 20:07, Zack Weinberg wrote:
> Bert Bos <bert@w3.org> wrote:
>>>> The \177 there is an OCTAL escape, so that means 127 decimal /
>>>> 0x7f hex, which is correct for the ASCII range.
>>>
>>> Ah, thanks, so it's the U+00A1 reference that's weird.
>>
>> If I remember correctly, the answer is that the two phrases
>> "characters above U+007F" and "characters U+00A1 and higher" mean the
>> same thing, because A1 is actually the first character above 7F.
> 
> Is there any reason to exclude U+00A0 (NO-BREAK SPACE)?  It is
> whitespace rather than a glyph, but that doesn't stop us when it comes
> to all the other Unicode-but-not-ASCII whitespace code points
> (including U+00AD SOFT HYPHEN)...

I think (but I'm making this up, I don't actually remember) that we 
excluded no-break space simply because (1) it was easy to do, and (2) 
no-break space is a character you can actually type on some keyboards. 
Most of the other characters that one would be unwise to use are quite 
hard to type. You wouldn't type an EN SPACE by accident...

But it's a good question why the grammar doesn't say "[^\0-\240]".
That's not any longer than "[^\0-\177]" and it would match the English
text better.
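
To make the difference between the two character classes concrete, here 
is a rough Python sketch. It is only an illustration, not anything from 
the spec: the variable and function names are made up, and it assumes 
Python's "re" module reads the octal escapes the same way lex does.

```python
import re

# The same octal escapes as in the lex grammar: \177 is U+007F, \240 is U+00A0.
NONASCII_GRAMMAR = re.compile(r"[^\0-\177]")  # class as quoted in this thread
NONASCII_PROSE = re.compile(r"[^\0-\240]")    # hypothetical class matching the English text


def differing_code_points(limit=0x100):
    """Return the code points below `limit` that only one of the two classes accepts."""
    result = []
    for cp in range(limit):
        ch = chr(cp)
        in_grammar = NONASCII_GRAMMAR.fullmatch(ch) is not None
        in_prose = NONASCII_PROSE.fullmatch(ch) is not None
        if in_grammar != in_prose:
            result.append(cp)
    return result


diff = differing_code_points()
print(f"U+{diff[0]:04X}..U+{diff[-1]:04X}, {len(diff)} code points")
# prints: U+0080..U+00A0, 33 code points
```

So the two classes disagree on U+0080 through U+00A0: the C1 control 
range plus no-break space.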

I don't know the answer. It's too long ago. The grammar and the text 
were already like this in the first draft of CSS2 in 1997. Maybe an 
oversight, maybe just laziness.

We could change the definition of "nonascii" in 4.1.1 to "[^\0-\240]"
although the name "nonascii" in that case becomes a bit strange...

I don't think we can do the opposite, i.e., change the text in 4.1.3 to
allow no-break space (A0) in identifiers. That would be a change, rather
than a clarification.
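
As a sketch of what is at stake for identifiers, the following Python 
builds a simplified version of the "ident" production for each choice of 
"nonascii". It assumes the shape -?{nmstart}{nmchar}* and leaves out 
backslash escapes entirely; the helper name ident_pattern is invented 
for this example.

```python
import re


def ident_pattern(nonascii):
    """Build a simplified ident regex (no escapes) around a given nonascii class."""
    nmstart = rf"(?:[_a-zA-Z]|{nonascii})"
    nmchar = rf"(?:[_a-zA-Z0-9-]|{nonascii})"
    return re.compile(rf"-?{nmstart}{nmchar}*")


# Everything above U+007F versus only U+00A1 and higher.
ident_by_grammar = ident_pattern(r"[^\0-\177]")
ident_by_prose = ident_pattern(r"[^\0-\240]")

with_nbsp = "foo\u00a0bar"  # contains U+00A0 NO-BREAK SPACE
accented = "caf\u00e9"      # contains U+00E9, above U+00A1

for name in (with_nbsp, accented):
    print(
        repr(name),
        bool(ident_by_grammar.fullmatch(name)),
        bool(ident_by_prose.fullmatch(name)),
    )
```

Under these assumptions the name containing a no-break space is a single 
identifier with the first class but not with the second, while the 
accented name is accepted by both.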

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

Received on Wednesday, 17 February 2010 13:37:29 UTC