- From: Bert Bos <bert@w3.org>
- Date: Wed, 17 Feb 2010 14:37:07 +0100
- To: W3C Emailing list for WWW Style <www-style@w3.org>
On 15/2/10 20:07, Zack Weinberg wrote:
> Bert Bos <bert@w3.org> wrote:
>>>> The \177 there is an OCTAL escape, so that means 127 decimal /
>>>> 0x7f hex, which is correct for the ASCII range.
>>>
>>> Ah, thanks, so it's the U+00A1 reference that's weird.
>>
>> If I remember correctly, the answer is that the two phrases
>> "characters above U+007F" and "characters U+00A1 and higher" mean the
>> same thing, because A1 is actually the first character above 7F.
>
> Is there any reason to exclude U+00A0 (NO-BREAK SPACE)? It is
> whitespace rather than a glyph, but that doesn't stop us when it comes
> to all the other Unicode-but-not-ASCII whitespace code points
> (including U+00AD SOFT HYPHEN)...

I think (but I'm making this up, I don't actually remember) that we
excluded no-break space simply because (1) it was easy to do, and (2)
no-break space is a character you can actually type on some keyboards.
Most of the other characters that one would be unwise to use are quite
hard to type. You wouldn't type an EN SPACE by accident...

But it's a good question why the grammar doesn't say "[^\0-\240]".
That's not any longer than "[^\0-\177]" and it would match the English
text better.

I don't know the answer. It's too long ago. The grammar and the text
were already like this in the first draft of CSS2 in 1997. Maybe an
oversight, maybe just laziness.

We could change the definition of "nonascii" in 4.1.1 to "[^\0-\240]",
although the name "nonascii" in that case becomes a bit strange...

I don't think we can do the opposite, i.e., change the text in 4.1.3 to
allow no-break space (A0) in identifiers. That would be a change,
rather than a clarification.

Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
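
As a quick check of the octal arithmetic discussed in the message above, here is a minimal Python sketch comparing the two character classes. It is only an illustration of the ranges: Python's re module stands in for the lexer syntax used in the CSS grammar, and the names NONASCII_CURRENT, NONASCII_PROPOSED and compare are hypothetical, not taken from the specification.

```python
import re

# Octal escapes from the grammar, expressed as code points.
assert 0o177 == 0x7F == 127   # \177 is the last ASCII code point
assert 0o240 == 0xA0 == 160   # \240 is NO-BREAK SPACE

# Class discussed as the existing definition: anything outside U+0000..U+007F.
NONASCII_CURRENT = re.compile(r"[^\x00-\x7f]")
# Alternative floated in the message: anything outside U+0000..U+00A0.
NONASCII_PROPOSED = re.compile(r"[^\x00-\xa0]")


def compare(ch):
    """Return (matches current class, matches proposed class) for one character."""
    return (
        NONASCII_CURRENT.fullmatch(ch) is not None,
        NONASCII_PROPOSED.fullmatch(ch) is not None,
    )
```

Under these assumptions, compare("\u00a0") shows the one place relevant to the question: no-break space is matched by the first class but not by the second, while U+00A1 is matched by both.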
Received on Wednesday, 17 February 2010 13:37:29 UTC