Re: Escaping characters in CSS2 from Bert Bos on 1999-03-12 (www-style@w3.org from March 1999)

From: Bert Bos <Bert.Bos@sophia.inria.fr>
Date: Fri, 12 Mar 1999 14:20:36 +0100 (MET)
To: "L. David Baron" <dbaron@fas.harvard.edu>
Cc: www-style@w3.org
Message-ID: <14056.61879.110307.460526@www43.inria.fr>

L. David Baron writes:
 > In the general CSS tokenization rules [1] in CSS2, an escape is written
 > as "{unicode}|\\[ -~\200-\4177777]" (where unicode is
 > "\\[0-9a-f]{1,6}[ \n\r\t\f]?").  However, in the text [2], it says that
 > 
 >   Any character (except a hexadecimal digit) can be escaped with a
 >   backslash to remove its special meaning.
 > 
 > This makes me think that the definition of escape should instead be:
 > 
 >   {unicode}|\\[ -/:-@G-`g-~\200-\4177777]
 > 
 > which does not allow hexadecimal digits [0-9a-fA-F] to be escaped.  The
 > exact same thing occurs in Appendix D.2 [3].
 > 
 > However, I don't know much about flex notation, so perhaps the first
 > alternative that can match wins (or something like that).  Is that true
 > (in which case this isn't a problem)?  It might be clearer the above
 > way anyway.

You are right, the notation relies on a (F)lex feature: longest match
wins. Your version is indeed more explicit.
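
The longest-match behaviour can be illustrated with a small Python sketch.
This is only an approximation under stated assumptions: the pattern strings
are hand translations of the flex notation quoted above (octal \200 is
U+0080, octal \4177777 is U+10FFFF), hex digits are matched in both cases,
and the helper names (longest_match, spec_escape, explicit_escape) are
hypothetical. Python's re module picks the first alternative that matches
rather than the longest one, so the flex rule is emulated by trying each
alternative and keeping the longest result.

```python
import re

# Assumed Python translations of the flex patterns quoted in this thread.
UNICODE = r"\\[0-9a-fA-F]{1,6}[ \n\r\t\f]?"
ANY_CHAR = r"\\[ -~\x80-\U0010FFFF]"           # second alternative as written in the spec
NON_HEX = r"\\[ -/:-@G-`g-~\x80-\U0010FFFF]"   # explicit variant, hex digits excluded


def longest_match(alternatives, text):
    """Emulate flex: try every alternative at the start of text, keep the longest."""
    best = ""
    for pattern in alternatives:
        m = re.match(pattern, text)
        if m and len(m.group()) > len(best):
            best = m.group()
    return best


def spec_escape(text):
    # Relies on longest match to prefer {unicode} over the one-character escape.
    return longest_match([ANY_CHAR, UNICODE], text)


def explicit_escape(text):
    # The explicit class cannot match a hex digit, so there is no overlap.
    return longest_match([NON_HEX, UNICODE], text)


if __name__ == "__main__":
    for sample in ["\\26 B", "\\{", "\\abcdefg"]:
        print(repr(sample), repr(spec_escape(sample)), repr(explicit_escape(sample)))
```

Under longest match both definitions accept the same escapes for these
samples. With a first-match engine the order of the alternatives would
matter for the spec version, which is one reason the explicit character
class may be easier to read.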

 > 
 > David Baron
 > 
 > [1] http://www.w3.org/TR/REC-CSS2/syndata.html#tokenization
 > [2] http://www.w3.org/TR/REC-CSS2/syndata.html#q4
 > [3] http://www.w3.org/TR/REC-CSS2/grammar.html#q2
 > 

-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos/                              W3C/INRIA
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France

Received on Friday, 12 March 1999 08:20:48 UTC