
Re: [CSS21] Escaping end comment delimiters inside comments

From: Bert Bos <bert@w3.org>
Date: Fri, 25 Jul 2008 19:43:42 +0200
Cc: "www-style@w3.org" <www-style@w3.org>
To: Undisclosed.Recipients: ;
Message-Id: <200807251943.42263.bert@w3.org>

On Tuesday 22 July 2008 21:49, Arron Eicholz wrote:
> There seems to be a conflict in the tokenization and prose about
> escapes.
>
> http://www.w3.org/TR/CSS21/syndata.html#characters
>
>   # In CSS 2.1, a backslash (\) character indicates three types
>   # of character escapes.
>   # ...
>   # Second, it cancels the meaning of special CSS characters.
>   # Any character (except a hexadecimal digit) can be escaped
>   # with a backslash to remove its special meaning.
>
> This means
>   P { /*\*/*/ color: orange; }
> would display as orange.
>
> http://www.w3.org/TR/CSS21/syndata.html#tokenization
>
>   #  COMMENT    \/\*[^*]*\*+([^/*][^*]*\*+)*\/
>
> Notice the COMMENT token does not include {escape}. Parsing
> according to this tokenization would mean that
>   P { /*\*/*/ color: orange; }
> would not display as orange.
>
> Test case:
> http://lists.w3.org/Archives/Public/www-archive/2008Jul/att-0060/escaped-comment.htm
>
> Firefox, Opera, Safari follow the tokenization rules.
> IE7 follows the prose rules.
>
> Proposal:
>   In 4.1.3 prepend
>     "Except within CSS comments"
>   to the sentence
>     # Any character (except a hexadecimal digit) can be escaped
>     # with a backslash to remove its special meaning.
>
> We note that for parsing style sheets in a renderer, whether a
> Unicode escape is recognized or not doesn't matter, but if there's
> a CSSOM API for accessing comments, we should say somewhere that
> Unicode escapes are processed within CSS comments. This will allow
> serializing */ inside comments.

So that's an argument for changing the token rather than the prose...

I have no opinion on which way to fix it.

If we change the regular expression for the token, I think it would 
become this:

    COMMENT  \/\*([^*\\]|{escape})*\*+(([^/*\\]|{escape})[^*]*\*+)*\/

There may be more readable ways to write this, but some quick testing 
seems to indicate that it does work on all of the following:

    /*\*/*/
    /***\*/*/
    /**\/*/
    /*\/*/
    /*/\/*/
    /*/*\/*/
    /*/*\/\2A*/
    /*/*\/\2A/*/
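
For anyone who wants to repeat that quick test, here is a minimal sketch in Python. It is not the lexer I used. The names ESCAPE, CURRENT_COMMENT and PROPOSED_COMMENT are made up for the illustration, {escape} is expanded by hand from the CSS 2.1 tokenizer macros, and Python's re module is a backtracking matcher rather than a longest-match lexer, so fullmatch() is used to ask whether each string can be read as exactly one COMMENT token.

```python
import re

# {unicode} and {escape} expanded by hand from the CSS 2.1 tokenizer macros
# (hex digits accepted in either case).
UNICODE = r"\\[0-9a-fA-F]{1,6}(?:\r\n|[ \n\r\t\f])?"
ESCAPE = r"(?:" + UNICODE + r"|\\[^\n\r\f0-9a-fA-F])"

# COMMENT token as it currently stands in CSS 2.1.
CURRENT_COMMENT = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

# COMMENT token with {escape} added, as proposed in this message.
PROPOSED_COMMENT = re.compile(
    r"/\*(?:[^*\\]|" + ESCAPE + r")*\*+"
    r"(?:(?:[^/*\\]|" + ESCAPE + r")[^*]*\*+)*/"
)

# The eight strings listed above, as raw strings so backslashes stay literal.
CASES = [
    r"/*\*/*/",
    r"/***\*/*/",
    r"/**\/*/",
    r"/*\/*/",
    r"/*/\/*/",
    r"/*/*\/*/",
    r"/*/*\/\2A*/",
    r"/*/*\/\2A/*/",
]

for case in CASES:
    current = CURRENT_COMMENT.fullmatch(case) is not None
    proposed = PROPOSED_COMMENT.fullmatch(case) is not None
    print(f"{case:<16} current={current!s:<6} proposed={proposed}")
```

With the current token, the first string is not a single comment: the match stops after `/*\*/`, which is the behavior Arron reports for Firefox, Opera and Safari.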



Bert
-- 
  Bert Bos                                ( W 3 C ) http://www.w3.org/
  http://www.w3.org/people/bos                               W3C/ERCIM
  bert@w3.org                             2004 Rt des Lucioles / BP 93
  +33 (0)4 92 38 76 92            06902 Sophia Antipolis Cedex, France
Received on Friday, 25 July 2008 17:44:22 GMT
