Re: [css3-syntax] Null bytes and U+0000 from Glenn Adams on 2012-10-23 (www-style@w3.org from October 2012)

From: Glenn Adams <glenn@skynav.com>
Date: Tue, 23 Oct 2012 12:29:02 +0800
To: Boris Zbarsky <bzbarsky@mit.edu>
Cc: www-style@w3.org
Message-ID: <CACQ=j+cVWDsgDP4GHKv7K8kcmSSMgXBxoCwj-xfe-HEF2V8z7A@mail.gmail.com>

On Tue, Oct 23, 2012 at 11:02 AM, Boris Zbarsky <bzbarsky@mit.edu> wrote:

> Note that \0 or \000000 are not valid hex escapes in CSS2.1
>

given that the 2.1 lex grammar [1] actually matches both \0 and \000000, i
wonder how the 4.1.3 language "must not be zero" should be interpreted for
error handling purposes. should the escape be consumed, and a semantic check
then treat the whole escape as an error? or should the escape not be
consumed at all, leaving the next input character at the backslash?

h         [0-9a-f]
nonascii  [\240-\377]
unicode   \\{h}{1,6}(\r\n|[ \t\r\n\f])?
escape    {unicode}|\\[^\r\n\f0-9a-f]

[1] http://www.w3.org/TR/CSS2/grammar.html#scanner
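
to make the two readings concrete, here is a rough Python sketch. it is a
hand transcription of the {unicode} and {escape} macros quoted above, not
any browser's implementation. the names scan_escape, consume_then_check and
refuse_zero are hypothetical, and spelling out A-F assumes the scanner is
case-insensitive.

```python
import re

# Hand transcription of the quoted macros. Assumption: the scanner is
# case-insensitive, modeled here by spelling out A-F explicitly.
UNICODE = r"\\[0-9a-fA-F]{1,6}(?:\r\n|[ \t\r\n\f])?"
ESCAPE_RE = re.compile("(?:" + UNICODE + r")|\\[^\r\n\f0-9a-fA-F]")
HEX_RE = re.compile(r"\\([0-9a-fA-F]{1,6})")


def scan_escape(text, pos=0):
    """Return (end, code_point) if {escape} matches at pos, else None."""
    m = ESCAPE_RE.match(text, pos)
    if m is None:
        return None
    hex_match = HEX_RE.match(m.group(0))
    if hex_match:
        # Hex form: the digits name the code point.
        return m.end(), int(hex_match.group(1), 16)
    # Non-hex form: the escaped character stands for itself.
    return m.end(), ord(m.group(0)[1])


def consume_then_check(text, pos=0):
    """Reading A: consume the escape, then flag zero as a semantic error."""
    result = scan_escape(text, pos)
    if result is None:
        return pos, "no-escape"
    end, code_point = result
    return end, ("error" if code_point == 0 else "ok")


def refuse_zero(text, pos=0):
    """Reading B: a zero escape is not an escape; stay at the backslash."""
    result = scan_escape(text, pos)
    if result is None or result[1] == 0:
        return pos, "no-escape"
    return result[0], "ok"
```

under reading A the input position moves past "\0" and an error is
reported; under reading B the position stays at the backslash and whatever
rule comes next has to deal with it. which of these 4.1.3 intends is exactly
the open question.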

also, regarding the 4.1.3 parenthetical "(if a stylesheet does contain a
character with Unicode codepoint zero)":

in the case of "\0" the stylesheet does not contain a NUL; rather, it
contains an escaped representation of NUL, and that escape does not itself
contain NUL

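is the parenthetical intended to refer only to

0x0000 (a literal NUL)
0x005c 0x0000 (a backslash followed by a literal NUL)

or to

0x005c 0x0030 (a backslash followed by the digit zero, i.e., "\0")

or to all three of the above?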
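
for reference, a small Python sketch of the three candidates listed above,
written as code-point sequences. the labels and the helper names
(contains_nul, as_text, report) are hypothetical; the check only asks
whether U+0000 literally occurs in the input, which is the distinction the
question turns on.

```python
# The three candidate inputs, written as code-point sequences.
CANDIDATES = {
    "literal NUL": [0x0000],
    "backslash, literal NUL": [0x005C, 0x0000],
    "backslash, digit zero": [0x005C, 0x0030],
}


def contains_nul(code_points):
    """True if U+0000 literally occurs in the sequence."""
    return 0x0000 in code_points


def as_text(code_points):
    """Render a code-point sequence as a Python string."""
    return "".join(chr(cp) for cp in code_points)


def report():
    """Map each candidate label to whether it contains a literal NUL."""
    return {name: contains_nul(cps) for name, cps in CANDIDATES.items()}
```

only the first two sequences contain a literal U+0000; the third is the
two-character text "\0", which is NUL-free until something decodes the
escape.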

> which is why Gecko never treats them as hex escapes, and I'm pretty
> surprised that WebKit does so.  Guess we never had a test in the test suite
> for little details like section 4.1.3?  ;)
>

Received on Tuesday, 23 October 2012 04:29:50 UTC