
Re: [css3-syntax] Making U+0080 to U+009F "non-ASCII"?

From: Simon Sapin <simon.sapin@kozea.fr>
Date: Sat, 26 Jan 2013 00:05:00 +0100
Message-ID: <51030F9C.90000@kozea.fr>
To: Glenn Adams <glenn@skynav.com>
CC: "Tab Atkins Jr." <jackalmage@gmail.com>, Bjoern Hoehrmann <derhoermi@gmx.net>, www-style list <www-style@w3.org>
On 25/01/2013 21:05, Glenn Adams wrote:
>
> On Fri, Jan 25, 2013 at 11:24 AM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>
>     I suspect it's approximately zero compat risk.  I'm willing to make
>     the change iff other browsers are cool with it.  I'd make the change
>     in WebKit, but I can't make heads nor tails of our lexer.
>
>
> WebKit already treats any Unicode code point >= 128 as an identifier
> start (see CSSParser.cpp):
>
> template <typename CharacterType>
> static inline bool isIdentifierStartAfterDash(CharacterType* currentCharacter)
> {
>      return isASCIIAlpha(currentCharacter[0]) || currentCharacter[0] == '_' || currentCharacter[0] >= 128
>          || (currentCharacter[0] == '\\' && isCSSEscape(currentCharacter[1]));
> }


Nice! No interop means we can do whatever we want, right?

Test case:

data:text/html;charset=utf8,<html class=%C2%80><style>.%C2%80{background:green

Green means that U+0080 is accepted as part of an identifier, otherwise 
the selector is invalid because it’s made of two delim tokens. Chromium 
and Opera show green, but not Firefox.
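
As a sanity check on the test case itself, here is a minimal sketch (using only the standard decodeURIComponent and encodeURIComponent functions; the variable names are mine) showing that %C2%80 is the UTF-8 percent-encoding of the single code point U+0080:

```javascript
// %C2%80 is the two-byte UTF-8 encoding of U+0080.
const decoded = decodeURIComponent("%C2%80");
const codePoint = decoded.codePointAt(0);

console.log(decoded.length);            // 1
console.log(codePoint.toString(16));    // "80"

// Going the other way gives back the bytes used in the data: URL.
console.log(encodeURIComponent("\u0080")); // "%C2%80"
```

So the class name and the selector in the data: URL each consist of exactly one character, U+0080.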

To double-check with CSSOM, this expression evaluates to 128 (0x80) when U+0080 is accepted as an ident:
document.styleSheets[0].cssRules[0].selectorText.charCodeAt(1)


For comparison, an unescaped U+007F is never an ident token:

data:text/html;charset=utf8,<html class=%7F><style>.%7F{background:red
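
The difference between the two behaviours can be sketched as a name-start predicate with a configurable lower bound for "non-ASCII". This is only an illustration: isNameStart and nonAsciiFrom are hypothetical names, not taken from any browser source. A bound of 0x80 mirrors the quoted WebKit check (>= 128), and a bound of 0xA0 models a tokenizer that excludes U+0080 to U+009F, the range named in the subject line. Escapes are left out for brevity.

```javascript
// Hypothetical helper, not taken from any browser source.
// nonAsciiFrom is the first code point treated as "non-ASCII":
//   0x80 mirrors the quoted WebKit check (>= 128),
//   0xA0 excludes U+0080..U+009F.
// Backslash escapes are deliberately not handled here.
function isNameStart(codePoint, nonAsciiFrom) {
  const isAsciiAlpha =
    (codePoint >= 0x41 && codePoint <= 0x5a) || // A-Z
    (codePoint >= 0x61 && codePoint <= 0x7a);   // a-z
  return isAsciiAlpha || codePoint === 0x5f || codePoint >= nonAsciiFrom;
}

console.log(isNameStart(0x80, 0x80)); // true:  U+0080 starts an ident
console.log(isNameStart(0x80, 0xa0)); // false: U+0080 is a delim
console.log(isNameStart(0x7f, 0x80)); // false: U+007F is a delim either way
```

Under these assumptions, only the choice of bound changes the outcome for U+0080, while U+007F is rejected with both bounds, consistent with the results reported above.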

-- 
Simon Sapin
Received on Friday, 25 January 2013 23:05:26 GMT
