Re: Problem with element selectors having unicode characters from Yves Lafon on 2008-05-28 (www-validator-css@w3.org from May 2008)

From: Yves Lafon <ylafon@w3.org>
Date: Wed, 28 May 2008 11:08:47 -0400 (EDT)
To: Radu Coravu <radu_coravu@sync.ro>
Cc: www-validator-css@w3.org
Message-ID: <Pine.LNX.4.64.0805281102410.17698@ubzre.j3.bet>

On Mon, 26 May 2008, Radu Coravu wrote:

> Hi,
>
> The CSSParser.jj file declares NONASCII at line 417 as ["\200"-"\377"].
> That does not cover the whole Unicode range.
> It should be something like ~["\000"-"\177"], which means all non-ASCII
> characters.
> I attached 2 sample files. The CSS contains a Japanese character in a 
> selector and is marked as invalid by you but a browser has no problems 
> matching the CSS selector to the element name.
> Any input on this one?
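
To make the two character classes quoted above concrete, here is a rough
Python sketch of which code points each one covers. It only interprets the
octal bounds; it does not model how JavaCC actually reads the input stream,
and the function names are made up for illustration.

```python
# Octal bounds taken from the two character classes in the quoted message.
NONASCII_LOW = 0o200   # "\200" == U+0080
NONASCII_HIGH = 0o377  # "\377" == U+00FF
ASCII_HIGH = 0o177     # "\177" == U+007F


def in_current_nonascii(ch: str) -> bool:
    """True if ch falls inside ["\\200"-"\\377"], i.e. U+0080..U+00FF."""
    return NONASCII_LOW <= ord(ch) <= NONASCII_HIGH


def in_proposed_nonascii(ch: str) -> bool:
    """True if ch falls inside ~["\\000"-"\\177"], i.e. anything above U+007F."""
    return ord(ch) > ASCII_HIGH


if __name__ == "__main__":
    for ch in ("a", "\u00e9", "\u65e5"):
        print(f"U+{ord(ch):04X}",
              in_current_nonascii(ch),
              in_proposed_nonascii(ch))
```

Under these assumptions a Latin-1 letter such as U+00E9 is inside both
classes, while a Japanese character such as U+65E5 is only inside the
proposed one.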

See the scanner definition in the CSS 2.1 grammar:
http://www.w3.org/TR/CSS21/grammar.html#scanner
Also, section 4.1.3 says:
        In CSS, identifiers (including element names, classes, and IDs in
        selectors) can contain only the characters [a-z0-9] and ISO 10646
        characters U+00A1 and higher, plus the hyphen (-) and the
        underscore (_); they cannot start with a digit, or a hyphen
        followed by a digit. Identifiers can also contain escaped
        characters and any ISO 10646 character as a numeric code (see next
        item). For instance, the identifier "B&W?" may be written as
        "B\&W\?" or "B\26 W\3F".
        Note that Unicode is code-by-code equivalent to ISO 10646 (see
        [UNICODE] and [ISO10646]).

So the validator tries to stick as much as possible to the definition 
given by the spec.
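
As an illustration only, the 4.1.3 wording quoted above can be sketched as a
small Python check. This is not the validator's code: the names are
hypothetical, uppercase A-Z is accepted as an assumption (the quoted example
"B&W?" uses it), and the escape handling is a simplification of the scanner
grammar.

```python
import re

# Unescaped characters allowed by the quoted text: letters and digits,
# hyphen, underscore, and ISO 10646 characters U+00A1 and higher.
# Accepting A-Z here is an assumption, not part of the quoted wording.
_CHAR = r"[A-Za-z0-9_\-\u00A1-\U0010FFFF]"

# Simplified escapes: a backslash followed by 1-6 hex digits and one
# optional whitespace character, or a backslash followed by any other
# single character except a newline or form feed.
_ESCAPE = r"\\(?:[0-9A-Fa-f]{1,6}[ \t\r\n\f]?|[^\r\n\f0-9A-Fa-f])"

_IDENT = re.compile(rf"(?:{_CHAR}|{_ESCAPE})+")
_ASCII_DIGITS = "0123456789"


def is_identifier(text: str) -> bool:
    """Rough check of text against the identifier rule quoted from 4.1.3."""
    if not _IDENT.fullmatch(text):
        return False
    # Identifiers cannot start with a digit ...
    if text[0] in _ASCII_DIGITS:
        return False
    # ... or with a hyphen followed by a digit.
    if text[0] == "-" and len(text) > 1 and text[1] in _ASCII_DIGITS:
        return False
    return True


if __name__ == "__main__":
    for sample in ("\u65e5\u672c\u8a9e", r"B\&W\?", r"B\26 W\3F", "B&W?", "1abc", "-1abc"):
        print(repr(sample), is_identifier(sample))
```

With this sketch, the escaped forms from the quoted example pass, the
unescaped "B&W?" does not, and a name made of characters above U+00A1 passes
the prose rule.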
Cheers,

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Wednesday, 28 May 2008 15:09:22 UTC