Re: [csswg-drafts] [selectors] is #42 a valid ID selector? from ericrannaud via GitHub on 2016-06-22 (public-css-archive@w3.org from June 2016)

From: ericrannaud via GitHub <sysbot+gh@w3.org>
Date: Wed, 22 Jun 2016 00:32:19 +0000
To: public-css-archive@w3.org
Message-ID: <issue_comment.created-227611883-1466555537-sysbot+gh@w3.org>

Shouldn't the CSS and HTML specs agree on the form of an ID?

To quote from the current draft 
(https://github.com/w3c/html/blob/master/sections/dom.include):

> There are no other restrictions on what form an ID can take; in
> particular, IDs can consist of just digits, start with a digit,
> start with an underscore, consist of just punctuation, etc.

I understand that one spec talks about the form of ID _attribute
values_ and the other spec defines the ID _selector syntax_, and these
are two different things.

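And I understand that you can escape the ID `42` to get a valid ID
selector. I tried `#\x34\x32`, but neither Chrome nor Firefox accepts
it; a CSS escape is a backslash followed by hex digits and an optional
space (with no `x`), so the escaped selector would be `#\34 2`.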

However, developers routinely use simple string operations to build
selectors from attribute values. To be safe, assuming the above is
true, they would have to escape every ID attribute value that starts
with a digit. But no one ever does that, or is aware that they need
to.
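
As a rough sketch of what that escaping involves, the function below follows the identifier serialization rules that, as far as I can tell, sit behind `CSS.escape()` in CSSOM. The names `escapeIdent` and `idSelector` are made up for illustration, and in a browser you would presumably call `CSS.escape()` directly instead.

```javascript
// Hypothetical helper: escape a string so it can be used as a CSS identifier.
function escapeIdent(value) {
  const s = String(value);
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    const code = s.charCodeAt(i);
    const isDigit = code >= 0x30 && code <= 0x39;
    if (code === 0) {
      // NUL is replaced by U+FFFD.
      out += "\uFFFD";
    } else if (
      (code >= 0x01 && code <= 0x1f) || code === 0x7f ||
      (isDigit && (i === 0 || (i === 1 && s[0] === "-")))
    ) {
      // Control characters and leading digits become hex code point
      // escapes, terminated by a space.
      out += "\\" + code.toString(16) + " ";
    } else if (i === 0 && s.length === 1 && ch === "-") {
      // A lone "-" is not a valid identifier on its own.
      out += "\\-";
    } else if (code >= 0x80 || ch === "-" || ch === "_" || isDigit ||
               /[A-Za-z]/.test(ch)) {
      // Safe identifier characters pass through unchanged.
      out += ch;
    } else {
      // Everything else is escaped with a single backslash.
      out += "\\" + ch;
    }
  }
  return out;
}

// Hypothetical helper: build an ID selector from an attribute value.
function idSelector(id) {
  return "#" + escapeIdent(id);
}

console.log(idSelector("42"));  // #\34 2
console.log(idSelector("B"));   // #B
console.log(idSelector("a.b")); // #a\.b
```

The point is that `"#" + element.id` only turns into a safe selector after a step like this, which is exactly the step most string-building code skips.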

jQuery and Sprint actually check the form of the selector, and have a 
partial workaround. If the selector matches `/^#[\w-]+$/`, then they 
use `document.getElementById(selector.substring(1))` rather than 
`document.querySelectorAll(selector)`. This works for:

    $("#42")

But this doesn't work for:

    $("#42.test")

because the selector no longer matches that pattern. Admittedly,
".test" is redundant if IDs are unique in the document, but the point
remains that building selectors with string operations is much more
complicated if one needs to care whether one is handling ID values or
not.
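
To make the workaround concrete, here is a minimal sketch of that dispatch. It is not the actual jQuery or Sprint code; `chooseLookup` is a hypothetical name, and it only reports which native method would be used rather than touching a document.

```javascript
// The pattern quoted above: "#" followed only by word characters or hyphens.
const SIMPLE_ID = /^#[\w-]+$/;

// Hypothetical helper: decide which native lookup a library would use.
function chooseLookup(selector) {
  if (SIMPLE_ID.test(selector)) {
    // Fast path: strip the "#" and look the ID up directly.
    return { method: "getElementById", arg: selector.substring(1) };
  }
  // Everything else is handed to the selector engine as-is.
  return { method: "querySelectorAll", arg: selector };
}

console.log(chooseLookup("#42").method);      // getElementById
console.log(chooseLookup("#42.test").method); // querySelectorAll
```

The second call is where things go wrong: the selector is passed on unchanged, and `#42.test` is not valid selector syntax.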

With the following document, with non-unique IDs:

    <div id=42>A</div>
    <div id=B>
      <div id=42>B</div>
    </div>

jQuery is actually able to successfully execute `$("#B").find("#42")`,
but only by doing a lot of slow manual work. With `$b = $("#B")`, the
call to `$b.find("#42")` follows this sequence:

1. Try `result = document.getElementById("42")` and check whether
`result` is contained within `$b`; here it is not;
2. Try to call `querySelectorAll("#42")` on the Element in `$b` to
restrict the search to its descendants, but that fails because the
selector is not valid;
3. Manually go through the descendants of the Element in `$b` using a
custom matcher built by parsing the selector (note that jQuery's CSS
selector parser accepts `#42`...).
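
The three steps above can be sketched roughly as follows. This is my reading of the sequence, not jQuery's actual code: `findById` is a hypothetical name, and `doc` and `root` are assumed to expose the usual DOM members (`getElementById`, `contains`, `querySelectorAll`, `children`, `id`).

```javascript
// Hypothetical helper: find descendants of `root` whose id equals `id`.
function findById(doc, root, id) {
  // Step 1: fast global lookup, usable only if the hit lies inside `root`.
  const hit = doc.getElementById(id);
  if (hit && hit !== root && root.contains(hit)) {
    return [hit];
  }

  // Step 2: scoped native query. This throws when "#" + id is not a
  // valid selector, as with "#42".
  try {
    return Array.from(root.querySelectorAll("#" + id));
  } catch (err) {
    // Fall through to the manual walk.
  }

  // Step 3: walk the descendants by hand, in document order.
  const matches = [];
  const stack = Array.from(root.children).reverse();
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.id === id) {
      matches.push(node);
    }
    // Push children in reverse so the first child is visited next.
    for (const child of Array.from(node.children).reverse()) {
      stack.push(child);
    }
  }
  return matches;
}
```

With the document above, step 1 finds the outer `<div id=42>`, which is not inside `#B`; step 2 throws; and only step 3 returns the inner element.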

Sprint (a jQuery alternative) is not that sophisticated and fails.

If `Element` had a `getElementById()` method, this wouldn't be so
bad. Right now, unless the ID is escaped first, there is no way to
resolve `$b.find("#42")` using fast, native browser methods.

Are there fundamental objections to relaxing the ID selector syntax so
that `"#" + element.id` is always a valid ID selector when working
with an HTML5 document?

-- 
GitHub Notification of comment by ericrannaud
Please view or discuss this issue at 
https://github.com/w3c/csswg-drafts/issues/202#issuecomment-227611883 
using your GitHub account

Received on Wednesday, 22 June 2016 00:32:23 UTC