Re: [csswg-drafts] [css-syntax] custom property names too permissive, require namespacing rules (#7129)

The CSS Working Group just discussed `Custom property names too permissive`, and agreed to the following:

* `RESOLVED: Use HTML restrictions for custom idents`
* `RESOLVED: illegal characters in an ident can be escaped`
* `RESOLVED: Invalid ident characters are treated as DELIM tokens`
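
As a rough illustration of the first two resolutions, here is a minimal Python sketch that checks a code point against an allow-list and escapes anything outside it. The range table is an assumption: it is transcribed from the HTML custom element name production as it stood around the time of this discussion, and should be checked against the HTML spec before use. The helper names (`is_ident_code_point`, `escape_ident`) are hypothetical, ASCII handling is simplified to letters, digits, `_` and `-`, and start-of-ident rules (such as a leading digit) are ignored.

```python
# Assumed allow-list for non-ASCII ident code points, modelled on HTML's
# custom element name characters. Verify against the spec before relying on it.
NON_ASCII_IDENT_RANGES = [
    (0x00B7, 0x00B7), (0x00C0, 0x00D6), (0x00D8, 0x00F6),
    (0x00F8, 0x037D), (0x037F, 0x1FFF), (0x200C, 0x200D),
    (0x203F, 0x2040), (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def is_ident_code_point(cp: int) -> bool:
    """Return True if the code point may appear unescaped in an ident (sketch)."""
    if cp < 0x80:
        ch = chr(cp)
        # Simplified ASCII rule: letters, digits, underscore, hyphen.
        return ch.isalnum() or ch in "_-"
    return any(lo <= cp <= hi for lo, hi in NON_ASCII_IDENT_RANGES)


def escape_ident(name: str) -> str:
    """Replace disallowed code points with CSS hex escapes (hypothetical helper)."""
    out = []
    for ch in name:
        if is_ident_code_point(ord(ch)):
            out.append(ch)
        else:
            # A CSS hex escape is a backslash, hex digits, and a terminating space.
            out.append("\\{:x} ".format(ord(ch)))
    return "".join(out)


print(escape_ident("--temp°"))   # degree sign (U+00B0) is outside the allow-list
print(escape_ident("--größe"))   # ö and ß fall inside the allowed ranges
```

Under these assumptions the degree sign is written out as a hex escape, while the German name passes through unchanged.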

<details><summary>The full IRC log of that discussion</summary>
&lt;fantasai> Topic: Custom property names too permissive<br>
&lt;fantasai> github: https://github.com/w3c/csswg-drafts/issues/7129<br>
&lt;fantasai> TabAtkins: i18nWG raised issue about custom idents, which allow any Unicode codepoint above a certain codepoint<br>
&lt;fantasai> TabAtkins: There are some concerns about e.g. bidi characters corrupting the display of the code<br>
&lt;fantasai> TabAtkins: Also argument for consistency in what characters allowed across languages<br>
&lt;fantasai> TabAtkins: JS follows UAX?? rules for characters allowed in idents<br>
&lt;fantasai> TabAtkins: HTML allows a different but largely compatible range of characters<br>
&lt;fantasai> TabAtkins: In one of my Tweets, I showed off using weird Unicode rules<br>
&lt;fantasai> TabAtkins: e.g. different emoji are valid or invalid<br>
&lt;fantasai> TabAtkins: I agree with i18n feedback, reasonable to partially restrict these<br>
&lt;fantasai> TabAtkins: e.g. no reason to allow bidi override chars in CSS idents<br>
&lt;fantasai> TabAtkins: so I suggest adopting either HTML rules or JS rules<br>
&lt;Rossen_> q?<br>
&lt;fantasai> TabAtkins: don't have a strong opinion on which to go for<br>
&lt;fantasai> TabAtkins: Otherwise I'd go with HTML rules by default<br>
&lt;emilio> Scribenick: emilio<br>
&lt;emilio> fantasai: I think this is fairly reasonable, but I don't know the differences between the rules so I don't have an opinion on those yet<br>
&lt;fantasai> TabAtkins: JS rules are a bit more strict, they disallow chars that look like punctuation<br>
&lt;fantasai> TabAtkins: HTML gives exact codepoint ranges<br>
&lt;fantasai> TabAtkins: Reason I'd go with HTML is to guarantee being able to write selectors for custom elements, without ever having to escape<br>
&lt;Rossen_> ack fantasai<br>
&lt;fantasai> fantasai: That sounds reasonable, let's go with that<br>
&lt;fantasai> Rossen_: Makes sense, any downsides to it?<br>
&lt;fantasai> TabAtkins: Any change to make more restrictive, could potentially make some stylesheets invalid<br>
&lt;fantasai> TabAtkins: potentially breaking code that works<br>
&lt;fantasai> Rossen_: And with HTML rules we'd have less breakage<br>
&lt;fantasai> Rossen_: seems like path of least destruction<br>
&lt;fantasai> Rossen_: Anyone would like to argue against the change entirely?<br>
&lt;fantasai> Rossen_: If not, any objections?<br>
&lt;fantasai> Rossen_: Taking the silence as a no<br>
&lt;fantasai> RESOLVED: Use HTML restrictions for custom idents<br>
&lt;fantasai> TabAtkins: Got 2 sub-issues<br>
&lt;fantasai> TabAtkins: One is whether to allow illegal characters to be escaped in an identifier<br>
&lt;fantasai> TabAtkins: JS doesn't allow that, you can escape for readability but not to avoid the identifier restrictions<br>
&lt;fantasai> TabAtkins: but CSS has traditionally always allowed escapes for everything, so don't see a strong reason to disallow<br>
&lt;faceless> +1 from us too<br>
&lt;fantasai> TabAtkins: So I would prefer to go with illegal chars can be escaped<br>
&lt;fantasai> fantasai: I strongly agree with that<br>
&lt;fantasai> Rossen_: Any objections for allowing illegal characters to be escaped in an ident?<br>
&lt;fantasai> RESOLVED: illegal characters in an ident can be escaped<br>
&lt;fantasai> TabAtkins: Next question is how do we handle the illegal characters<br>
&lt;dbaron> That doesn't allow nulls in idents, does it?<br>
&lt;fantasai> TabAtkins: Do we censor them into e.g. U+FFFD<br>
&lt;fantasai> TabAtkins: or drop them entirely?<br>
&lt;fantasai> TabAtkins: I'd prefer to drop them, because it would more clearly result in invalid code<br>
&lt;fantasai> TabAtkins: so if we allow to work but censored it wouldn't prevent use in source text, which was the goal of i18n<br>
&lt;fantasai> TabAtkins: so would prefer to exclude from the ident production<br>
&lt;fantasai> +1<br>
&lt;tantek> +1 TabAtkins<br>
&lt;fantasai> Rossen_: [missed]<br>
&lt;fantasai> TabAtkins: No, would not be changing the existing censoring rules. Currently lone surrogates etc. do that<br>
&lt;fantasai> TabAtkins: Those are in there for UTF-8 well-formedness and C compatibility<br>
&lt;fantasai> TabAtkins: They have a reason to be censored out at technical low level<br>
&lt;fantasai> TabAtkins: these restrictions are for human reasons, so would restrict differently<br>
&lt;Rossen_> ack fantasai<br>
&lt;fantasai> fantasai: So should we resolve that they would make the production invalid? (That's what was proposed right?)<br>
&lt;TabAtkins> --(╯°□°)╯<br>
&lt;fantasai> TabAtkins: yes<br>
&lt;fantasai> TabAtkins: if you put this ^ as a custom property name, the degree sign is not a valid character<br>
&lt;fantasai> TabAtkins: so it would make an ident, a delim, a parenthesis, and a ???<br>
&lt;fantasai> TabAtkins: That's definitely not an ident, because it's multiple tokens not an ident token<br>
&lt;bmathwig> Is there a practical use case for doing something like that? Seems more like a developer having fun rather than good quality code.<br>
&lt;fantasai> TabAtkins: Proposed resolution is that it would break into multiple tokens<br>
&lt;fantasai> fantasai: What kind of token are these invalid characters going to be?<br>
&lt;fantasai> TabAtkins: DELIMs, one codepoint at a time<br>
&lt;fantasai> TabAtkins: Characters without a specific role are generally handled as DELIM<br>
&lt;fantasai> TabAtkins: and we only use certain DELIMs in certain places<br>
&lt;TabAtkins> the degree sign isn't a valid ident char under the HTML rules, so this would produce an ident, a delim containing the degree sign, an ident, a delim, and finally an ident<br>
&lt;fantasai> RESOLVED: Invalid ident characters are treated as DELIM tokens<br>
&lt;faceless> present-<br>
</details>
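
The third resolution can be sketched in the same spirit. The following Python snippet is a deliberately simplified splitter, not the CSS Syntax tokenizer: it groups consecutive allowed code points into an ident run and emits every other code point as a single-character delim. The range table is an assumption modelled on HTML's custom element name characters, and the function names are hypothetical. A real tokenizer also has dedicated tokens for parentheses, numbers, functions, whitespace and escapes, none of which are modelled here.

```python
# Assumed allow-list for non-ASCII ident code points (check against the HTML spec).
NON_ASCII_IDENT_RANGES = [
    (0x00B7, 0x00B7), (0x00C0, 0x00D6), (0x00D8, 0x00F6),
    (0x00F8, 0x037D), (0x037F, 0x1FFF), (0x200C, 0x200D),
    (0x203F, 0x2040), (0x2070, 0x218F), (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def is_ident_code_point(cp: int) -> bool:
    """Return True if the code point may appear unescaped in an ident (sketch)."""
    if cp < 0x80:
        ch = chr(cp)
        return ch.isalnum() or ch in "_-"
    return any(lo <= cp <= hi for lo, hi in NON_ASCII_IDENT_RANGES)


def split_runs(text: str) -> list:
    """Split text into ("ident", run) and ("delim", char) pairs (simplified)."""
    tokens = []
    run = []
    for ch in text:
        if is_ident_code_point(ord(ch)):
            run.append(ch)
            continue
        # A disallowed code point ends the current ident run, if there is one...
        if run:
            tokens.append(("ident", "".join(run)))
            run = []
        # ...and becomes its own delim, one code point at a time.
        tokens.append(("delim", ch))
    if run:
        tokens.append(("ident", "".join(run)))
    return tokens


for kind, value in split_runs("--foo°bar"):
    print(kind, repr(value))
```

With a degree sign in the middle of a would-be custom property name, the sketch yields an ident, a delim holding the degree sign, and a second ident, so the input is no longer a single ident token.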


-- 
GitHub Notification of comment by css-meeting-bot
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/7129#issuecomment-1069331981 using your GitHub account


Received on Wednesday, 16 March 2022 16:57:46 UTC