Re: [csswg-drafts] [selectors-4] Clarify :lang() behavior when the language range is not a well-formed BCP 47 code (#8720)

@svgeesus @aphillips Yes, I recently updated those tests as part of the ongoing Chromium implementation.

My understanding is that:

- extended language ranges in the `:lang()` selector follow the syntax in [RFC4647 2.2](https://datatracker.ietf.org/doc/html/rfc4647#section-2.2);
- content language values (e.g. the `lang` attribute) follow "BCP 47" syntax, which here means specifically [RFC5646 2.1](https://datatracker.ietf.org/doc/html/rfc5646#section-2.1);
- matching follows the extended filtering algorithm in [RFC4647 3.3.2](https://datatracker.ietf.org/doc/html/rfc4647#section-3.3.2).

Values that are not well-formed according to their respective syntax would then never match, which would resolve this issue.

If that's the intention of the spec, then this sentence in the current text adds a contradictory restriction:

> Language ranges that are not well-formed language tags or which would not be a well-formed language tag if an initial wildcard character "*" were replaced with a valid subtag, do not match anything.

I am not sure that this restriction is needed.

For example, `fr-x` is a well-formed extended language range (RFC4647) but not a well-formed language tag (RFC5646), because a private-use `x` singleton must be followed by at least one subtag. Following the points above:

- `:lang(fr-x)` would **not** match an element with `lang="fr-x"` (not a well-formed tag: nothing follows `x`)
- `:lang(fr-x)` would match an element with `lang="fr-x-standard"` (well-formed tag)
- `:lang(fr-x)` would **not** match an element with `lang="fr-x-ninechars"` (not a well-formed tag: subtag longer than 8 characters)
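
To make the three cases above concrete, here is a rough Python sketch of how I read the combination of the two syntaxes and the extended filtering algorithm. It is not taken from the Chromium code or from the tests. The names `is_well_formed_range`, `is_well_formed_tag`, `extended_filter` and `lang_matches` are made up for illustration, and the tag pattern is a simplification that leaves out the grandfathered tags of RFC5646 2.1.

```python
import re

# RFC4647 2.2: extended-language-range
#   = (1*8ALPHA / "*") *("-" (1*8alphanum / "*"))
RANGE = re.compile(r"(?:[A-Za-z]{1,8}|\*)(?:-(?:[A-Za-z0-9]{1,8}|\*))*")

# RFC5646 2.1 "langtag" production (assumption: grandfathered tags omitted).
LANGTAG = re.compile(
    r"""
    (?:[A-Za-z]{2,3}(?:-[A-Za-z]{3}){0,3}            # language + extlang
      |[A-Za-z]{4,8})                                # longer language subtag
    (?:-[A-Za-z]{4})?                                # script
    (?:-(?:[A-Za-z]{2}|[0-9]{3}))?                   # region
    (?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*   # variants
    (?:-[0-9A-WY-Za-wy-z](?:-[A-Za-z0-9]{2,8})+)*    # extensions
    (?:-[xX](?:-[A-Za-z0-9]{1,8})+)?                 # private use
    """,
    re.VERBOSE,
)

# RFC5646 2.1 "privateuse" production used as a whole tag, e.g. "x-foo".
PRIVATEUSE = re.compile(r"[xX](?:-[A-Za-z0-9]{1,8})+")


def is_well_formed_range(language_range: str) -> bool:
    return RANGE.fullmatch(language_range) is not None


def is_well_formed_tag(tag: str) -> bool:
    return (LANGTAG.fullmatch(tag) or PRIVATEUSE.fullmatch(tag)) is not None


def extended_filter(language_range: str, tag: str) -> bool:
    """Extended filtering as described in RFC4647 3.3.2 (case-insensitive)."""
    r = language_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match unless the range starts with "*".
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1              # wildcard: move to the next range subtag
        elif ti >= len(t):
            return False         # tag ran out of subtags
        elif r[ri] == t[ti]:
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:
            return False         # a singleton in the tag cannot be skipped
        else:
            ti += 1              # skip this tag subtag and keep looking
    return True


def lang_matches(language_range: str, tag: str) -> bool:
    # Values that are not well-formed in their own syntax never match.
    return (
        is_well_formed_range(language_range)
        and is_well_formed_tag(tag)
        and extended_filter(language_range, tag)
    )


for tag in ("fr-x", "fr-x-standard", "fr-x-ninechars"):
    print(tag, lang_matches("fr-x", tag))
```

Under these assumptions the loop should print `False`, `True` and `False`, in that order. Note that there is no separate check that the range is also a well-formed language tag, which is why `fr-x` is still accepted as a range.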


-- 
GitHub Notification of comment by felipeerias
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/8720#issuecomment-4313197993 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Friday, 24 April 2026 12:37:37 UTC