Re: [csswg-drafts] [selectors-4] Clarify :lang() behavior when the language range is not a well-formed BCP 47 code (#8720)

@felipeerias, I tend to agree: garbage into the match algorithm produces the right level of not matching without the need for additional work by the spec or implementers. Your tests look good to me.

(Note: The text you quoted would forbid the range `fr-x` from matching because `fr-x` is not a well-formed tag. This is an error in the text I suggested.)

RFC 4647 extended filtering does **not** require that implementations check whether language **_tags_** are well-formed (and certainly not whether they are valid). A plain implementation of extended filtering would therefore tend to make all three of your examples "match", even though the language tags in the first and third examples are not well-formed. However, CSS has chosen to canonicalize language tags, which implies that non-well-formed tags would fail to match:

> Both the content language and the language range must be canonicalized and converted to extlang form as per section 4.5 of [[RFC5646]](https://www.w3.org/TR/selectors-4/#biblio-rfc5646) prior to the extended filtering operation.

This requirement effectively rejects not only badly formed ranges (`en=fubar`, `åå`, `fr-ninechars`), but also badly formed tags/ranges (`de-de-latn`, `*-10-oboe`) and probably invalid tags.

Normally, extended filtering does not prevent non-well-formed tags from being matched. Nor does it prevent non-well-formed (as language tags) ranges from matching tags (including both well-formed and non-well-formed ones).
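
To illustrate, here is a minimal Python sketch of the extended filtering steps in RFC 4647 section 3.3.2, with no canonicalization or well-formedness check in front of it. The function name `extended_filter` is hypothetical, and `str.lower()` stands in for the RFC's case-insensitive comparison. This is not what any browser implements; it only shows why bare filtering accepts the strings discussed here.

```python
def extended_filter(language_range: str, tag: str) -> bool:
    """Sketch of RFC 4647 section 3.3.2 extended filtering.

    Performs no well-formedness or validity checks on either argument.
    """
    # Step 1: split on hyphen; lower() approximates case-insensitive comparison.
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # Step 2: the first subtags must match (or the range starts with "*").
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False

    r, t = 1, 1
    # Step 3: consume the remaining range subtags.
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            r += 1                      # 3A: a wildcard is simply skipped
        elif t >= len(tag_subtags):
            return False                # 3B: the tag ran out of subtags
        elif range_subtags[r] == tag_subtags[t]:
            r += 1                      # 3C: match, advance both lists
            t += 1
        elif len(tag_subtags[t]) == 1:
            return False                # 3D: cannot skip past a singleton
        else:
            t += 1                      # 3E: skip this tag subtag
    # Step 4: the range is exhausted, so the match succeeds.
    return True


print(extended_filter("de-*-DE", "de-Latn-DE"))         # True
print(extended_filter("de-DE", "de-x-DE"))              # False (singleton "x")
print(extended_filter("fr-ninechars", "fr-ninechars"))  # True
print(extended_filter("*-10-oboe", "de-10-oboe"))       # True
print(extended_filter("en=fubar", "en=fubar"))          # True
```

The last three calls match even though neither the range nor the tag is well-formed, because the algorithm only compares hyphen-separated subtags.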

Upon reflection, I think this additional well-formedness/validity requirement text should go away. But the implications of the canonicalization step need to be spelled out. Perhaps:

> Both the content language and the language range must be canonicalized and converted to extlang form as per section 4.5 of [[RFC5646]](https://www.w3.org/TR/selectors-4/#biblio-rfc5646) prior to the extended filtering operation. Language tags that are not valid cannot be matched and language ranges that cannot be canonicalized do not match anything.




-- 
GitHub Notification of comment by aphillips
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/8720#issuecomment-4314662241 using your GitHub account



Received on Friday, 24 April 2026 16:19:07 UTC