- From: Domenic Denicola <notifications@github.com>
- Date: Fri, 07 Jan 2022 08:22:14 -0800
- To: whatwg/dom <dom@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/dom/issues/849/1007541209@github.com>
Here is an initial stab at maximally-lenient rules that I think work. Someone double-checking would be great; if I can get confirmation that they seem right, I can probably spend some time on a spec PR.

- For element local names:
  - LenientElementNameStartChar := same as existing [NameStartChar](https://www.w3.org/TR/xml/#NT-NameStartChar). (Parser [only switches to tag name stage if given ASCII alpha as first character](https://html.spec.whatwg.org/#tag-open-state), so NameStartChar is more lenient than the parser.)
  - LenientElementNameChar := anything except tab, LF, FF, space, /, >, NULL. (This appears to be [what the parser accepts in the tag name state](https://html.spec.whatwg.org/#tag-name-state). NameChar also disallows all of these. The parser will lowercase ASCII upper alphas but we cannot do this in DOM APIs.)
- For element qualified names:
  - Get rid of existing validate step that uses QName.
  - Strictly split on :
  - Validate resulting localName per above rules
  - Validate resulting prefix via [Prefix](https://www.w3.org/TR/xml-names/#NT-Prefix), i.e. existing rules. (The parser does not ever create elements with prefixes so no need to make this more lenient.)
- For attribute local names:
  - LenientAttributeNameStartChar := anything except tab, LF, FF, space, /, >, NULL. ([Relevant parser spec](https://html.spec.whatwg.org/#before-attribute-name-state). NameStartChar also disallows all of these. The parser will lowercase ASCII upper alphas but we cannot do this in DOM APIs.)
  - LenientAttributeNameChar := LenientAttributeNameStartChar but also exclude =
- For attribute qualified names:
  - Similar formula as for element qualified names: strictly split on :, validate the resulting localName per above rules, validate the resulting prefix per the existing `Prefix` production.
  - The parser only creates attributes with a [small set of lowercase-ASCII prefixes](https://html.spec.whatwg.org/#adjust-foreign-attributes) so no need to make Prefix more lenient here either.
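To make the local-name rules above easier to double-check, here is a minimal sketch of them as loops over code points, in Python. It is not spec text. The function names are hypothetical, rejecting the empty string is an assumption (the grammar-style phrasing implies at least one start character), and qualified-name splitting and `Prefix` validation are left out.

```python
# XML NameStartChar ranges (https://www.w3.org/TR/xml/#NT-NameStartChar).
NAME_START_RANGES = [
    (0x3A, 0x3A),  # ":"
    (0x41, 0x5A),  # A-Z
    (0x5F, 0x5F),  # "_"
    (0x61, 0x7A),  # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Code points excluded by the proposal: tab, LF, FF, space, "/", ">", NULL.
EXCLUDED = frozenset("\t\n\f />\0")


def is_name_start_char(ch):
    """True if ch is in the XML NameStartChar production."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


def is_lenient_element_local_name(name):
    """LenientElementNameStartChar followed by LenientElementNameChar*."""
    # Assumption: the empty string is rejected.
    if not name or not is_name_start_char(name[0]):
        return False
    return all(ch not in EXCLUDED for ch in name[1:])


def is_lenient_attribute_local_name(name):
    """LenientAttributeNameStartChar followed by LenientAttributeNameChar*."""
    # Assumption: the empty string is rejected.
    if not name or name[0] in EXCLUDED:
        return False
    # Later characters additionally exclude "=".
    return all(ch not in EXCLUDED and ch != "=" for ch in name[1:])
```

Under these rules `is_lenient_element_local_name("a<b")` is true while `"1abc"` is not, and for attributes a leading `=` is accepted (`"=foo"`) while an interior one is not (`"a=b"`).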

Probably we should not touch custom element name rules. We could in theory make [PCENChar](https://html.spec.whatwg.org/multipage/custom-elements.html#prod-pcenchar) similarly lenient to LenientNameChar, but I'm not sure that leniency actually is a good idea for them, since `customElements.define()` basically gives us a single location at which to enforce good naming practices and, if you pass them, grant you custom element powers. It's not like the situation with parser-created vs. API-created.

Although I've phrased the above in terms of hypothetical grammar productions (e.g. LenientElementNameStartChar), the actual spec would probably be better as algorithms that loop over code units/code points, since that is how they're implemented. And per the OP of this thread, the current implementations have bugs, which I suspect might be due to the attempt at translating from grammar specifications into algorithms.

--
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/dom/issues/849#issuecomment-1007541209
Received on Friday, 7 January 2022 16:22:27 UTC