Re: [webcomponents] [Custom]: "are" custom element names ASCII characters, or MUST they be ASCII characters? (#239)

> Per the HTML parser a tag name has to start with `[a-z][A-Z]`. However, once you get to the "tag name state", anything goes, except for ASCII whitespace, "/", ">", and U+0000. 

That requirement doesn't exist in the XML parser, so I'm inclined to say we should get rid of it in XML documents, because it really doesn't meet author expectations in non-European languages.  This should be an important consideration in the parser extensibility issue #113.

Now, irrespective of HTML or XML documents, it doesn't make any sense to require `-` in the tag name when the tag name contains non-ASCII letters, since there is no conceivable way such a name would become a forward compatibility problem with future HTML specifications.

Again, my preference would be to require ASCII lowercase letters for the entire tag name in v1, and extend it carefully in the future, since in practice even authors in Japan, China, etc. are going to use alphanumeric tag names in HTML documents to be consistent with other builtin elements.

Having said all those things, I see two sensible options:
 1. Require that all characters in a custom element tag name be ASCII lowercase letters.
 2. Define a strict subset of what `document.createElement`, the HTML parser, and the XML parser all support, and then require that a custom element tag name consist only of names in that subset with a leading ASCII character, with the additional requirement that `-` be present when the tag name contains only alphanumeric characters.
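
To make the two options concrete, here is a rough TypeScript sketch of what each check might look like. Everything in it is hypothetical: the function names are made up, option 1 assumes the existing `-` requirement still applies (I didn't spell that out above), and option 2 uses only the HTML tokenizer constraints quoted at the top as a stand-in for the "strict subset", since that subset hasn't been defined and the XML parser and `document.createElement` restrictions are not modeled here.

```typescript
// Option 1 (sketch): only ASCII lowercase letters, plus "-".
// Assumption: the hyphen is still allowed and required, as in the current draft.
function satisfiesOption1(name: string): boolean {
  return /^[a-z][a-z-]*$/.test(name) && name.includes("-");
}

// Stand-in for the "strict subset" in option 2. This only encodes the HTML
// tokenizer rule quoted above: a leading ASCII letter, then anything except
// ASCII whitespace, "/", ">", and U+0000. A real definition would also have
// to intersect with what the XML parser and document.createElement accept.
const HTML_TAG_NAME_SKETCH = /^[A-Za-z][^\t\n\f\r \/>\0]*$/;

// Option 2 (sketch): names from the subset, with "-" required only when the
// name is made up entirely of ASCII alphanumerics (and hyphens).
function satisfiesOption2(name: string): boolean {
  if (!HTML_TAG_NAME_SKETCH.test(name)) {
    return false;
  }
  const asciiAlphanumericOnly = /^[A-Za-z0-9-]+$/.test(name);
  return asciiAlphanumericOnly ? name.includes("-") : true;
}
```

Under this sketch, `my-element` passes both checks, `myelement` fails both, and a name such as `x日本語` fails option 1 but passes option 2 without needing a `-`.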

---
Reply to this email directly or view it on GitHub:
https://github.com/w3c/webcomponents/issues/239#issuecomment-190879220

Received on Tuesday, 1 March 2016 20:01:46 UTC