- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 23 Oct 2006 14:43:40 +0300
On Oct 18, 2006, at 01:27, Øistein E. Andersen wrote:

> I just tried to check out how custom element and attribute names
> work in current browsers and how they are supposed to work in
> HTML5, and some issues seem unclear to me.
...
> 4) Sivonen's HTML5 validator (http://hsivonen.iki.fi/validator/html5/
> as opposed to http://hsivonen.iki.fi/validator/) says:
>> Attribute name must not start with "xml".
> I fail to find any mention of this in the HTML5 draft. Has it been
> borrowed from X(HT)ML?
> Such a limitation would make it more difficult to create conformant
> legacy-compatible documents.

Any attribute or element not specifically allowed in the spec is
non-conforming. Therefore, all "custom attributes" and "custom
elements" are non-conforming.

Some non-conforming attributes are caught in the parser. Others are
caught on the RELAX NG level. This is an implementation detail.

The implementation detail becomes an issue only if you want to use the
conformance checker machinery with a custom schema. Using custom
schemas with the HTML parser is for experts only and produces very
wrong results unless the schema is suitable. Hence, I have not
optimized for that use case.

Please note that the parser is not a conforming HTML5 parser but a
special-purpose parser that is designed to work together with
particular RELAX NG schemas for the specific purpose of conformance
checking.

> 5) The same validator does not allow : or ? in either element or
> attribute names, whereas the current HTML5 draft seems to allow all
> Unicode characters except whitespace, <, >, = and /. Would someone
> please clarify this?

*Conforming* element names and attributes happen to consist of
ASCII-only name tokens without a colon. As an implementation detail,
names that do not have such a form are caught early by the
special-purpose parser.
This is done in order to
 1) prevent colonified names from entering into the namespace-aware
    SAX pipeline
 2) deal with case folding efficiently and in a way that prevents
    accidentally folding e.g. İNPUT to input
 3) prevent names that are not well-formed XML names from entering
    into the SAX pipeline

> 6) According to the current draft, authors seem to have the
> possibility to use custom element and attribute names of their choice.

Could you please cite the part of the spec that says so? Such usage
wasn't *conforming* when I last checked (a few months ago). Has the
spec changed in a dramatic way when I wasn't looking?

Note that not everything that results in a DOM according to the
parsing algorithm is conforming.

The conformance checker is foremost for checking conformance.
Supporting custom schemas for privately extended HTML5-like languages
is a nice feature to have, but personally I am not at all sympathetic
to extending HTML5 with names that contain non-ASCII (due to case
folding issues), non-XML characters (due to XML serializability
issues) or the colon (due to Namespaces in XML compatibility issues).

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
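
For illustration only, the early name check described in the message (ASCII-only name tokens without a colon, folded to lowercase without touching non-ASCII characters) could be sketched roughly as follows. This is not the validator's actual code: the function names `fold_ascii` and `check_name` and the exact set of allowed characters are assumptions made for the example.

```python
import re

# Assumption: a "name token" here is an ASCII letter or underscore followed
# by ASCII letters, digits, '.', '_' or '-'. The colon is deliberately
# excluded so that prefixed names never reach a namespace-aware pipeline.
_ASCII_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def fold_ascii(name: str) -> str:
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


def check_name(name: str):
    """Return the folded name, or None if the name should be rejected early."""
    if not _ASCII_NAME.match(name):
        return None  # non-ASCII, a colon, or otherwise not an XML-safe name
    return fold_ascii(name)
```

Because the folding step only ever maps A-Z, a name such as İNPUT cannot be turned into input by accident; in this sketch it is rejected before folding is attempted.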
Received on Monday, 23 October 2006 04:43:40 UTC