- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Sat, 4 Nov 2006 12:02:54 +0200
On Nov 3, 2006, at 23:40, Øistein E. Andersen wrote:

> On 31 Oct 2006, at 11:46AM, Henri Sivonen wrote:
>
>> If you add custom *elements* and use the HTML parser, the system
>> does not ensure that the custom elements would not adversely
>> interact with tag inference or error handling in browsers. [...]
>> If you add custom elements, you just have to know what you are
>> doing in order to keep the results useful for the purpose of
>> authoring for browsers.
>
> My idea was to allow custom elements only in contexts where such
> problems do not occur.

Right, but currently you have to know what those contexts are.

> I was therefore surprised to realise that the current HTML5 draft
> seems to allow any character except \0. Unless I have missed
> something, the character repertoire should probably be restricted
> somewhat, possibly to the common subset allowed in both HTML 4.01
> and XML 1.0.

I think conforming text/html documents should not be allowed to parse
into a DOM that contains characters that are not allowed in XML 1.0.

However, it is still necessary to decide what to do with
non-conforming cases in browsers. I am inclined to prefer replacement
with U+FFFD over putting stuff in the DOM that could not end up in
there by parsing an XML 1.0 document. But of course, there's the issue
of what existing browsers do already. :-/

>> Actually, I should have said that the minimum condition that I
>> think is necessary for a name of a custom attribute or element to
>> be reasonable is that the name matches the NCName production from
>> Namespaces in XML 1.0
>
> I agree with the intention that names should be restricted to
> (mostly) letters and digits, and this is probably the only usable
> definition we will get any time soon.

It's not because I want to restrict names to letters and digits. I
don't. It is for compatibility with the XML 1.0 serialization. I do
want conforming HTML5 documents to have an XHTML5 serialization.

>> and only contains characters from the Basic Latin (ASCII) block.
>
> Is this because of case-folding issues?

Yes.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
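
As a rough illustration of the two conditions discussed in this message, here is a minimal Python sketch. It is not taken from any draft or browser. The function names `is_reasonable_custom_name` and `replace_non_xml_chars` are hypothetical, and the regular expressions are my reading of the NCName production (Namespaces in XML 1.0) restricted to ASCII and of the Char production (XML 1.0).

```python
import re

# NCName restricted to Basic Latin: a letter or underscore first, then
# letters, digits, '.', '-' or '_'. No colon, since NCName excludes it.
# (Assumption: the ASCII block contains no CombiningChar or Extender.)
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

# Anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_reasonable_custom_name(name: str) -> bool:
    """True if name is an ASCII-only NCName (hypothetical helper)."""
    return _ASCII_NCNAME.fullmatch(name) is not None


def replace_non_xml_chars(text: str) -> str:
    """Replace characters not allowed in XML 1.0 with U+FFFD."""
    return _NOT_XML_CHAR.sub("\uFFFD", text)
```

Restricting names to ASCII sidesteps the case-folding question entirely, because only A-Z and a-z need to be folded. The replacement function shows only the U+FFFD option preferred above; what existing browsers actually do is a separate question.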
Received on Saturday, 4 November 2006 02:02:54 UTC