[whatwg] several messages about XML syntax and HTML5

From: Michel Fortin <michel.fortin@michelf.com>
Date: Mon, 4 Dec 2006 20:05:08 -0500
Message-ID: <C2798B1D-FC04-4060-8AFC-AF0D15301285@michelf.com>
On 4 Dec 2006, at 14:33, Martin Atkins wrote:

> Likewise, the content model of the <script> element is "hardcoded"  
> into the parser; there's no way to discover it from the syntax  
> alone. (I'll admit that there's no similar construct to the content  
> model of <script> in XML, however, so this particular difference  
> doesn't pose a problem.)
>
> In order to handle custom elements in HTML while still allowing  
> them to appear in the DOM, you'd have to make some rules such as  
> that no void elements are allowed. You'd have to write otherwise- 
> void elements as, say, <img></img> in order to have them handled  
> correctly by the parser.

It's interesting you mention <script>. If we want some sort of XML  
data island, we could use something like this:

<script type="text/xml">
   <xml-content/>
</script>

Then, after the content of <script> has been gathered, the browser  
could parse it as actual XML, stopping at the first parse error. You  
could even use JavaScript to gather the text from the DOM, parse the  
XML and create the DOM tree accordingly since the text content of the  
script is available in the DOM. The only requirement would be that  
the XML content does not include any "</script>" itself (see the note at
the end).
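
To make the two steps concrete (gather the text of the <script>, then hand it to a real XML parser that stops at the first error), here is a minimal sketch. It is written in Python with the standard library rather than in browser JavaScript, purely so it can be run standalone; the names `DataIslandCollector` and `parse_islands` are made up for this illustration and are not part of any proposal.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class DataIslandCollector(HTMLParser):
    """Collects the raw text of every <script type="text/xml"> element."""

    def __init__(self):
        super().__init__()
        self.islands = []    # raw text of each completed island
        self._buffer = None  # text chunks while inside an island, else None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "text/xml":
            self._buffer = []

    def handle_data(self, data):
        # The HTML parser treats <script> content as plain text, so the
        # XML markup arrives here unparsed.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.islands.append("".join(self._buffer))
            self._buffer = None


def parse_islands(html_text):
    """Return one parsed XML root per island, or None where parsing failed."""
    collector = DataIslandCollector()
    collector.feed(html_text)
    collector.close()
    roots = []
    for raw in collector.islands:
        try:
            roots.append(ET.fromstring(raw.strip()))
        except ET.ParseError:
            # The XML parser gives up at the first well-formedness error.
            roots.append(None)
    return roots
```

Calling `parse_islands` on a page containing the example above would yield a single element whose tag is `xml-content`; an island that is not well-formed yields `None` instead of a partial tree.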

I'm not sure if this is a plus or not, but it also seems that this  
syntax is supported by Internet Explorer's data islands[1], although  
I assume IE uses its own special parser mode for this. (But I'm just  
guessing here.)

So that's just an idea. I can't say I'm fond of XML data islands  
myself, nor of the idea of overloading <script> for this, but I think  
this method has the merit of being relatively simple to implement.

Note:
     Practically it seems to work with the parser, but it isn't valid
     since the spec says this about elements such as <script>:

     > CDATA elements can have text, but the text must not contain the
     > two character sequence "</".

     So for that to be conformant, this paragraph would need to be
     revised.
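
The gap between "works with the parser" and "is conformant" can be shown with two small checks. This is only a sketch: the function names are hypothetical, and the first check assumes the parser ends the element at a case-insensitive "</script", which is a simplification of what real parsers do.

```python
def breaks_out_of_script(xml_text):
    """True if the content would end the <script> element early
    (assumed rule: a case-insensitive "</script" terminates it)."""
    return "</script" in xml_text.lower()


def violates_cdata_rule(xml_text):
    """True if the content contains the two-character sequence "</",
    which the quoted spec text forbids inside CDATA elements."""
    return "</" in xml_text
```

For instance, `<a>x</a>` passes the first check (it does not break out of the script) yet fails the second, which is exactly the case the quoted paragraph currently rules out.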

[1]: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk/html/eb7a2b76-49e9-424c-aa5a-d3cbeeb745e3.asp


Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/
Received on Monday, 4 December 2006 17:05:08 UTC
