- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 27 May 2009 18:12:47 +0000 (UTC)
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: Toby Inkster <tai@g5n.co.uk>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, HTMLWG WG <public-html@w3.org>

On Tue, 26 May 2009, Manu Sporny wrote:
>
> I don't believe that there is any such thing as an malformed XMLLiteral
> in HTML5... is there? Can anybody think of an example of an invalid
> XMLLiteral in an html5 parser?

If you're asking if an HTML5 parser can generate a DOM that cannot be
serialised as XML, the answer is yes, there are a number of ways to do
this.

The easiest way is for the text/html source to have an element or
attribute with a colon in it, as in:

   <html foo:bar>

Another possibility is a comment with two consecutive dashes:

   <!-- -- -->

Another example would be a form feed character (U+000C). For example, if
a plain text RFC is parsed as text/html, the resulting DOM would contain
U+000C characters that cannot be converted to XML.

If scripts have been able to mutate the DOM, there are even more ways for
the DOM to not be serialisable.

These issues are discussed in two places in HTML5. One is the rules for
coercing an HTML DOM to an Infoset:

   http://www.whatwg.org/specs/web-apps/current-work/#coercing-an-html-dom-into-an-infoset

The other is the rules for serialising a DOM to XML:

   http://www.whatwg.org/specs/web-apps/current-work/#serializing-xhtml-fragments

HTH,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
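
The three cases in the message above (a colon in a name, "--" inside a comment, and U+000C in text) can be illustrated with a rough Python sketch. It is not the HTML5 coercion or serialisation algorithm; the function names are hypothetical, the name check is deliberately simplified to the colon case (assuming namespace-aware XML, where a local name may not contain a colon), and the character check follows the XML 1.0 Char production.

```python
import re

# XML 1.0 "Char" production: tab, LF, CR, and most of Unicode except
# the remaining C0 controls, surrogates, U+FFFE and U+FFFF.
_XML_CHARS = re.compile(
    r"[\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)


def is_serialisable_text(data: str) -> bool:
    """True if every character in `data` is a legal XML 1.0 character."""
    return _XML_CHARS.fullmatch(data) is not None


def is_serialisable_comment(data: str) -> bool:
    """True if `data` could be the content of an XML comment.

    XML forbids "--" inside a comment and a trailing "-" before "-->".
    """
    return (
        is_serialisable_text(data)
        and "--" not in data
        and not data.endswith("-")
    )


def is_serialisable_local_name(name: str) -> bool:
    """Simplified check (assumption): only rejects empty names and names
    containing a colon. A full check would validate the name against
    the NCName production."""
    return bool(name) and ":" not in name


if __name__ == "__main__":
    # <html foo:bar> : the HTML parser yields a local name "foo:bar"
    print(is_serialisable_local_name("foo:bar"))
    # <!-- -- --> : the comment data is " -- "
    print(is_serialisable_comment(" -- "))
    # A plain text RFC containing a form feed between pages
    print(is_serialisable_text("page one\x0cpage two"))
```

Each of the three calls in the `__main__` block corresponds to one of the examples in the message and reports the value as not serialisable.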
Received on Wednesday, 27 May 2009 18:13:44 UTC