- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 11 Apr 2011 15:52:01 +0300
- To: public-html-xml@w3.org
On Tue, 2011-03-22 at 14:50 -0400, Norman Walsh wrote:
> I've just published the first draft of any actual substance:
>
>   http://www.w3.org/2010/html-xml/snapshot/report.html
>
> I encourage you to review it and send your comments to this list.

Thanks and sorry about the slow review. Some notes:

s/principle/principal/
s/impedement/impediment/
s/gauranteed/guaranteed/
s/internet/Internet/

I think the paragraph 'The HTML5 parser will be able to parse the XML so in principle there is no parsing problem except that the parser will build a DOM according to the HTML5 parsing rules. There may be be issues with namespaces, but this may "just work" for many scenarios even if the the result would not meet the expectations for those who appreciate the full power of XML.' gives the wrong idea of what happens. I think the foremost issue isn't different namespaces but <foo/> getting parsed as a start tag most of the time and name collisions with certain HTML elements causing interesting effects. I suggest removing the paragraph.

s/These rules will not always produce the same DOM that an XML parser would have produced./These rules will most often produce a DOM that is substantially different from the DOM that an XML parser would have produced./

This paragraph is strange: "There are still details of implementation to be considered in the case where HTML5 is represented with well-formed XML. Is the markup to be “clipped out” and handed to an HTML5 parser, or is the entire XML DOM going to be handed to the HTML5 engine?"

If there's an XHTML5 subtree in a larger XML document, involving an HTML parser would be the least preferred interface to an (X)HTML5 subsystem. I think the most preferable would be passing a fragment of the app's internal data model (e.g. an in-memory tree) to the subsystem.
If the subsystem interface wants a serialization and can ingest XHTML5, it would make more sense to use XHTML5 than HTML5 at the boundary to avoid the cases where some tree shapes don't round trip through the HTML serialization.

I think this paragraph steps outside the stated use case, since it introduces a container outside the XML document: "A third solution is to process the compound messages using MIME multipart/related semantics, perhaps through facilities such as [MTOM] or [XOP]. This is very much like the escaped markup case where downstream processing must be sophisticated enough to reconstruct the authors intent." If this solution is to be mentioned, it would make sense to mention .zip instead of the more esoteric archive formats.

This is incorrect: "What the HTML5 parser produces when it processes this script element is a script element node in the DOM which contains the escaped character representation of the XML." The text node content of the script node is not escaped. It is ready to be used as input to an XML parser.

Historical side note: This technique has been documented in the /TR/ space since 1998!
http://www.w3.org/TR/NOTE-xh#script-hack

> If it looks like we need to talk about any of them, I'll
> probably schedule a telcon for 12 April.

I can't make it to the potential telecon tomorrow. My regrets.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
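
To illustrate the script point in the message above (the text content of the script element is not escaped and can go straight into an XML parser), here is a minimal Python sketch. It is only an approximation: Python's html.parser is not a conforming HTML5 parser and is used here as a stand-in on the assumption that, like an HTML5 parser, it passes script content through as raw text without decoding character references. The sample document, the ScriptText class and the script_text helper are hypothetical names made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: an XML island carried inside an HTML script element.
HTML_DOC = """<!DOCTYPE html>
<html><head><title>demo</title>
<script type="application/xml"><order id="7"><item/><note>a &amp; b</note></order></script>
</head><body></body></html>"""


class ScriptText(HTMLParser):
    """Collects the raw text found inside script elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script content arrives as plain text; markup-looking
        # characters inside it are not tokenized as HTML tags.
        if self.in_script:
            self.chunks.append(data)


def script_text(html):
    """Return the concatenated text content of script elements in html."""
    parser = ScriptText()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# The script text is the XML source itself, character for character.
raw = script_text(HTML_DOC)

# It can therefore be handed to an XML parser without any unescaping step.
root = ET.fromstring(raw)
print(raw)
print(root.tag, [child.tag for child in root], root.find("note").text)
```

Note that the "&amp;" in the sample is still "&amp;" in the script text and is only resolved to "&" by the XML parser, which is the behaviour one would want when the script content is meant to be XML source.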
Received on Monday, 11 April 2011 12:52:34 UTC