- From: Norman Walsh <ndw@nwalsh.com>
- Date: Tue, 28 Jun 2011 11:34:20 -0400
- To: public-html-xml@w3.org
- Message-ID: <m2tybadl6r.fsf@nwalsh.com>
Henri Sivonen <hsivonen@iki.fi> writes:
> I think the paragraph
>
> 'The HTML5 parser will be able to parse the XML so in principle there is
> no parsing problem except that the parser will build a DOM according to
> the HTML5 parsing rules. There may be be issues with namespaces, but
> this may "just work" for many scenarios even if the the result would not
> meet the expectations for those who appreciate the full power of XML.'
>
> gives the wrong idea of what happens. I think the foremost issue isn't
> different namespaces but <foo/> getting parsed as a start tag most of
> the time and name collisions with certain HTML elements causing
> interesting effects.
>
> I suggest removing the paragraph.

Ok.

> s/These rules will not always produce the same DOM that an XML parser
> would have produced./There rules with most often produce a DOM that is
> substantially different from the DOM that an XML parser would have
> produced./

I think I've addressed that in the course of implementing Robin's
comments, but please let me know if you disagree.

> This paragraph is strange:
> "There are still details of implementation to be considered in the case
> where HTML5 is represented with well-formed XML. Is the markup to be
> “clipped out” and handed to an HTML5 parser, or is the entire XML DOM
> going to be handed to the HTML5 engine?"
>
> If there's an XHTML5 subtree in a larger XML document, involving an HTML
> parser would be the least preferred interface to an (X)HTML5 subsystem.
> I think the most preferable would be passing a fragment of the app's
> internal data model (e.g. an in-memory tree) to the subsystem. If the
> subsystem interface wants a serialization and can ingest XHTML5, it
> would make more sense to use XHTML5 than HTML5 at the boundary to avoid
> the cases where some tree shapes don't round trip through the HTML
> serialization.
>
> I think this paragraph steps outside the stated use case, since it
> introduces a container outside the XML document:
> "A third solution is to process the compound messages using MIME
> multipart/related semantics, perhaps through facilities such as [MTOM]
> or [XOP]. This is very much like the escaped markup case where
> downstream processing must be sophisticated enough to reconstruct the
> authors intent."
> If this solution is to be mentioned, it would make sense to mention .zip
> instead of the more esoteric archive formats.

These two paragraphs are my attempt to incorporate text that we
reviewed in the use case:

  http://www.w3.org/wiki/HTML_XML_Use_Case_03

I'm a little reluctant to remove them without reviewing the use case
document first. These feel like substantive disagreements to a use
case that I thought we'd reached consensus about.

> This is incorrect:
> "What the HTML5 parser produces when it processes this script element is
> a script element node in the DOM which contains the escaped character
> representation of the XML."
> The text node content of the script node is not escaped. It is ready to
> be used as input to an XML parser.

Fixed.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh
Lead Engineer
MarkLogic Corporation
Phone: +1 413 624 6676
www.marklogic.com
Received on Tuesday, 28 June 2011 15:34:59 UTC