- From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
- Date: Fri, 31 Dec 2010 12:09:07 +0000
- To: Norman Walsh <ndw@nwalsh.com>
- Cc: public-html-xml@w3.org
On Thu, Dec 30, 2010 at 9:19 PM, Norman Walsh <ndw@nwalsh.com> wrote:
> 1. I have an XML toolchain and I want to consume HTML5 because I'd
> like to process HTML5 using XML tools.

[snip]

> HTML5
> parsers that produce a stream of well-formed events suitable for
> constructing XML already exist, so this looks like a mostly solved
> parsing problem.

[snip]

> It may be necessary/useful/convenient to shuffle
> namespaces a bit in the parsed content, for example to put SVG and
> MathML back in their respective namespaces so that your existing XML
> tools will do the right thing.

For conforming HTML5 markup, where does the text/html parsing algorithm
not already put SVG and MathML in their respective namespaces?

> 2. I have an HTML5 toolchain and I want to consume XML because I'd
> like to process XML using HTML5 tools.

Why would one want to do this?

The HTML5 parsing algorithm has been tuned for parsing text/html content
found in the wild, not arbitrary malformed XML like one finds in feeds
and publisher-side data exchange.

If you want to consume wild malformed XML, aren't you better off with an
XML5-like parser?

If you want to consume well-formed XML, aren't you better off with a
normal XML parser?

> A simpler subset of XML might be created to make life easier for the
> cases that would be covered by such a subset.

Unless the HTML5 algorithm is changed it would still end up in the wrong
namespace, no?

> 3. I have an XML document and I want to embed islands of human prose
> marked up with HTML5 in it because I want to be able to extract
> those sections for use in, for example, documentation.
>
> If you expect the document to remain well-formed XML, you'll have to
> author with XHTML5 and then there won't be any parsing problems.

Another option is to roundtrip HTML in CDATA like Atom feeds.

> 4. I have an HTML5 document and I want to embed islands of XML in it
> because I want to be able to write JavaScript and CSS to manipulate
> those elements, for example, in the browser.

Can you elaborate on this use case? What are we really talking about and
why? What are some example end-user problems this would solve? Might
there be other (better?) ways to solve them?

By "islands of XML" do we mean round-tripping information in XML for
client-side processing? Or do we mean a text/html document that contains
a mixture of HTML/MathML/SVG semantics and elements with other arbitrary
semantics? There's a big difference between the two.

Round-tripping /information/ that could be expressed in XML can already
be done using RDFa or microdata annotations on top of generically
understood HTML/SVG/MathML semantics.

Round-tripping a blob of XML can be accomplished unescaped with the
"script" element, with the single restriction that the content cannot
contain the string "</script>" (case insensitive), or HTML-escaped inside
a data-* attribute, param value attribute, or input type="hidden" value
attribute. (It's also done in the wild with comments, with the
restriction that the content cannot contain the string "--".)

On the other hand, including arbitrary markup inside a text/html document
would damage the RESTful architecture of the web because the media type
text/html could no longer be understood in terms of generic semantics
like "h1", "mtext", and "rect" (defined by HTML5, MathML, and SVG
respectively). Authors would presumably break separation of concerns by
trying to hack in functionality and presentation using CSS and JS and (if
users were lucky, which they usually aren't) patch up accessibility with
WAI-ARIA: making content and functionality less robust (greater risk of
intranet security prohibitions, network failures, coding errors, varying
levels of implementation support), preventing users skinning content to
suit their needs and preferences, and forcing users to put themselves at
risk by executing untrusted code just to gain access to basic content and
functionality.

Sell me on why a standards organization committed to delivering end-users
an interoperable, accessible, skinnable, safe web experience would want
to support such usage.

--
Benjamin Hawkes-Lewis
Received on Friday, 31 December 2010 12:09:42 UTC
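
The two blob round-tripping options named in the message (unescaped
inside a "script" element, or HTML-escaped inside a data-* attribute)
might look roughly like the sketch below. It is illustrative only and not
part of the original post. The function names, the "payload" id, the
"span" carrier element, and the type="application/xml" value are all
assumptions, and Python's html.parser stands in for a browser parser. The
guard is slightly stricter than the restriction described: it rejects any
"</script" prefix rather than only the full "</script>" string.

```python
import html
from html.parser import HTMLParser


def xml_in_script(xml: str, element_id: str = "payload") -> str:
    """Embed XML unescaped in a script data block (hypothetical helper)."""
    # Assumption: reject any "</script" prefix, case-insensitively. This is
    # stricter than the "</script>" restriction described in the message.
    if "</script" in xml.lower():
        raise ValueError("payload would terminate the script element early")
    safe_id = html.escape(element_id, quote=True)
    # Assumption: a non-JavaScript type so the content is treated as data.
    return f'<script type="application/xml" id="{safe_id}">{xml}</script>'


def xml_in_data_attribute(xml: str, name: str = "data-payload") -> str:
    """Embed XML HTML-escaped in a data-* attribute (hypothetical helper)."""
    return f'<span {name}="{html.escape(xml, quote=True)}"></span>'


class Extractor(HTMLParser):
    """Collects script text and data-* attribute values while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.in_script = False
        self.script_text = []
        self.data_attrs = {}

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        for key, value in attrs:
            if key.startswith("data-"):
                # Attribute values arrive already unescaped.
                self.data_attrs[key] = value

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script content is delivered raw, possibly in several chunks.
        if self.in_script:
            self.script_text.append(data)


def extract(markup: str) -> Extractor:
    """Parse markup and return the populated extractor."""
    parser = Extractor()
    parser.feed(markup)
    parser.close()
    return parser
```

For example, "".join(extract(xml_in_script(blob)).script_text) is meant to
give back the original blob, and
extract(xml_in_data_attribute(blob)).data_attrs["data-payload"] likewise.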