- From: Shelley Powers <shelleyp@burningbird.net>
- Date: Sun, 06 Sep 2009 15:27:42 -0500
- To: public-html@w3.org
I had an interesting Twitter exchange with Henri[1] about validator.nu's handling of an HTML5 document with inline SVG. I have two documents, both HTML5, one served as HTML[2], the other as application/xhtml+xml[3].

The HTML document throws several errors in validator.nu. One is the presence of an SVG element in a paragraph. According to Henri, this error is really a form of warning, since no browser currently supports SVG in HTML. However, the Firefox nightly will support SVG in HTML5 if you set the html5.enabled configuration option to true. Regardless, this isn't an error, but it's also not what concerns me.

The SVG is valid SVG/XML, copied as one would find SVG in the wild. It references several vocabularies with given namespaces, including Dublin Core, Creative Commons, etc. All of the RDF annotation is within an SVG metadata element. Again, nothing in what I described is unusual.

Henri's validator ignores the RDF/XML in the XHTML document, which is fine. But the validator throws several errors related to the RDF/XML in the HTML document. When I asked him about it on Twitter, he responded with, "Also, the dc:foo stuff is not even supposed to be valid in text/html". Yet there's nothing that I could find in the HTML5 specification that states this.

More importantly, this is an HTML5 failure in waiting, because if people inline SVG, chances are they will inline whatever SVG they find in the wild, which may or may not include RDF/XML. Validly included, may I add, and in fact recommended when it comes to annotating Creative Commons license info.

When it comes to parsing the page contents, the HTML5 specification does reference two documents, the HTMLDocument and the SVGDocument, and states that when the SVG element is parsed, its contents are added to the SVGDocument. In XHTML, currently, the inline SVG would be merged into the overall page Document, for a combined model.
It sounds like the same is happening with HTML in HTML5, except that there's nothing in the spec that discusses what happens if the SVG contains valid XML from other vocabularies. It doesn't discuss what happens with the namespaces within the SVG, and how these would be handled in the DOM.

I had assumed that the SVG would be turned over to the browser's SVG parsing engine, which would operate the same regardless of whether the SVG is in an HTML document or an XHTML document. However, as Lachlan just noted[4], there's a significant gap in understanding about how SVG/XML is handled in XHTML as compared to HTML.

We know what's supposed to happen if crappy markup gets embedded in SVG: the user agents are supposed to provide a facility that allows the user to extract valid XML, which means they will have to correct the crappy markup. It's not an elegant solution, since people will probably try to copy and paste the SVG from source, which means they could be copying and pasting crappy markup. But there's nothing definitive that I could find (I could have easily missed it) about well-formed SVG/XML in HTML, and what happens from both a DOM POV and a validator POV.

My first inclination was to file a bug about the lack of specificity, but I thought I would first post a note to the group, to see if I had missed sections in the spec that explain this, and to see other points of view on how the SVG/XML should be handled.

I decided to see what happens with a browser that actually supports HTML5 with inline SVG. I downloaded the latest Firefox nightly and enabled HTML5. I then added script to both the HTML and the XHTML versions of the files that accesses the red SVG circle in the document, which doesn't use the default SVG namespace, and also accesses all dc:title elements. I then printed out several key values related to HTML/XHTML differences when it comes to elements/attributes and namespaces.
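
The test script itself isn't reproduced in this message. As a rough sketch only, a check of that kind could look something like the following. The helper names (describeNode, unprefixedName) and the two sample objects are made up for illustration, the XHTML-like values are assumed from ordinary namespace-aware DOM behaviour, and none of this is the code actually used on the test pages.

```javascript
// Sketch only: describeNode and unprefixedName are hypothetical helper names,
// not part of any DOM API or of the test pages referenced in this message.
const SVG_NS = "http://www.w3.org/2000/svg";
const DC_NS = "http://purl.org/dc/elements/1.1/";

// Collect the namespace-related DOM properties of an element-like object.
function describeNode(node) {
  return {
    nodeName: node.nodeName,
    namespaceURI: node.namespaceURI,
    prefix: node.prefix,
    localName: node.localName,
  };
}

// Return the element name without any prefix, whether the parser split the
// prefix off (localName "title") or left it in place (localName "dc:title").
function unprefixedName(node) {
  const name = node.localName || node.nodeName;
  const colon = name.indexOf(":");
  return colon === -1 ? name : name.slice(colon + 1);
}

// Illustrative stand-ins for a dc:title element as each kind of parser might
// expose it. These are assumed values, not captured browser output.
const xhtmlLike = { nodeName: "dc:title", namespaceURI: DC_NS, prefix: "dc", localName: "title" };
const htmlLike = { nodeName: "dc:title", namespaceURI: SVG_NS, prefix: null, localName: "dc:title" };

for (const node of [xhtmlLike, htmlLike]) {
  console.log(describeNode(node), unprefixedName(node));
}

// In a browser, the same helpers could be fed real elements, for example:
//   Array.from(document.getElementsByTagName("dc:title")).map(describeNode)
```

The helpers take plain objects as well as real elements, so the comparison can be tried outside a browser before pointing it at an actual page.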
If you load the XHTML page in any SVG-enabled browser and click the circle, you'll see that attributes such as namespaceURI et al are set. To be expected. Load the HTML document in the FF nightly, though, and again you see what is expected in an HTML document at this time, which doesn't acknowledge namespaces: the namespaceURI value is set to the SVG's default namespace, the prefix is null, and the localName is set to "dc:title". This is the DOM difference that Henri discusses.

At the same time, though, you can use the same functionality to access "dc:title", regardless of whether the document is XHTML or HTML. In fact, the DOM differences are pretty trivial -- no more than what we've had to deal with when it comes to Ajax applications and differences with XMLHttpRequest, or how we still have to manage differences in event listeners -- test for a value, and act accordingly.

My preference would be to elegantly manage namespaces for both XHTML and HTML in such a way that we don't have these DOM differences. However, evidently this causes other problems, so I'm not going to push the issue.

Regardless, within the SVGDocument, we should allow XML without throwing conformance errors. Given that the Firefox nightly has no problems with RDF/XML within the SVG element, and given that differences in the DOM are not significant between the two, I don't know why we would not support "dc:foo" in SVG. In fact, I think not supporting valid XML within the SVG is sufficient to trigger very serious concerns about support for inline SVG in HTML, and hence very serious concerns about HTML5.

Perhaps what the validator should do is throw an informative warning, telling the person that there are DOM differences between how namespaces are handled in an HTML document as compared to XHTML documents. Still, I am hesitant about this, because this really only matters to those who need to access the DOM, and that's not the majority of web authors.
We should be wary of injecting too many warnings, as we'll overwhelm the average web page author. A better approach would be to provide documentation to JS developers. In fact, some tweaks of most of the popular JS libraries would hide even these minor differences.

Shelley

[1] http://twitter.com/hsivonen
[2] http://burningbird.net/newbook/testhtml5.php
[3] http://burningbird.net/newbook/testhtml5.xhtml
[4] http://lists.w3.org/Archives/Public/public-html/2009Sep/0307.html
Received on Sunday, 6 September 2009 20:28:26 UTC