More on SVG within HTML pages

I had an interesting Twitter exchange with Henri[1] about 
validator.nu's handling of an HTML5 document with inline SVG.

I have two documents, both HTML5, one served up as HTML[2], the other as 
application/xhtml+xml[3].

The HTML document throws several errors in validator.nu. One is the 
presence of an SVG element in a paragraph. According to Henri, this 
error is really meant as a form of warning, since no browser currently 
supports SVG in HTML. However, the Firefox nightly will support SVG in 
HTML5 if you set the html5.enabled configuration option to true. 
Regardless, this isn't an error, but it's also not what concerns me.

The SVG is valid SVG/XML, copied as one would find SVG in the wild. It 
references several vocabularies with given namespaces, including Dublin 
Core, Creative Commons, etc. All of the RDF annotation is within an SVG 
metadata element. Again, nothing in what I described is unusual.

Henri's validator ignores the RDF/XML in the XHTML document, which is 
fine. But the validator throws several errors related to the RDF/XML in 
the HTML document. When I asked him about it on Twitter, he responded 
with, "Also, the dc:foo stuff is not even supposed to be valid in 
text/html".

Yet there's nothing that I could find in the HTML5 specification that 
states this. More importantly, this is an HTML5 failure in waiting, 
because if people inline SVG, chances are they will inline whatever SVG 
they find in the wild, which may or may not include RDF/XML. Validly 
included, may I add, and in fact recommended when it comes to annotating 
Creative Commons license info.

When it comes to parsing the page contents, the HTML5 specification does 
reference two documents, the HTMLDocument, and the SVGDocument, and 
states that when the SVG element is parsed, its contents are added to 
the SVGDocument. In XHTML, currently, the inline SVG would be merged 
into the overall page Document, for a combined model. It sounds like the 
same is happening with HTML in HTML5, except that there's nothing in the 
spec that discusses what happens if the SVG contains valid XML from 
other vocabularies. It doesn't discuss what happens with the namespaces 
within the SVG, and how these would be handled in the DOM.

I had assumed that the SVG would be turned over to the SVG parsing 
engine for the browser, which would operate the same regardless of 
whether the SVG is in an HTML document, or an XHTML document. However, 
as Lachlan just noted[4], there's a significant gap in understanding 
about how SVG/XML is handled in XHTML as compared to HTML.

We know what's supposed to happen if crappy markup gets embedded in SVG: 
the user agents are supposed to provide a facility that allows the user 
to extract valid XML, which means they will have to correct the crappy 
markup. It's not an elegant solution, since people will probably copy 
and paste the SVG from source, which means they could be copying and 
pasting crappy markup. But there's nothing definitive that I could find 
(I could have easily missed it) about well-formed SVG/XML in HTML, and 
what happens from both a DOM POV and a validator POV.

My first inclination was to file a bug about the lack of specificity, 
but I thought I would first post a note to the group, to see if I had 
missed sections in the spec that explain this, and to see other points 
of view on how the SVG/XML should be handled.

I decided to see what happens with a browser that actually supports 
HTML5 with inline SVG. I downloaded the latest Firefox nightly and 
enabled HTML5. I then added script to both the HTML and the XHTML 
versions of the files that accesses the red SVG circle in the document, 
which doesn't use the default SVG namespace, and also accesses all 
dc:title elements. I then printed out several key values related to 
HTML/XHTML differences when it comes to elements/attributes and 
namespaces.

If you load the XHTML page in any SVG-enabled browser, and click the 
circle, you'll see that attributes such as namespaceURI et al are set. 
To be expected. Load the HTML document, though, in the FF nightly, and 
again, you see what is expected in an HTML document at this time, which 
doesn't acknowledge namespaces: the namespaceURI value is set to the 
SVG's default namespace, the prefix is null, and the localName is set to 
"dc:title".
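
To make the comparison concrete, here is a rough sketch of the kind of 
report I mean. It is not the script from the test pages. The NameInfo 
shape and the describeName helper are names invented for illustration, 
and the XHTML values assume the dc prefix is bound to the Dublin Core 
elements namespace, which is the usual binding but is an assumption here.

```typescript
// Hypothetical shape holding the three DOM properties discussed above.
// A real DOM Element exposes properties with these same names.
interface NameInfo {
  namespaceURI: string | null;
  prefix: string | null;
  localName: string;
}

// Hypothetical helper: format the three values on one line for display.
function describeName(node: NameInfo): string {
  return `namespaceURI=${node.namespaceURI} prefix=${node.prefix} localName=${node.localName}`;
}

// Values as described above for dc:title in the text/html document.
const htmlCase: NameInfo = {
  namespaceURI: "http://www.w3.org/2000/svg",
  prefix: null,
  localName: "dc:title",
};

// Values a namespace-aware XML parse would give for the same element.
// Assumption: the dc prefix is bound to the Dublin Core elements namespace.
const xhtmlCase: NameInfo = {
  namespaceURI: "http://purl.org/dc/elements/1.1/",
  prefix: "dc",
  localName: "title",
};

console.log(describeName(htmlCase));
console.log(describeName(xhtmlCase));
```

In a page, the same helper could be handed the element from a click 
handler instead of the hand-built objects used here.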

This is the DOM difference that Henri discusses. At the same time, 
though, you can use the same functionality to access "dc:title", 
regardless of whether the document is XHTML or HTML. In fact, the DOM 
differences are pretty trivial -- no more than what we've had to deal 
with when it comes to Ajax applications and differences with 
XMLHttpRequest, or how we still have to manage differences in event 
listeners -- test for a value, and act accordingly.
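
The "test for a value, and act accordingly" approach might look 
something like the following sketch. The QNameParts shape, the DC_NS 
constant, and the isDcTitle helper are hypothetical names, and the 
Dublin Core namespace binding is an assumption about the document, not 
something the DOM guarantees.

```typescript
// Hypothetical shape: just the properties the check needs.
interface QNameParts {
  namespaceURI: string | null;
  prefix: string | null;
  localName: string;
}

// Assumption: dc is bound to the Dublin Core elements namespace.
const DC_NS = "http://purl.org/dc/elements/1.1/";

// Hypothetical helper: true for a dc:title element under either parse.
function isDcTitle(node: QNameParts): boolean {
  // Namespace-aware (XHTML) case: match on namespace plus local name.
  if (node.namespaceURI === DC_NS && node.localName === "title") {
    return true;
  }
  // text/html case as observed in the nightly: the prefix is not split
  // off, so the whole string "dc:title" ends up in localName.
  return node.prefix === null && node.localName === "dc:title";
}

// Sample nodes: one per parsing mode, plus a plain SVG title for contrast.
const samples: QNameParts[] = [
  { namespaceURI: DC_NS, prefix: "dc", localName: "title" },
  { namespaceURI: "http://www.w3.org/2000/svg", prefix: null, localName: "dc:title" },
  { namespaceURI: "http://www.w3.org/2000/svg", prefix: null, localName: "title" },
];

const dcTitles = samples.filter(isDcTitle);
console.log(dcTitles.length);
```

The namespace check comes first so that a document using a different 
prefix for Dublin Core still matches in XHTML. The fallback only covers 
the literal "dc:title" spelling seen in the HTML case.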

My preference would be to elegantly manage namespaces for both XHTML 
and HTML in such a way that we don't have these DOM differences. 
However, evidently this causes other problems, so I'm not going to push 
the issue.

Regardless, within the SVGDocument, we should allow XML without throwing 
conformance errors. Given that the Firefox nightly has no problems with 
RDF/XML within the SVG element, and given that differences in the DOM 
are not significant between the two, I don't know why we would not 
support "dc:foo" in SVG. In fact, I think not supporting valid XML 
within the SVG is sufficient to trigger very serious concerns about 
support for inline SVG in HTML, and hence very serious concerns about HTML5.

Perhaps what the validator should do is throw an informative warning, 
telling the person that there are DOM differences between how namespaces 
are handled in an HTML document as compared to XHTML documents. Still, I 
am hesitant about this, because this really only matters to those who 
need to access the DOM, and that's not the majority of web authors. We 
should be wary of injecting too many warnings, as we'll overwhelm the 
average web page author. A better approach would be to provide 
documentation to JS developers. In fact, some tweaks of most of the 
popular JS libraries would hide even these minor differences.

Shelley

[1] http://twitter.com/hsivonen
[2] http://burningbird.net/newbook/testhtml5.php
[3] http://burningbird.net/newbook/testhtml5.xhtml
[4] http://lists.w3.org/Archives/Public/public-html/2009Sep/0307.html

Received on Sunday, 6 September 2009 20:28:26 UTC