SVG and MathML in text/html

This is an email copy of a comment posted at
(reformatted and typo-fixed for email)

Jacques Distler wrote:
> Perhaps because they had the impression that those discussions  
> weren’t going anywhere.

One crucial piece of information that the discussion needed was a  
thorough description of how namespaces in text/html work in IE 7.0.  
Microsoft could have provided that information but they didn’t.

Moreover, the (non-)communication pattern is the same in at least  
three W3C Working Groups (HTML WG, PFWG and WAF WG), so I don’t think  
the general pattern can be blamed on the appearances of any particular  
discussion in the HTML WG.

Hixie wrote:
> As I said last time you complained about this, namespaced content in  
> HTML is not a high priority,

Well, now at least one browser implementor is showing interest in the  
matter (in addition to the previous interest from writers of  
standalone parsers).

> A detailed list of use cases and use scenarios detailing what we  
> want to handle.

Use cases:

  * Converting a typical LaTeX paper to text/html such that
    everything that wouldn’t get bitmapped in a pdfLaTeX workflow
    does not get bitmapped.

  * Writing a similar document into text/html in a text editor
    copying and pasting the SVG figures from Inkscape XML output.

  * Making Flash-like visually “high-impact” (sorry about the
    marketing BS term) sites using the openly specified Web
    platform but without the Draconianness of XML in such a
    way that the whole thing uses retained-mode graphics and
    lives in one DOM for easy scripting (i.e. no need for
    scripts to deal with object or iframe sub-DOMs).

  * Publishing the kind of content that is published on using a
    legacy PHP content management system that is not XML-ready.

The technical requirements that arise out of the above use cases are:

  * Establishing a pseudo-XML parsing scope for <svg> and <math>.

  * Putting elements in the SVG and MathML namespaces in the
    DOM in <svg> and <math> scopes, respectively.

  * Establishing a nested "in body"-like parsing scope in
    foreignObject and annotation-xml.

  * MathML entities in the pseudo-XML parsing scope (have
    the tree builder toggle a flag in the tokenizer).

  * CDATA sections in the pseudo-XML parsing scopes (have the
    tree builder toggle a flag in the tokenizer).

  * Special-case XLink attributes.
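
To make the scoping requirements above concrete, here is a minimal
sketch of how a tree builder could pick a namespace for each start
tag. It is an illustration under assumptions, not a proposed
algorithm: the event format and the names assign_namespaces,
SCOPE_ROOTS and NESTED_HTML_SCOPES are hypothetical, tag names are
assumed to arrive in their canonical case, the input is assumed to be
properly nested, and the tokenizer flags for entities and CDATA
sections are not modeled.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Elements that open a pseudo-XML scope when seen in HTML content.
SCOPE_ROOTS = {"svg": SVG_NS, "math": MATHML_NS}

# Elements whose children go back to "in body"-like HTML parsing.
NESTED_HTML_SCOPES = {(SVG_NS, "foreignObject"), (MATHML_NS, "annotation-xml")}


def assign_namespaces(events):
    """Return (namespace, name) for every start tag in `events`.

    `events` is a list of ("start", name) / ("end", name) tuples
    (a hypothetical format chosen for this sketch).
    """
    # Each stack entry is the namespace that applies to the children
    # of the corresponding open element.
    stack = [HTML_NS]
    result = []
    for kind, name in events:
        if kind == "start":
            current = stack[-1]
            if current == HTML_NS and name in SCOPE_ROOTS:
                ns = SCOPE_ROOTS[name]  # entering an svg or math scope
            else:
                ns = current  # inherit from the enclosing scope
            result.append((ns, name))
            # foreignObject and annotation-xml hand their children
            # back to HTML; everything else passes its own namespace on.
            stack.append(HTML_NS if (ns, name) in NESTED_HTML_SCOPES else ns)
        else:
            stack.pop()
    return result
```

With this sketch, a p inside foreignObject inside svg ends up in the
HTML namespace, while foreignObject itself and its SVG siblings stay
in the SVG namespace.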

> do we really want to allow mixing pure presentational,  
> media-specific, non-accessible or only-theoretically-accessible languages  
> like Presentational MathML, XSLFO or SVG into text/html?).

Yes, to the extent they are already part of the open platform on the  
application/xhtml+xml side. That is, yes, I (I’m not claiming to  
represent a “we” on this point) want to allow Presentational MathML  
and SVG in text/html.

The presentationalness argument makes no sense: If you’d be OK with  
including images by reference, moving presentational stuff into the  
same DOM and serialization does not magically make it worse than it  
was in the external HTTP resource.

As for accessibility, accessibility happens above the DOM, so the  
accessibility issues are the same regardless of whether the DOM was  
built from text/html or application/xhtml+xml.

I don’t want to allow XSL-FO in text/html, since it is not necessary  
when browsers already are fundamentally interactive CSS formatters.

> A list of explicit non-goals, for example what we can learn from XML  
> Namespaces (e.g. that prefixes are bad), and what we may not need to  
> do (e.g. do we need to be syntax compatible with XML namespaces?

For the use cases I mentioned above, it wouldn’t be necessary to be able  
to bind prefixes with an explicit syntax. Magic scoping on svg, math,  
foreignObject and annotation-xml would be enough. However, for the use  
cases to be satisfied, pasting in XMLNS-style default namespace  
declarations for svg and math should be allowed.
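
One possible reading of "allowed" is that a pasted default namespace
declaration is accepted only when it agrees with what the magic
scoping would pick anyway. The sketch below shows that reading. The
function name xmlns_declaration_ok and the decision to reject
everything else are my assumptions for illustration, and the sketch
only considers svg and math.

```python
EXPECTED_XMLNS = {
    "svg": "http://www.w3.org/2000/svg",
    "math": "http://www.w3.org/1998/Math/MathML",
}


def xmlns_declaration_ok(tag_name, attributes):
    """Check a pasted xmlns attribute on an svg or math start tag.

    `attributes` is a plain dict of attribute name to value.
    Returns True when there is no xmlns attribute, or when it matches
    the namespace implied by the element name.
    """
    declared = attributes.get("xmlns")
    if declared is None:
        return True  # nothing declared, nothing to check
    # Unknown tag names have no expected value, so any declaration
    # on them is rejected in this sketch.
    return EXPECTED_XMLNS.get(tag_name) == declared
```

So markup pasted from Inkscape with the usual SVG namespace
declaration on the root element would pass, and a declaration binding
svg to some other namespace would not.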

> Do we need to actually support arbitrary namespaces, or can we  
> satisfy all our use cases by providing explicit support for a finite  
> set of elements?).

For the use cases I mentioned above, support for arbitrary namespaces  
is not necessary. However, for forward compatibility, I think the  
mechanism should be scope-based rather than based on a finite list in  
order to handle future expansions of MathML and SVG.
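
The difference between the two mechanisms can be shown with a small
sketch. The element list is deliberately incomplete and "futureShape"
is a made-up element name standing in for a future SVG addition; both
function names are hypothetical.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# A frozen list of element names, as a finite-list mechanism would
# ship it. Incomplete on purpose; illustration only.
KNOWN_SVG_ELEMENTS = frozenset({"svg", "g", "path", "rect", "circle"})


def namespace_by_list(name, in_svg_scope):
    """Finite-list mechanism: only names known at ship time get SVG."""
    if in_svg_scope and name in KNOWN_SVG_ELEMENTS:
        return SVG_NS
    return HTML_NS


def namespace_by_scope(name, in_svg_scope):
    """Scope-based mechanism: everything inside the scope gets SVG."""
    return SVG_NS if in_svg_scope else HTML_NS
```

For an element already on the list, both functions agree. For
"futureShape" inside an svg scope, the list-based function falls back
to the HTML namespace while the scope-based one still yields the SVG
namespace, which is the forward-compatibility point.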

> A study of what our constraints are given the existing parsing  
> quirks (e.g. our inability to use the ‘xmlns’ attribute due to the  
> conflict with IE’s wacky parsing).

I’m not equipped to do such a study.

> A serious consideration of whether we actually want to do any of  
> this, given that it would basically remove the only remaining reason  
> to use XML on the Web, and would thus likely kill XML for good in  
> this ecosystem.

It would be a shame to leave the SVG functionality in browsers mostly  
latent out of courtesy to the XML folks.

> I’m not saying MathML is a bad design. I’m not saying it’s a good  
> design. I’m saying something that should be blatently obvious to  
> anyone doing spec development, or indeed any kind of development,  
> which is that we don’t just design specs blindly without carefully  
> considering all options.

I’m saying MathML above, because I think the WHATWG doesn’t have the  
bandwidth to reinvent math markup *properly*. It seems that partial  
solutions are not enough:

> And what does that example look like in Lynx? (As a heavy Lynx user,  
> I care.)

I think compatibility with shipped Lynx versions is an unreasonable  
constraint. We could do almost nothing if we chose to be constrained  
by already shipped versions of Lynx (or Links for that matter).

Henri Sivonen

Received on Sunday, 9 March 2008 10:47:09 UTC