- From: David Carlisle <davidc@nag.co.uk>
- Date: Sat, 29 Mar 2008 17:08:46 GMT
- To: ian@hixie.ch
- Cc: public-html@w3.org, www-math@w3.org
Ian,

(personal response)

> I'm investigating possible options for addressing the problem of "Putting
> an equation in a Web page". One of the options is doing something with
> MathML.

Given the existing implementation and experience in this area, surely MathML
should not simply be "one of the options"; it should be the main option.
For HTML5 to invent some new math markup unsupported by any existing
mathematical software would be a complete disaster for the cause of putting
scientific documents on the web.

> Could you point me to further information on this? I'm interested in
> investigating how much support there is for editing MathML content, in
> particular, since it's not a very human-friendly format.

Actually MathML is more "human friendly" than some think. (Here, for example,
we maintain a corpus that includes half a million or so MathML fragments,
mainly using Emacs rather than a specific math editor.) Mathematics by its
nature has always required a lot more markup than plain text, and that is
true whether it's TeX or MathML or OOMML, or OpenMath.

There are, however, several specific math editors that emit MathML (see the
WG's implementation page), but MathML support is also available in more
general software. In particular, the leading computer algebra packages
Mathematica and Maple will both export as MathML, and both Microsoft Word
2007 and OpenOffice have MathML support. (Microsoft Word converts to MathML
on cut-and-paste; OpenOffice stores maths in MathML as the native format in
ODF mathematics.) All these systems have graphical formula editors and a
linearised input syntax for mathematics, which mean the author need not know
MathML markup.

> Cool, that's very encouraging. Any knowledge you have about that would be
> great. Is there any documentation on common MathML errors? Is there any
> documentation on what elements could be implied? Is there any reason
> digits couldn't imply <mn>, for example, and punctuation couldn't imply
> <mo>?
> Any help here would be greatly appreciated.

I think the assumption here was that in an HTML context one might want to
give up some of the rules coming from XML parsing (attribute quoting,
perhaps some element closing, etc.). I think it would be a mistake to try to
insert character-level tokenisation and parsing to imply token elements such
as mn and mi. The strength of a format like MathML is that such tokenisation
is explicit (and one of the problems in converting from, say, TeX, where
these things are not explicit, is that different systems have different
heuristics).

> MathML is a very big language, with just shy of 190 unique elements in
> MathML2 (HTML4, including all the deprecated elements, has but 91). Could
> we get away with making that simpler for HTML, e.g. by not including
> support for Content markup in the text/html variant?

I think you should aim for the support level of Mozilla. So basically just
support presentation MathML (which brings the element count down to a
handful of structural forms), but support <semantics> by rendering its first
child and skipping over any annotation-xml children, with a display property
of none. So annotation-xml ought to be able to take any well-formed XML as
content, but the only requirement for HTML5 would be to parse to the end of
it, not to display content MathML natively. (Native rendering of content
MathML3 would be nice, but I think in the real world it's not going to
happen everywhere.)

One thing we could do to make this easier for you is, in MathML3, to separate
the grammars of presentation and content MathML more formally so that they
are usable separately.

> One of the use cases is the mixing of graphics and form controls into
> equations. Is it possible to extend MathML to allow specific HTML5
> phrasing-level elements (like <em>, <img>, <input>, also maybe the <svg>
> element) wherever the <mglyph> element is currently allowed, or something
> along those lines?
It's possible technically, of course, but I think it's fair to say that
there isn't total consensus on whether it's a good idea. There are, though,
two aspects to that question. In a purely MathML context, should MathML be
opened up to allow any foreign markup there? Or, if in "pure" MathML that is
not allowed, should HTML+MathML allow nested HTML (and DocBook+MathML allow
nested DocBook, and, as came up controversially recently, should
OOXML+MathML allow nested OOXML)?

The MathML2 spec said basically that if you nested other elements it wasn't
MathML, but that if you did it anyway a system might not generate an error
and might render it. This more or less allowed the Mozilla/Firefox behaviour
of rendering nested HTML in MathML, while allowing other pure MathML systems
to reject "MathML" that contains nested HTML.

This is an interesting area, and certainly something that we can talk about:
exactly what the specs should say, or whether the individual specs should
say nothing and an architectural specification such as CDF should instead
specify how different formats can be mixed.

David
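
The <semantics> handling suggested above (render the first child, skip the
annotation children while still parsing to the end of them) might be
sketched roughly as follows. This is only an illustration in Python using
the standard xml.etree.ElementTree module; the function names renderable and
local_names and the SAMPLE fragment are hypothetical, and nothing here is
taken from Mozilla's or any other actual implementation.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
SEMANTICS = f"{{{MATHML_NS}}}semantics"

# Hypothetical sample: presentation markup first, content markup as annotation.
SAMPLE = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>
    <annotation-xml encoding="MathML-Content">
      <apply><plus/><ci>x</ci><cn>1</cn></apply>
    </annotation-xml>
  </semantics>
</math>"""


def renderable(element):
    """Return the element a presentation-only renderer would draw.

    For <semantics> this is its first child; the annotation siblings
    have already been parsed but are never visited.
    """
    if element.tag == SEMANTICS:
        children = list(element)
        return renderable(children[0]) if children else None
    return element


def local_names(element):
    """List the local names of the elements that would be rendered, in order."""
    target = renderable(element)
    if target is None:
        return []
    names = [target.tag.split("}")[-1]]  # strip the "{namespace}" prefix
    for child in target:
        names.extend(local_names(child))
    return names


if __name__ == "__main__":
    print(local_names(ET.fromstring(SAMPLE)))
```

In this sketch the parser still has to read the annotation-xml content as
well-formed XML to reach the end of it, but none of the content elements
(apply, plus, ci, cn) appear in the list of elements to render.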
Received on Saturday, 29 March 2008 17:09:19 UTC