- From: <juan@canonicalscience.com>
- Date: Tue, 8 Apr 2008 06:08:36 -0700 (PDT)
- To: <public-html@w3.org>, <www-math@w3.org>
David Carlisle <davidc@nag.co.uk> wrote:

> Given the existing implementation and experience in this area surely
> MathML should not simply be "one of the options" it should be the
> main option. For HTML5 to invent some new math markup unsupported by
> any existing mathematical software would be a complete disaster for
> the cause of putting scientific documents on the web.

This seems to me an overdramatic statement, one that would stop any possible improvement to the web that might arise from the research being done around HTML5.

Let us analyze a case extracted from the real world. The original canonicalscience.com site was designed on XHTML + MathML. As Neil said {QUOTE Given the difficulties with putting out XHTML pages today} this was a source of problems. In what follows I will summarize only the problems associated with the MathML part of the whole XML equation, and why using a different markup has been a good option. Only presentation MathML was explored, due to the unpopularity of Content MathML.

### VERBOSITY ###

Verbosity was always an issue, especially when the typical examples of MathML (spec, Wikipedia, Wolfram) were replaced by the kind of math that research scientists typically write. Prototype software with large math expressions gives a 12x verbosity for MathML, and about 4x for small expressions like

<math xmlns='http://www.w3.org/1998/Math/MathML' display="block">
  <mrow>
    <mo>d</mo>
    <mi>S</mi>
  </mrow>
  <mo>≥</mo>
  <mfrac>
    <mrow>
      <mo>δ</mo>
      <mi>Q</mi>
    </mrow>
    <mi>T</mi>
  </mfrac>
</math>

I can see several authors here who complain about MathML verbosity.

### INCREMENTAL RENDERING ###

I may wait around a minute when opening academic works in Firefox 2 on my machine (GNU/Linux, 1000 MHz, 0.5 GB), even for medium-size articles like this one

http://hermes.aei.mpg.de/lrr/2001/1/article.xhtml

Print previews and other tasks are also slow. Try opening four MathML articles at once in Firefox 2.

When opening a similar article in 'microformat' over HTML, I can see the text before the math is completely rendered.
One can start reading the beginning of the article while the bottom part is still being rendered. Time is an issue for some people.

### EDITING ###

Verbosity rules out manual encoding. This led to another problem: the lack of adequate editors and tools. A few expensive applications were generating acceptable code. Interestingly, they were not oriented to Office/Publishers environments.

I completely agree with Ian when he said

{BLOCKQUOTE Anything we can do to make the language more maintainable will go a long way towards arguing for MathML over the alternatives }

and with Jammes when he said

{BLOCKQUOTE The supposed benefit is not to MathML editors but to authors using text editors. I have tried writing MathML-in-XHTML using only a text editor and the experience was painful to say the least. I found that the verbosity made it difficult to enter and then difficult to fix when I had made a mistake. The sensible solution might have been to use something like itex2MML to keep the source equations in human-readable form but that would have involved keeping two seperate representations of the document, with all the associated problems that that causes. }

It is also worth noting that most of the tools generating presentation MathML from LaTeX or LaTeX-like code were not working correctly. Distler's blog has now been cited as an example of a site using MathML, and someone introduced here Distler's views on the HTML5 proposal. One should then remind the HTML5 people of some of the problems with the MathML/IteX approach.

The MathML code served from several pages of Distler's blog was analyzed on this list. This may be available in the archive. From memory:

i) Ultraverbose output, e.g. unneeded <mrow> around single <mi> and <mn> elements. The Mozilla MathML site recommends avoiding extra <mrow>

http://www.mozilla.org/projects/mathml/authoring.html

For example, code like the following was not unusual on the blog

<mfrac>
  <mrow>
    <mi>a</mi>
  </mrow>
  <mrow>
    <mn>2</mn>
  </mrow>
</mfrac>

ii) Use of visual tricks /a la/ TeX.
That is, just the kind of tricks that MathML was supposed to avoid. I remember the use of <msup><mrow/> to simulate prescripts, and the use of collections of <msup>, <msub>, and <msubsup> to simulate tensors instead of using the specific MathML elements (such as <mmultiscripts>).

iii) Problems with numbers. This was corrected in a later version of the software, I think.

iv) Completely broken code. I remember the case of line elements ds^2. *Visually* they looked fine but aurally they did not, because the structural code was the invalid

<mo>d</mo><msup><mi>s</mi><mn>2</mn></msup>

instead of the correct

<msup><mrow><mo>d</mo><mi>s</mi></mrow><mn>2</mn></msup>

When preparing this message I took a look at this year's articles on Distler's blog to check the status of the MathML/IteX technology. It seems that point i) has been partially fixed, since I did not see unneeded <mrow> in the several simple fractions I checked. Still, in the recent article

http://golem.ph.utexas.edu/~distler/blog/archives/001560.html#more

you can find the following code

<mfrac xmlns="http://www.w3.org/1998/Math/MathML">
  <mover>
    <mi>H</mi>
    <mo>˙</mo>
  </mover>
  <mrow>
    <msup>
      <mi>H</mi>
      <mn>2</mn>
    </msup>
  </mrow>
</mfrac>

containing one unneeded <mrow> in the denominator.

Editing in 'microformat' HTML is free from those structural problems, and the generated code does not contain unneeded elements. I can type mathematical equations on blogs, emails, and forums when JS is activated.

### BROWSERS SUPPORT ###

Or lack of browsers' support. Yes, native support for presentation MathML has improved in recent times and some useful plugins are available to IE users, but problems remain. For instance, users accessing the internet from cybercafés and libraries (including several University or CSIC research centers) do not have the plugin installed, cannot install it themselves because it is not their own computer, and their requests to have it installed are not acted on by the corresponding department.
With the 'microformat' HTML approach they go into the library, open the default IE on the computer, visit the sites, and see the math.

### HTML POPULARITY ###

Most of the web is written in HTML. I have not compiled formal statistics about academic sites and blogs, but 99% of the sites I visit do not use XML. This HTML popularity is a barrier for the MathML approach. The 'microformat' is HTML compatible and will be HTML5 compatible.

### SEARCH ENGINES ###

Neil has done the following search

{BLOCKQUOTE If I do a search on +mfrac +mi +mo +mml:semantics [note the mml: namespace prefix, which I didn't include in my previous searches] Google says that there are "about 7,440" hits. }

I have repeated the search

http://www.google.es/search?q=%2Bmfrac+%2Bmi+%2Bmo+%2Bmml%3Asemantics&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a

and Google returns about 7530 hits. A look at the source code of the first two hits, physmathcentral and biomedcentral, shows that both contain *escaped* code. As Ian noticed, the engine is returning pages with <mfrac>, <mi>, etc. as escaped code.

Now I will search a site containing real MathML pages. I search for text, for instance

+tidy site:http://hermes.aei.mpg.de/lrr/

http://www.google.es/search?hl=es&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&q=%2Btidy+site%3Ahttp%3A%2F%2Fhermes.aei.mpg.de%2Flrr%2F&btnG=Buscar&meta=

I get one hit:

http://hermes.aei.mpg.de/lrr/2001/1/article.xhtml

This is an XHTML page containing presentation MathML, and it contains the word "tidy" at the start of section 1.1.

Now I search for math

+mfrac site:http://hermes.aei.mpg.de/lrr/

http://www.google.es/search?hl=es&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&q=%2Bmfrac+site%3Ahttp%3A%2F%2Fhermes.aei.mpg.de%2Flrr%2F&btnG=Buscar&meta=

and I get zero hits for pages containing fractions. The engine cannot find the MathML fractions.
The site http://hermes.aei.mpg.de/lrr/ contains many XHTML+MathML pages, with about a thousand <mfrac> elements.

Now I repeat the search for the 'microformat' case over HTML. I search for fractions

+{NU site:http://www.canonicalscience.org

http://www.google.es/search?hl=es&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=xjG&q=%2B%7BNU+site%3Ahttp%3A%2F%2Fwww.canonicalscience.org&btnG=Buscar&meta=

and I get three results, all pages containing fractions in the content. The same three hits come back when using the same search string on other engines

Yahoo:
http://search.yahoo.com/search;_ylt=A0geu7UUWvtH_cIAdlRXNyoA?p=%2B%7BNU+site%3Ahttp%3A%2F%2Fwww.canonicalscience.org&y=Search&fr=moz2&ei=UTF-8

Answers:
http://www.answers.com/main/ntquery?s=%2B%7BNU%20site%3Ahttp%3A%2F%2Fwww.canonicalscience.org&ff=1

Altavista:
http://www.altavista.com/web/results?itag=ody&q=%2B%7BNU+site%3Ahttp%3A%2F%2Fwww.canonicalscience.org&kgs=1&kls=0

Thus the non-MathML approach is also working here.

### COMPUTATIONAL SOFTWARE ###

Yes, the 'microformat' is not supported by common algebra software and all that. It is still in beta, but the final version will convert to standard languages. I do not see a problem here.

### RENDERING ###

The microformat may be converted to different rendering formats such as p-MathML, SVG, XSL-FO, XML-MAIDEN, etc. In the future one can write XML pages with 'microformat' math and convert it to p-MathML on the fly, to be rendered natively by Firefox and Opera clients.

### ACCESSIBILITY ###

This is a point where the MathML approach was probably superior. Behind MathML there is a large body of research and experience, with several accessibility-specific projects, which is lacking in a novel approach still in beta stage. Still, I do not see any special advantage over the microformat HTML approach. In its final form it seems that accessibility will be at least as good.

### CONCLUSION ###

As a final conclusion, I am forced to think that recent alternative models are not a "complete disaster" when compared with MathML.
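
As an illustration of point i) under EDITING, here is a minimal sketch, in Python, of how unneeded <mrow> wrappers of the kind quoted there could be removed automatically. It assumes only the standard library module xml.etree.ElementTree; the names local_name and strip_redundant_mrow are hypothetical and are not part of itex2MML or of any other tool mentioned in this message. It relies on the MathML rule that an <mrow> with exactly one child and no attributes is equivalent to that child alone.

```python
import xml.etree.ElementTree as ET


def local_name(elem):
    """Return the tag name without any '{namespace}' prefix."""
    return elem.tag.rsplit("}", 1)[-1]


def strip_redundant_mrow(elem):
    """Replace, in place, each attribute-free <mrow> that wraps a single
    child by that child. The root element itself is never replaced."""
    for i, child in enumerate(list(elem)):
        strip_redundant_mrow(child)  # clean the subtree first
        if local_name(child) == "mrow" and len(child) == 1 and not child.attrib:
            inner = child[0]
            inner.tail = child.tail  # keep whitespace that followed the wrapper
            elem[i] = inner
    return elem


# The fraction quoted in point i), written here without whitespace.
verbose = "<mfrac><mrow><mi>a</mi></mrow><mrow><mn>2</mn></mrow></mfrac>"
cleaned = ET.tostring(strip_redundant_mrow(ET.fromstring(verbose)), encoding="unicode")
print(cleaned)  # <mfrac><mi>a</mi><mn>2</mn></mfrac>
```

An <mrow> with two or more children, like the one in the correct ds^2 markup of point iv), is left untouched, since removing it would change the structure.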
Juan R. González-Álvarez Center for CANONICAL |SCIENCE)
Received on Wednesday, 9 April 2008 06:22:52 UTC