- From: <juanrgonzaleza@canonicalscience.com>
- Date: Sun, 4 Jun 2006 06:12:33 -0700 (PDT)

Michel Fortin wrote:

> One thing I know however is that the next time I'll have to put an
> equation on a web page, I won't go looking for a MathML editor just to
> be able to generate the markup, convert the page to XHTML served as
> application/xhtml+xml (so that it works with MathML) and ask my users
> to install the required plugin or web browser just to see my equation.
> I'll use an image: it'll be a lot simpler.

Not so simple if you need to add maintenance, search, storage, printing,
and accessibility to the list of requirements.

> What Juan propose, about adding a limited number of elements to HTML
> for maths, actually makes sense to me, especially if you can get
> not-too-bad results with CSS.

With a more sophisticated CSS stylesheet design, more powerful CSS
engines, and good Unicode fonts, one could achieve TeX-quality output.

> HTML is designed to be easy to learn and
> write; if we had a markup like that for mathematics which integrates
> easily in HTML it'd be much more used than MathML, I'm sure.

And if you can reuse those tags in text (e.g. <table>, <sub>, or <sup>),
and if you can use the same content model, DOM, and CSS techniques in
both text and math, then it is cheap. With MathML you cannot.

James Graham wrote:

> In this situation, I imagine most scientists will simply write LaTeX and
> use a tool to produce the output format that they desire.

I doubt it, because LaTeX does not have sufficient capabilities for full
web design.

> For MathML, there
> is already a reasonable story here since Itex2MML exists, although it
> really needs to be integrated with tools like hyperlatex if it is ever
> to be widely used. I would also argue that the difficulty of providing
> suitable imaged-based fallback content is a massive hindrance to the
> adoption of mathematical markup.

I have shown on several occasions (in my weblog and on the official
MathML mailing list) that the MathML code generated by the IteX tool is
very bad.

> Look
> at the test page - some of the rendering is awful (the radical signs in
> particular stand out here).

The approach was designed to be minimalist. Of course it can be
improved. Moreover, radicals (which already look better than in Firefox
with native MathML support) could be rendered better still via future
CSS embellishments for math.

> And, despite being sold as a simpler
> solution than a MathML implementation, it works in about 1% of UAs (by
> number of users) compared to >95% that have a story for native or
> plugin-based MathML.

The original approach works in many rendering engines, including
off-line engines such as Prince. It has recently been generalized to
work with several XSL-FO formatters as well (MathML does not work in
FO).

The current problems lie in the current implementations of the CSS
standards rather than in George's approach. For example, good support
for inline CSS blocks is needed. Firefox has a bug there. The same bug
affected Opera 8 and Prince 4, but it was fixed in both. The Prince
developers fixed it within a few days, whereas they were unable to
integrate MathML into the rendering engine despite many efforts.

The Firefox bug is scheduled to be fixed in an upcoming release, so
inline CSS blocks will then be rendered correctly. However, there is no
schedule for the unification of MathML in browsers, nor for full support
of the 2.0 specification. There is no schedule for implementing MathML
in Opera, MSIE, FO formatters...

Moreover, for the last three or four weeks George has been working on a
cross-browser version of the stylesheet (using, for example, moz
extensions as a workaround for the above bug). The latest news I have is
that he has achieved a standard stylesheet that works in MSIE, Safari,
Opera, and Firefox, and also in several off-line CSS printers and some
FO engines, for almost all the math (the limitations are due to partial
support of the standards in browsers, but the situation will improve
once the standards are fully supported).

This is more than MathML has been able to do in 10 years using
specialized tools, specifications, markup, browsers, dozens of publicity
efforts in journals...! And do not forget that nobody really knows the
Opera browser statistics, because Opera's simulation mode makes
statistics tools confuse it with other browsers such as Firefox or
Explorer.

> The language that they have used is also overly
> simplistic.

They? Overly simplistic?

> For example one would expect most text in a formula to be in
> italics except where actual words were being used in which case the
> text should be roman. So you need an additional element to distinguish
> text from ordinary numbers. Add a few more considerations like that and
> you soon have a language that's just as painful to hand-author as
> MathML (which, I agree, is far from perfect) and little support among
> end users.

Use roman text by default (as in text) and use <var> for italics. In
HTML you write

<p>This is an <i>important</i> text</p>

instead of (<r> for roman)

<p><r>This</r> <r>is</r> <r>an</r> <i>important</i> <r>text</r></p>

MathML uses the latter approach, which is both verbose and redundant.
Moreover, instead of reusing working tags, MathML introduces new ones.
For example, you can use the <var> tag in pure text, but you are forced
to use <mi> when writing a MathML fragment.

> Also, I think it's worth mentioning that trying to get accessibility
> right for Maths content is likely to be extremely challenging. The
> chance of authors investing the time to allow a semantic e.g.
> spoken-word representation is 0 (this is incompatible from the
> 'everything will be generated from LaTeX hypothesis above). So I think
> it would be useful to know what actual scientific users currently do
> when faced with mathematical content in e.g. a PDF document.

This is a question of personal responsibility. If journal X encourages
you to submit articles in AmsTeX via ftp, you do not send a 5¼
floppy by ordinary mail containing the article in an Atari word
processor format. If you have no interest in accessibility, OK. But if
you are interested in accessibility, or you are required by law to
provide accessible content (e.g. official bodies in several countries),
then what? MathML?

Content MathML does not work for most of math and is not implemented in
browsers. Presentation MathML cannot disambiguate expressions even when
perfectly encoded, and most presentational code is far from perfect. For
example, the IteX tools encode prescripts via tricks (the same tricks
avoided in ISO 12083) instead of using the specific <mprescripts/> tag
of p-MathML.

What is worse, since authors are encouraged not to learn the final
MathML code, they have no idea whether they are encoding a = b + c or
sqrt(pi) instead. Take the case of Distler's blog (the same problem also
appears in the Living Reviews in Relativity articles generated by
HERMES). He is encoding and serving 2 s ds when he initially tried to
encode (ds)^2.

In short, the MathML code being served on the web is less accessible
than the old GIF + ALT model. I see no reason to use a technology that
does worse than the available ones.

"Mihai Sucan" wrote:

> I
> am currently using only Mathematica Notebook documents because MathML
> is not supported by Opera and Gecko's support is not something I
> consider awesome.

Hi. The problem with MathML is that it is not compatible with other web
standards and is therefore very difficult to implement. There are also
difficulties related to semantics, automated searching, printing, and
accessibility.

> Good-enough implementations for MathML would probably make-up for some
> of the bad things in MathML.

Quality implementation = quality programmers + quality specification

The original HTML-Math (initially designed by the W3C) was so bad that
it was, extraordinarily, rejected. Subsequent MathML specifications
contained several errors and design mistakes that made implementation
and spread of the markup very difficult or even impossible.

Take the example of content MathML. Its authors apparently ignored
decades of experience in symbolic mathematics and produced several
specifications with serious mistakes. Some of the mistakes were
fortunately corrected in the latest content MathML 2.0, but others
remain. Recently, it has been shown that something as simple as
integral sin(x) dx is not correctly encoded in content MathML, whereas
it *can* be encoded in OpenMath (or in other approaches). If your markup
does worse than those older techniques, why would you waste time
implementing something that will not work even if you are the perfect
programmer?

Take now the case of presentational markup. How do you encode subscripts
in TeX, in HTML, or in ISO 12083? a_b, a<sub>b</sub>, and a<sub>b</sub>
respectively. In MathML it is done (ignoring tokens) like
<msub>a b</msub>. As a result you get difficulties in correctly
translating TeX sources, difficulties in correctly implementing MathML
in the rendering engines of browsers, difficulties in printing, and
backward incompatibility with ISO 12083 and HTML...

What is more, the authors of ISO 12083 provided a very powerful script
model. The later MathML has not matched it. Since in MathML the base is
placed inside the script element, you cannot easily extend the markup.
For example, if I want to add a superscript I add one new tag: ^ in TeX,
and <sup> in both HTML and ISO 12083. You cannot do that in MathML
because of the base conflict, which obliges you to introduce two new
tags and a new content model: <msup> and <msubsup>.

With *fewer* tags, ISO 12083 can encode script structures that can *not*
be encoded in MathML. For example, I need structures with 5 and 6
scripts, which can easily be achieved by copying the ISO 12083 markup
model for scripts (the model is improved in my own canonical language).
Those structures cannot be encoded in MathML because of the limitations
of its markup.

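The extension argument above can be checked mechanically. The following
is only a minimal sketch: the three encodings are hand-written
illustrative strings (with MathML token elements written out), and the
helper names `extends_by_appending` and `report` are hypothetical, not
part of any tool mentioned in this thread.

```python
# Illustrative, hand-written encodings of "a with subscript b", before and
# after a superscript c is added. These strings are assumptions made for the
# example; they are not the output of any converter.
SUB = {
    "TeX": "a_b",
    "HTML": "a<sub>b</sub>",
    "MathML": "<msub><mi>a</mi><mi>b</mi></msub>",
}
SUBSUP = {
    "TeX": "a_b^c",
    "HTML": "a<sub>b</sub><sup>c</sup>",
    "MathML": "<msubsup><mi>a</mi><mi>b</mi><mi>c</mi></msubsup>",
}


def extends_by_appending(before: str, after: str) -> bool:
    """True when the new script was added without touching existing markup."""
    return after.startswith(before)


def report() -> dict:
    """Map each syntax to (appended_only, byte size of the sub+sup form)."""
    return {
        name: (
            extends_by_appending(SUB[name], SUBSUP[name]),
            len(SUBSUP[name].encode("utf-8")),
        )
        for name in SUB
    }


if __name__ == "__main__":
    for name, (appended, size) in report().items():
        print(f"{name:7s} appended-only={appended!s:5s} bytes={size}")
```

In TeX and HTML the superscript is appended after the existing
subscript markup, whereas in MathML the enclosing element itself has to
change from <msub> to <msubsup>, which is the base conflict described
above.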
So I suggested the following new encoding for the next MathML 3.0 on the
official MathML list:

<msubsupunderover> Base script1 script2 script3 script4 </msubsupunderover>

For each new script structure not covered by p-MathML, you will need a
new content model and new tags.

> If implementors can reach an agreement
> together, they could even break the current MathML. If their new ideas
> would prove good, then be sure the W3C will include those changes.

Unfortunately, MathML WG has officially said on that.

> Another different take:
> If LaTeX is considered to be the best available language for writing
> mathematical scientific documents, and the best for printing too... why
> not have user agents implement it?

It is not THE best. It is very good (but boring) at mathematical
typesetting, but it is not good enough for the web, which is the reason
it was rejected by several mathematical markups (ISO 12083, EuroMath,
MathML, OpenMath, XML-MAIDEN...).

>> But biggest error was try to use MathML. MathML is full of incorrect
>> design options and technical holes! Even some MathML author recognizes
>> that content MathML was not "well thought" due to lack of agreement on
>> the committee.
>
> You entirely dislike MathML. What do you think of OpenMath?

So do many, many other people (this is the reason for its lack of
popularity and support after 10 years and much propaganda).

OpenMath is just for content and lacks presentational/structural
capabilities. The number of tools is really small :-). OpenMath's design
is more solid than Content MathML 2.0 in some points, but I am not sure
of its capabilities for correctly encoding the meaning of mathematical
concepts, nor am I sure of its possibilities as a universal data format
between tools.

> Make your own proposal. Which is the currently available standard you'd
> like implemented? None?

For what? Typesetting? The semantic web? Computerized mathematics? For
structure and presentation I think that the ISO 12083 international
standard for electronic documents is good enough.

In fact, XML-MAIDEN is a modification of ISO 12083 (which was designed
before XML and CSS) that makes it more browser accessible (e.g. CSS
friendly).

In fact, I have offered p-MathML support on the canonical website since
2005, but I got only headaches, despite much effort, dozens of tools,
plugins, fonts, MIME types, lots of emails, and so on. I decided to
abandon MathML support.

> I'd be interested of your Canon (Markup Language).

Thanks! M is for meta :-) Please copy anything of interest and report
errors or better ways of doing things to me.

>> 1) Insanely complicated and inefficient. In some cases, I have
>> computed 15 times more bandwidth and server storage when using
>> MathML than alternatives.
>
> Bandwidth is becoming more and more less of a problem. Hard disk space
> is not close to a problem, since it's cheap and everybody who's
> serious about working with lots of data has terabytes storage. I
> myself have close to a TB, and I am not doing anything too special.

The tendency in web design (and in web recommendations) is precisely to
save kilobytes. Look for the first of the benefits of W3C-compliant web
sites on this list:

[http://snews.awddesign.co.uk/snews/designs/snews_business/index.php?id=9]

I find the W3C effort on CSS layout models for saving bandwidth
interesting (see figures)

[http://www.maccaws.org/kit/primer/]
[http://www.w3.org/WAI/bcase/benefits.html]
[http://www.zeldman.com/dwws/]

whereas MathML completely breaks that tendency, offering markup that can
be 15 or more times more verbose than is reasonable.

I also find this reply from the MathML FAQ very interesting

[http://www.w3.org/Math/mathml-faq.html]

<blockquote>
Does the verbose MathML syntax takes a long time to transfer across the
net and parse? By comparison with existing image based methods of
embedding math(s) in web pages, for example GIF files, MathML is
relatively quick to transfer and process.
</blockquote>

and I also find it amazing that *full* fine-grained parallel markup was
so insanely verbose and unmanageable that the MathML WG itself was
obliged to introduce an alternative approach. As an exercise, I would
recommend that anyone encode E=mc2 in full parallel MathML markup.

> As for alternatives to MathML, be sure they've got their set of
> weaknesses. MathML shouldn't be trashed just because some people don't
> like it.

It is simpler than like/dislike: works/does not work.

> The specification didn't just reach recommendation suddenly,
> without receiving any positive appreciation and implementations.

Well, I do not know the details of the history, but I know that the
group has been described as immune to external criticism. I also know
that the design of MathML was full of political pressure because of
internal wars in the committee. This is even recognized by Neil Soiffer!

<blockquote>
From the earliest days, the MathML working drafts included structured
presentation. This was not without controversy
</blockquote>

Presentational markup violates basic web guidelines. It is amazing that
while <font>, <i>, and <b> are deprecated in HTML, MathML introduces
lots of presentational tags.

<blockquote>
some members were very opposed to disambiguation characters
</blockquote>

Understandable. They do not work correctly. A problem with them was
discussed last month on the MathML mailing list. I already did not
support disambiguating characters in my earlier CanonMath approach
(later abandoned), but an alternative idea has been reused in my
previous program.

<blockquote>
Human authorable MathML was one of the goals listed in the MathML
charter. Many people felt human authorablility was one of the reasons
HTML was so successful.
</blockquote>

but he adds

<blockquote>
Ultimately, the MathML committee couldn't reach agreement on an input
syntax and decided that the marketplace should [decide] on the syntax.
</blockquote>

<blockquote>
Many people asked why didn't we use TEX? Which part of TEX?
TEX is not amenable to the growing number of XML tools such as CSS,
XSLT, DOM, parsers...
</blockquote>

And now we see people using TeX as an input syntax because MathML is so
insanely verbose. In the end, the MathML code generated from TeX
syntaxes is as limited as the initial TeX syntax is.

About content MathML

<blockquote>
Not all members were in favor of it for MathML 1.
</blockquote>

<blockquote>
Many of the people proposing content were opposed to presentation and
the earlier form allowed specifying some things not possible in the
latter notation.
</blockquote>

<blockquote>
Due to greater emphasis and discussion on presentation MathML, content
MathML was not as well thought out as Presentation MathML. As evidence
of this, MathML2 has many content fixes (deprecates <fn> and <reln>),
and adds some tags that were glaring omissions (eg, <lcm/>).
</blockquote>

One of the points I find most surprising is that just a month or so ago
one of the MathML authors asked in public what would need to be changed
in MathML for full compliance with the CSS standard! After 10 years and
5 drafts/recommendations they are asking for compatibility with CSS now!
Something is not working here.

> IMHO, a good implementation for any math-related web technology must not
> ask the user to download fonts, to install some plugin or anything
> similar. I do not like Gecko for the fact it asks me to download
> mathematical fonts.

Worse still, the Gecko rendering engine is optimized for *those* fonts;
if the fonts change, they may have to rebuild the rendering engine and
ship a new browser version, which clients would have to download and
install together with the new fonts, of course.

> Math WebSearch - A semantic search engine
> http://search.mathweb.org/
> http://kwarc.eecs.iu-bremen.de/software/mmlsearch/
>
> I'm not sure if searching math is entirely a myth. This is a recent
> guided research project done by a student of Dr. Kohlhase.

I was referring to MathML.
Just as MathML is not very popular on the browser side, it is not
popular on the search engine side.

>> **************************************
>> Proposals (from less to more radical):
>>
>> A) Eliminate next text from specification
>>
>> "Authors are encouraged to use MathML for marking up mathematics"
>>
>> because authors would use more concise powerful and solid markup for
>> mathematics.
>
> I don't agree because authors would just use images with no alternate
> text or... they'd even give up adding math to their page. If they
> don't, they'll find a way to use a WYSIWYG editor to generate the
> borked HTML code that looks "perfect" in IE & FF.

And with MathML, academic journals are encoding (ds)^2 as 2 s ds

<mi>d</mi><msup><mi>s</mi><mn>2</mn></msup>

whereas via simple HTML I can write ds<sup>2</sup>. Which is worse?

And the author of the self-proclaimed semantic HERMES project is serving
XHTML docs where layout is done with empty paragraphs <p></p> and where
authors or dates are encoded as level 3 headings. But that is not a
problem of XHTML; it is a problem of the author.

>> C) A more complete approach is providing a set of structural and/or
>> semantic tags for usage with HTML5.
>
> Mathematics is a field which is very complex and you cannot dream of a
> simple solution for displaying advanced scientific documents.

Then MathML and OpenMath are a waste of time. Why recommend the former
in the WHATWG specification?

Juan R.

Center for CANONICAL |SCIENCE)

Received on Sunday, 4 June 2006 06:12:33 UTC