RE: Correcting misperceptions about MathML usage

Robert Miner wrote:
> Hi.
>
> I noticed an error of my own I need to correct.  Below I claim there are
> millions of documents that use MathML.  I should have said millions of
> pages.
>
> --Robert
>
>
> Robert Miner
> Director, New Product Development
>
> - our address has changed -
> Design Science, Inc.
> 140 Pine Avenue, 4th Floor
> Long Beach, California  90802
> USA
> Tel:  (651) 223-2883
> Fax:  (651) 292-0014
> robertm@dessci.com
> www.dessci.com
> ~ Makers of MathType, MathFlow, MathPlayer, WebEQ, Equation Editor,
> TexAide ~
>
>
> -----Original Message-----
> From: www-math-request@w3.org [mailto:www-math-request@w3.org] On Behalf
> Of Robert Miner
> Sent: Friday, April 07, 2006 2:50 PM
> To: juanrgonzaleza@canonicalscience.com
> Cc: www-math@w3.org
> Subject: Correcting misperceptions about MathML usage
>
>
> Juan,
>
> The MathML specification is 10 years old, and has gone through four
> versions as a W3C Recommendation.  There are dozens of pieces of
> software that seriously implement it, and millions of documents that use
> it. For a credible, responsible standards organization such as W3C, that
> imposes strenuous backwards compatibility constraints. It means among
> other things that changing something as fundamental as the
> representation of scripts is no longer feasible.  The only changes that
> can be considered viable at this point in the lifespan of MathML are
> incremental.
>
> I believe you have a faulty impression about the usage of, and hence the
> constraints upon, MathML at this point in its lifecycle.  You wrote
>
> 	It is interesting that almost all of academic publishers are ignoring
> 	MathML promises and using other alternatives (at my current knowledge
> 	only _Blackwell_ publisher is using MathML). For instance, the
> 	renowned _Nature_ is working with ISO 12083.
>
> However, that is false.  Among major technical publishers, I know that
> Reed Elsevier, the American Chemical Society, the American Physical
> Society, Houghton-Mifflin, McGraw-Hill, Wiley, and the US Patent Office,
> to name a few, all use MathML in at least some of their publication
> workflows.  Similarly, you seem unaware of the use of MathML in
> enterprise publishing, but a short list of companies that I know are
> using MathML includes Airbus, Boeing, Schlumberger, Bechtel-Bettis,
> Pratt and Whitney, LexisNexis, NAG, SPSS and others.   Beyond that,
> MathML is used extensively as a backend technology by a number of
> educational technology vendors including course management systems such
> as WebCT, eCollege, and Blackboard, automated assessment vendors such as
> Questionmark Perception, Brownstone and ETS, and other educational
> service and software vendors. Another area where MathML is playing a
> significant role is accessibility.  The DAISY consortium of
> accessibility technology vendors and advocates is in the process of
> incorporating MathML into the DAISY file format. Consortium partners are
> already hard at work looking at adding MathML-based accessibility
> solutions to their software, and educational publishers in many contexts
> in the US are compelled by law to make accessible materials available in
> the DAISY format. This will have the effect of making mathematics in a
> large body of content effectively and seamlessly accessible and widely
> available to those with print disabilities for the first time. As you
> may have seen on the list yesterday, now there are even effective tools
> for using MathML right-to-left in Arabic-language documents.
>
> In all of these arenas, there is substantial and rapidly growing use of
> MathML, and W3C has a responsibility to keep MathML stable for all of these
> stakeholders.  As a consequence, as I stated above, the only changes
> that will occur in MathML for the foreseeable future will be small,
> incremental ones that maintain backward compatibility and offer existing
> users a smooth, non-disruptive upgrade path.
>
> At the same time, it is true that the area where MathML has had the
> least impact is as a hand-authored format for academics and hobbyists
> publishing directly to the web.  This is obviously an area that you care
> very much about. But as a standards organization, W3C cannot and should
> not favor the interests of one particular interest group over others.
> This is particularly true, as I explained in an earlier message, since
> W3C is directly accountable to its dues-paying member organizations, and
> only indirectly accountable to individuals with no official standing,
> such as yourself.
>
> If you want to work on devising a good input syntax for MathML that
> meets your needs, and to present your ideas on this list for comment,
> you are welcome to do so.  But I encourage you to take the trouble to
> understand the interests of the stakeholders in the discussion, and the
> constraints that apply when considering changes to MathML.
>
[snip]
>
> --Robert
>
> Robert Miner
> W3C Math Interest Group co-chair
> Director, New Product Development


Hi Robert,


Point A]

It is a pity that the only changes to the current MathML 2.0 specification
that can be considered viable are incremental ones.

Some extensions, such as my suggested

<munderoversubsup>Base script1 script2 script3 script4</munderoversubsup>

could be incorporated in a future MathML 3.0 specification, but without
more serious changes I (like many others) find it impossible to use MathML
efficiently.
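
For comparison, here is a rough sketch of the suggested element next to
the nested markup that MathML 2.0 already offers for the same layout. The
script order (under, over, sub, sup) and the placeholder tokens B, a, b, c,
d are assumptions made only for this illustration; the proposal itself
fixes neither.

```xml
<!-- Suggested extension (not part of MathML 2.0).
     Assumed order: base, underscript, overscript, subscript, superscript. -->
<munderoversubsup>
  <mi>B</mi> <mi>a</mi> <mi>b</mi> <mi>c</mi> <mi>d</mi>
</munderoversubsup>

<!-- MathML 2.0 presentation markup: msubsup wrapped around munderover. -->
<msubsup>
  <munderover>
    <mi>B</mi>  <!-- base -->
    <mi>a</mi>  <!-- underscript -->
    <mi>b</mi>  <!-- overscript -->
  </munderover>
  <mi>c</mi>    <!-- subscript -->
  <mi>d</mi>    <!-- superscript -->
</msubsup>
```

The nested form needs two elements where the suggestion needs one, and a
renderer may place the sub/superscripts relative to the whole
under/over construct rather than to the bare base.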


Point B]

Yes, I was partially wrong about the usage of MathML. Thanks for correcting
me! There are more academic publishers "using" MathML than I cited. I have
updated my data on that point. Having recognized that, I may add that my
main idea about the "fiasco" of MathML remains accurate. I was able to
obtain very recent data and statistics on the usage of MathML, and they
confirm my initial claim. I will not detail the data here (I will write a
Canonical Science Today about this, because I consider it very important
in relation to the CanonMath program).

At the last conference on scientific, technical, and medical publishing
(October 18, 2005) it was claimed that:

"Publishers have failed to adapt to the web" and have not moved to MathML,
and speakers even analyzed the causes: "Most publishers are not interested
in these issues because they do not fit into their business model."

The talk "Web Publishing Mathematics With HTML and MathML from TeX"
(February 2006) pointed out a number of weaknesses of MathML, stated that
"Most of these problems are avoided if ordinary HTML is used", and offered
the most up-to-date server statistics on interest in translation to MathML.
The results are notable:

"There is substantial interest in translating TeX to HTML!"

I do not doubt it. After my own fiasco in offering scientific material in
MathML format, I am reconsidering a reuse of SGML or even a return to the
"ancient" HTML!


Point C]

Moreover, I have obtained information on the real usage of MathML by
publishers and I am astonished by the result. I focus on the biggest
academic publisher: Elsevier.

Yes, it is true that Elsevier "adopted" MathML in its workflow because of
a general policy of migration from SGML to XML DTDs, but I have been
informed of a number of very interesting points (June 2005). I will
provide technical details about this in a forthcoming Canonical Science
Today:

1)
The rationale for the migration to MathML is not based on strengths of the
W3C specification.

2)
Elsevier uses presentation MathML only.

3)
Elsevier is using an in-house modification due to limitations and
problems: "In all these cases we were forced to modify the standards, with
the risk of losing the benefits of adopting those standards."

4)
MathML is not enforced in Elsevier’s CEP for very simple formulae, since
"these can be structured with text effect elements."

5)
MathML is not being used for chemical formulae (such as CH_4) and
mathematical physicochemical data (such as IR spectra).

6)
Other mathematical structures, such as prescripts, are not encoded in
MathML. Elsevier's encoding of {}_{92}^{238}U is very similar to the markup
model that I am adopting.
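
For reference, standard MathML 2.0 presentation markup expresses
prescripts with mmultiscripts and mprescripts. A minimal sketch for
{}_{92}^{238}U follows; it shows the W3C encoding only, since neither
Elsevier's in-house encoding nor the markup model mentioned above is
reproduced in this message.

```xml
<!-- Children of mmultiscripts: base, then any postscript pairs, then the
     empty mprescripts marker, then prescript pairs (subscript first). -->
<mmultiscripts>
  <mi>U</mi>      <!-- base -->
  <mprescripts/>  <!-- everything after this is a prescript -->
  <mn>92</mn>     <!-- pre-subscript: atomic number -->
  <mn>238</mn>    <!-- pre-superscript: mass number -->
</mmultiscripts>
```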


Point D]

About MathML accessibility, I have been able to obtain recent research and
other articles from developers who doubt the promise of accessibility (I
doubt it too).


Point E]

About usage in education, I can say that some people are adopting MathML
whereas others are rejecting it [e.g. Denis Bouhineau, Alain Bronner,
Hamid Chaachoua, Sophie Mezerette, and Jean-François Nicaud. Maths CAA
Series paper (Nov 2005), "Computer Assisted Assessment in Elementary
Algebra Experiences and points of view from the APLUSIX project"].


Point F]

The Arabic-language support for MathML presented here in recent days is
impressive, but the advance is related to the international support
surrounding the XML infrastructure rather than to specific points of
MathML design. Similar improvements could be achieved by other markup
models if they rely on a good internationalization base.

David Carlisle could correct me if I am wrong, but I think that even TeX
was extended to Arabic, and one can directly typeset a math document in
Arabic with formulas and symbols correctly running from right to left.


Point G]

As stated above, it is a disappointment that the MathML WG is planning only
incremental changes to the current 2.0 specification.

I, like many other mathematicians and scientists, have found serious
difficulties in achieving a complete and solid implementation of MathML in
the rest of the workflow of the Center for CANONICAL |SCIENCE) and of
collaborating external bodies and authors.

The Center for CANONICAL |SCIENCE) cannot reuse Elsevier’s book/article
model based on Elsevier’s own MathML version (not the standard),
complemented with specific presentational markup for chemical formulae and
simple "text" mathematics plus other add-ons, including LaTeX and stripins.
We need a more logical, simpler (for purely economic reasons), and
optimized extended approach (one that can be used beyond books and
articles).

The modifications to current MathML needed to cover "our" needs would wipe
out any practical advantage of using a "standard". Turning the Center into
a “dues-paying member of the w3c” in order to propose (extreme) changes to
the current MathML 2.0 specification is not an option either, since
according to the WG such changes are not feasible.

The MathML 2.0 specification states:

"Corporate and academic scientists and engineers also use technical
documents in their work to collaborate, to record results of experiments
and computer simulations, and to verify calculations. For such uses,
mathematics on the Web must provide a standard way of sharing information
that can be easily read, processed and generated using commonly available,
easy-to-use tools."

Unfortunately, this goal cannot be achieved in our case either. I have
discussed this with colleagues in recent days, and I agree that it is
impossible to use MathML for that.

Let me take an equation that is becoming very popular in chemistry [J.
Phys. Chem. A 2003, 107, 2657-2666] as an illustration. According to my
colleagues, the memory requirements for the explicit construction of the
Redfield equation for the simpler models of chemical systems are of the
order of 7 Gb in *optimized files* (that is not the memory requirement for
a computational algorithm solving it, which is larger of course).

Neil Soiffer estimates an x11 verbosity factor (I do not know the
details). Professor Sevcik and David Tam report verbosity factors of
6.9--113.4 for more realistic content MathML equations.

Therefore, the *same* equation would need on the order of 48--791 Gb if
encoded in MathML 2.0!
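
As a rough check of this arithmetic, assuming the encoded size is simply
the 7 Gb baseline multiplied by each quoted verbosity factor (a
simplification, since the factors were measured on other equations):

```latex
\begin{align*}
7\ \text{Gb} \times 6.9   &\approx 48\ \text{Gb}   && \text{(lower factor reported by Sevcik and Tam)} \\
7\ \text{Gb} \times 11    &= 77\ \text{Gb}         && \text{(Soiffer's estimate)} \\
7\ \text{Gb} \times 113   &= 791\ \text{Gb}        && \text{(upper factor, rounded down)} \\
7\ \text{Gb} \times 113.4 &\approx 794\ \text{Gb}  && \text{(upper factor, unrounded)}
\end{align*}
```

Under that assumption even the most favourable factor raises the size by
nearly an order of magnitude, and the least favourable by two.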

Yes, I know the typical answer: "MathML compresses very well". But even
after zipping it and exchanging it with a colleague, one still needs to
unzip the package to access the data it contains.

Does the MathML WG really believe that scientists would dedicate a new hard
disk (and maybe another processor) to unzipping and reading mathematical
information that was encoded and transmitted in MathML? I recognize that
the example I have obtained is first-quality research, but it is real in
the scientific community.

Thanks for the kind invitation to develop an input syntax and discuss it on
this mailing list, but since the research of recent months has identified
many flaws and limitations in the MathML specification, and seeing also
that the MathML WG is encouraging maintenance of the current design in
future specifications,

**************
I will close the CanonMath project, which was developing an input syntax
for MathML 2.0.
**************

Of course, anyone who finds some of the ideas discussed there interesting
can use them for their own input syntax.

**************
All the mathematical part of the experimental canonicalscience website
(currently done in MathML 2.0) will be eliminated in the next version. The
Canonical Science Report journal will not adopt support for MathML.
**************


I acknowledge the patient and interesting replies here from people such as
David Carlisle, Mikko Rantalainen, Paul Libbrecht, Andreas Strotmann,
Pankaj Kamthan, Luca Padovani, and you.


Juan R.

Center for CANONICAL |SCIENCE)

Received on Tuesday, 11 April 2006 18:03:05 UTC