
RE: Technical reasons for some options taken on design of MathML

From: Robert Miner <robertm@dessci.com>
Date: Wed, 12 Apr 2006 09:03:42 -0700
Message-ID: <D1EFB337111B674B8F1BE155B01C6DD6D42638@franklin.corp.dessci>
To: "White Lynx" <whitelynx@operamail.com>
Cc: <www-math@w3.org>

Hi. 

White Lynx wrote:

> I am 100% sure that if Math WG would do nothing at all and HTML WG
> would just add couple of lines
> <!ENTITY % Misc.class "| dformgrp | dformula | formula">
> <!ENTITY % math PUBLIC "ISO 12083:1998//DTD Math//EN"
> "iso-12083-math.dtd">%math;
> to XHTML DTD we would have much better web today. For me it is
> crystal clear that
> math community would quickly embrace ISO 12083 approach.

I disagree.  I don't think this would accomplish much of anything.  To
embrace the ISO 12083 approach, someone would have to write all the
software to make ISO 12083 useful.  Just adding those three lines won't
mean that suddenly ISO 12083 is implemented in IE, Safari, Firefox and
Opera, nor would it address the really intractable problem of XHTML /
HTML incompatibility in browsers.  Nor would it mean suddenly you get
ISO 12083 output from Open Office and Word + MathType.  Nor would there
magically appear publishing workflow software based on ISO 12083.  ISO
12083 has been around nearly 20 years, and in the last decade, I know of
more ISO 12083 publishing software that has been scrapped than has been
implemented.  Nor would computer algebra systems support ISO 12083 (nor
could they), nor backend CMS and assessment systems, TeX translators,
or commercial and open source rendering toolkits.  The list goes on and
on.  

As someone who makes a living working with real customers to use and
publish materials containing mathematics, it is crystal clear to me that
unless you have a critical mass of these pieces, you don't have a viable
solution for most organizations.  Could ISO 12083 have hit critical
mass?  It might have been possible.  There were serious discussions
about basing MathML on ISO 12083 in the mid-90s, but the majority
opinion was against it, for a wide variety of reasons.  The American
Institute of Physics, Softquad, Arbortext and others made a push to
develop software, with the abortive attempt to develop the Panorama
browser (remember that?) but the effort fizzled out.  After that, MathML
went in the XML direction rather than SGML, and aimed at a wider range
of uses besides publishing, and the train left the station.   

As a result, today when a customer approaches us with a need to build a
web-based distributed authoring system for an XML repository of test
questions to be delivered on the web, in print and in accessible
formats, there are possible MathML-based solutions.  Do they have warts
and gaps?  Sure. But the obstacles to putting in place a functionally
equivalent ISO 12083-based system are practically insurmountable at this
point.  You would have to write a mountain of software for a market that
doesn't really exist.

As for the "math community" embracing any XML/SGML standard, I think you
are overly optimistic there too. I think the large majority of members
in the "math community" don't have a problem with authoring.  Among
researchers, they spent years learning TeX, TeX is a nearly optimal
authoring solution for research articles, the TeX esthetic is deeply
embedded in the scholarly psyche, and PDF is an easy, pervasive way of
sharing output.  Not to mention, it's all free.  In short, they don't
feel like they have a problem.  Among non-TeX users, the situation is
similar.  They mostly use Word with the free Equation Editor or
sometimes MathType if they are hardcore.  They learned how to use it
years ago, all their colleagues use it, and it's free to them.  They
don't feel like they have a problem either.  There are certainly authors
who aren't in these categories, but they are a small minority.  

The problem comes when someone says, gee, wouldn't it be nice if we
could have the same kinds of knowledge management capabilities for
scientific documents that we have for text?  Wouldn't it be nice if we
could build the same kinds of digital libraries for science that we do
for other disciplines?  Wouldn't it be nice if we could build that
distributed authoring system, or validate our documents, or
automatically extract metadata, or publish to multiple media, or
search effectively, or make accessible versions, or interact with
computational back ends?  Oops, all our source documents seem to be a
tangled mess of TeX files, PDFs or scans, or Word documents.  Gosh,
wouldn't it be nice if there were a single math format that I can
convert all that content into, that a whole bunch of tool makers across
that whole huge spectrum of needs would support so there would be a
chance of creating that kind of functionality around math and science
documents?  Oh, wait, there is such a standard and it's called MathML...

I sympathize with the main thrust of the arguments that you and Juan are
advancing.  The verbosity, difficulty of hand authoring, and various
idiosyncratic quirks of MathML have been a drawback from the beginning.
If you focus on one part of the spectrum of mathematical communication,
it is clear there are better solutions possible, and I wouldn't even
dispute that, from a theoretical point of view, one could probably do a
better job addressing the whole spectrum than MathML does.
But I don't think you are properly acknowledging the breadth and
momentum of MathML usage, and thus don't have very good perspective on
what is involved in radical change.  You say:

> Once the main design principles remain the same, I see simply no space
> for any progress.

But what I see is the other side of the coin.  I don't really see any
space for progress with browser vendors, for example, or many of the now
largely frozen technologies with which math solutions must integrate.
After struggling for years and years, we do have at least *some* browser
support for MathML.  To get even that required people like Roger Sidje
spending years working for free on the Mozilla engine, and Design
Science writing off a couple hundred thousand dollars in MathPlayer
development to make a free IE plug-in. And even then it required several
letter-writing campaigns from publishing executives and prominent
scientists to persuade the Mozilla folks to leave on the math support,
since at one point they told us they doubted math support was even worth
the added download size.  And the same thing applies in many other tool
categories.  In virtually none of those battles were the technical pros
and cons of MathML an issue, other than once in a while, it was key that
MathML was XML so that it fit into some generic extension architecture.
If one were to impose radical change, at best it would be ignored for
half a decade while a new generation of altruists were recruited to
reimplement all the math software we have now to use it.  At worst, it
would kill off MathML-based solutions, without replacing them with
viable alternatives for many years. The annals of software development
are littered with stories about organizations that abandoned incremental
changes to flawed but working systems to make a radical generational
leap, only to see both the old and new systems slip into obscurity.  

If by "progress" you mean radical change, I would strongly argue that
the window of opportunity has passed.  If you came up with the ultimate
XML math encoding tomorrow (whatever that means) I think you would have
a very hard time persuading many tool makers to switch.  My own view is
that sooner or later, the basic HTML paradigm that rules the web now
will be replaced by something else.  For example, I look at leaks coming
out of Adobe about some combined PDF/Flash type thing.  Or maybe in a
decade, we will all use mobile devices with a completely different UI
model.  To me, that next paradigm shift, when it comes, will be the next
real opportunity to revamp the entire spectrum of electronic scientific
communication applications.  And I acknowledge the benefit from and
interest in trying to envision what such a solution might be today.  But
it won't be MathML. As you say, there simply isn't space for radical
change in MathML, any more than there is for HTML. 

--Robert


Robert Miner
Director, New Product Development

- our address has changed -
Design Science, Inc.
140 Pine Avenue, 4th Floor
Long Beach, California  90802
USA
Tel:  (651) 223-2883
Fax:  (651) 292-0014
robertm@dessci.com
www.dessci.com
~ Makers of MathType, MathFlow, MathPlayer, WebEQ, Equation Editor,
TexAide ~


-----Original Message-----
From: www-math-request@w3.org [mailto:www-math-request@w3.org] On Behalf
Of White Lynx
Sent: Wednesday, April 12, 2006 6:55 AM
To: www-math@w3.org
Subject: Re: Technical reasons for some options taken on design of
MathML


David Carlisle wrote:
> As for popular, it's clear that MathML has been vastly more popular than
> any previous SGML or XML markup for mathematics.

I am 100% sure that if Math WG would do nothing at all and HTML WG would
just add couple of lines
<!ENTITY % Misc.class "| dformgrp | dformula | formula">
<!ENTITY % math PUBLIC "ISO 12083:1998//DTD Math//EN"
"iso-12083-math.dtd">%math; 
to XHTML DTD we would have much better web today. For me it is crystal
clear that
math community would quickly embrace ISO 12083 approach. Consider for
instance:
<p>
This is paragraph that contains famous formula
<formula>
E = mc<sup>2</sup>
</formula>
</p>
It fits much better in the general scope of an HTML document than
<p>
This is paragraph that contains famous formula
<math mode="display" xmlns="http://www.w3.org/1998/Math/MathML">
<mi>E</mi><mo>=</mo><mrow><mi>m</mi><msup><mi>c</mi><mn>2</mn></msup></mrow>
</math>
</p>
Judge for yourself: which one would have the larger user community, the
simple, easy to learn and easy to use ISO standard or the bloated,
controversial W3C recommendation?
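
To put a rough number on the comparison above, here is a minimal sketch
that measures the two fragments exactly as quoted in this message, with
the whitespace between tags removed. The helper names markup_size and
size_ratio are made up for illustration, and a single short formula says
nothing definitive about larger documents or about compressed transfer.

```python
# Illustrative sketch only: compare the size of the two formula fragments
# quoted in this message. Whitespace between tags is left out so that only
# the markup itself is counted.

# ISO 12083-style fragment, as quoted above.
ISO_12083 = "<formula>E = mc<sup>2</sup></formula>"

# MathML fragment, as quoted above (attributes kept exactly as written).
MATHML = (
    '<math mode="display" xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>E</mi><mo>=</mo><mrow><mi>m</mi>"
    "<msup><mi>c</mi><mn>2</mn></msup></mrow>"
    "</math>"
)


def markup_size(fragment: str) -> int:
    """Return the size of a markup fragment in bytes when encoded as UTF-8."""
    return len(fragment.encode("utf-8"))


def size_ratio(heavier: str, lighter: str) -> float:
    """Return how many times larger `heavier` is than `lighter`."""
    return markup_size(heavier) / markup_size(lighter)


if __name__ == "__main__":
    print("ISO 12083 bytes:", markup_size(ISO_12083))
    print("MathML bytes:   ", markup_size(MATHML))
    print("ratio:          ", round(size_ratio(MATHML, ISO_12083), 2))
```

The same two functions can be pointed at any pair of equivalent fragments
to see how the ratio changes for more complex formulae.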


> and other
> SGML DTD for mathematics (eg Elsevier's) were pretty much only used by
> large publishing houses.

1. SGML itself was not widely used (unlike XML).
2. Web delivery of SGML documents was a problem (browsers did not
support it, and ability to display mathematics in browsers is a key
motivation for switching to XML) 
3. Lack of Unicode support in SGML tools made it less attractive

> many of them are switching or thinking of
> switching to MathML and MathML is used in so many more contexts.

The more people are switching to MathML the more important internal
problems of MathML become. 
Unfortunately W3C does not want to recognize and address these problems
and such an attitude undermines development of scientific web.

> If it's so clear that ISO 12083 is superior, why was it not picked up to
> be used in Computer algebra systems
> (mathematica and maple both support mathml)

ISO 12083 is the Electronic Manuscript Standard; as the name suggests, it
is oriented toward electronic publishing.

> or word processors (Word+MathPlayer, OpenOffice, AbiWord and
> SciWriter for example all support MathML)

Authoring tool makers constitute the only part of the MathML community
that could be happy with artificially bloated syntax. Markup that, being
unreadable and unprocessable by humans, forces people to buy WYSIWYG toys
is a perfect solution for commercial software makers, who, if I am not
mistaken, played a crucial role in making the "political decision" that
resulted in the current MathML syntax.

> Plus of course support in web browsers.
The reason why ISO 12083 is not a good candidate for being supported by
browsers is technical. ISO 12083 was designed before the current web
standards. It could be used with other ISO standards (SGML and DSSSL), but
once one replaces ISO SGML+DSSSL with W3C XML+CSS, which is a completely
different framework, one immediately realizes that ISO 12083 has to be
changed in order to work consistently with the rest of the web standards
(including CSS and DOM). Resolving this problem is one of the things that
the Math WG could do, but instead MathML inherited all the incompatibility
problems of ISO 12083 and added even more headaches for browser developers.
I wonder why? At the same time I think that browser vendors should be
involved in the WG somehow, otherwise you tend to write things that by
design are not suitable for implementing in a CSS rendering engine, which
is the core of any modern browser. As a result MathML is implemented "on
top" without decent integration with the rest of the engine. This is what
we see in MathML-enabled Gecko, in MSIE with MathPlayer, in Opera with
UserJS (an appropriate UserJS is available for recent builds of Opera 9)
and in Prince Formatter (work on MathML in Prince is still in progress).
In these implementations MathML is modified internally to make it suitable
for formatting in a CSS rendering engine, and needless to say such
modification affects CSS and DOM, which are either partly disabled or no
longer work as they are supposed to.

> We are currently looking for requirements for improvements to MathML for
> a possible MathML3, but clearly MathML is not going to make a backward
> incompatible change to its script markup.

Once the main design principles remain the same, I see simply no space
for any progress.

> I suspect that what you want to do is design your own
> XML DTD and then have stylesheets that translate this to MathML for
> public use.

Do you mean that MathML is more suitable for public use than anything
else?
Let us consider a simple example. With a custom DTD I can type my articles
in a plain text editor like EmEditor or Textpad without using WYSIWYG tools
(WYSIWYG is a good solution for teenage girls who like to weave dreams,
thinking that what they see is what they get, but is less attractive for a
scientist who wants to encode formulae accurately),
I can put them on the web like this: http://geocities.com/chavchan/xml10/
and I can view them in almost any browser, including Opera, Safari,
Mozilla and MSIE (the argument is valid for simple articles; too complex
formulae would require better CSS support in Mozilla, Safari and MSIE.
Unfortunately the day when we'll get interoperable CSS2.1 support in all
major browsers is still several years away, but progress is clear).

Now why should I translate them into MathML, and how exactly would it
improve user experience?
At first glance it looks like I have to pay more for bandwidth
(MathML markup is several times heavier),
site visitors can drink some coffee while the page is loading (here are
examples: http://hermes.aei.mpg.de/ ; can any browser render MathML
incrementally?
In the XML+CSS approach at least two browsers, Opera and MSIE, render the
page incrementally),
I will add a "Please download MathPlayer plugin" message for MSIE users,
ask Safari users to download another browser specially for viewing my
site,
ask Opera users to get another UserJS patch and Mozilla users to
download extra fonts.
What a great browsing experience! And this is only one part of the
artificially created problem known as MathML.
So why should one translate something to MathML?

Robert Miner wrote:
> For a credible, responsible standards organization such as W3C, that
> imposes strenuous backwards compatibility constraints.
The XLink, MathML and XSL FO recommendations left me no chance to consider
W3C a credible and responsible organization.

> W3C cannot and should
> not favor the interests of one particular interest group over others.
Then why does it favour the interests of authoring tool makers and ignore
the rest of the interested parties, including but by no means limited to
authors who need reasonable, human processable markup, browser developers
who need markup compatible with the rest of the standards issued by the
same organization (W3C), users who would be happier with a less bloated
solution and database administrators who compress even LaTeX files to
save space?

> This is particularly true, as I explained in an earlier message, since
> W3C is directly accountable to its dues-paying member organizations, and
> only indirectly accountable to individuals with no official standing,
> such as yourself.
Are browser vendors not dues-paying organizations?

> But I encourage you take the trouble to
> understand the interests of the stakeholders in the discussion, and the
> constraint that apply when considering changes to MathML.

Personally I understand the interests of the stakeholders but don't plan to
respect any "interests" that clearly undermine development of the
scientific web, reverse the evolution of mathematical markup and may bring
irreparable damage to the scientific community. If the amount of MathML
content on the web grows significantly it will be impossible to make
drastic changes in MathML, and thus all current problems of this markup
language will stay with us forever in the form of a heavy legacy that
restricts the scientific web from rushing ahead. When standards become
much worse than they were ten and even twenty years ago it is time to ring
the bells.

> Below I claim there are
> millions of documents that use MathML. I should have said millions of
> pages.
Some time ago I was searching for scientific articles with MathML
formulae to test
UserJS implementation of MathML in Opera and failed to find anything but
http://hermes.aei.mpg.de/
If someone could tell me where these millions of pages using MathML
reside, it would simplify testing process a lot.

Bruce Miller wrote:
> However, I think that your comments would be more
> productive
> --- assuming you're actually
> trying to have a constructive discussion
Productive in what sense? It is clear that W3C will not change the basic
design principles of MathML no matter how constructive, consistent and
relevant our comments are. W3C left no space for productivity, and we are
forced either to seek niches in current web standards
(XML/CSS/ECMAScript/DOM) or to take a more radical approach and create
something completely new, like Juan wants to do.
In spite of the fact that the Math WG is unlikely to change anything, I
reserve the right to express my concerns regarding the design of MathML.




Received on Wednesday, 12 April 2006 16:04:00 GMT
