RE: Technical reasons for some options taken on design of MathML

 
White Lynx,

I'd like to ask you a couple questions in a slightly different
direction.  I understand that you view MathML as a step backward from
both TeX and older SGML DTDs.  And though you haven't quite said so in
as many words, I think you are probably willing to admit that at least
for non-browser uses, there is a community of MathML users with a vested
interest in the current standard, and that won't lightly abandon it or
permit changes so radical as to negate their investments in it.  As we
have both said, there probably isn't much room there for progress or
change (depending on whether you think the status quo is good or
bad).  So we might as well agree to disagree about the usefulness of
MathML for non-browser applications.

However, I think we do both agree (and probably everyone else on the
list too) that the situation in browsers is not satisfactory.  So let's
focus on that for a while, since CSS-rendering of math is one of your
areas of expertise.  Bear with me while I recap a bit of background:

<BEGIN BACKGROUND>

The long-time goal I think we all are working toward is free,
ubiquitous, high-functionality rendering of math in browsers. The hard
part of that is "high-functionality" since GIFs of math notations
satisfy the free and ubiquitous part.  High-functionality can mean a lot
of things.  When I was still young and idealistic about it, I wanted it
to mean

- matches surrounding text rendering in a browser
- prints at full resolution
- available for cut and paste (ideally into all the reasonable math
contexts, like CAS software, common editors and document preparation
systems, etc)
- searchable
- accessible

In other words, I basically want the same list of properties for math
that are currently available for regular text.  After a decade of
working on it, however, it is clear that it is hard to achieve all of
these things simultaneously.  But if you give up one or two, the
situation becomes much easier.  Here is my personal analysis:

MathML support in browsers:

     Does pretty well on the functionality list, at the expense of being
     ubiquitous.  Searchability and accessibility are not quite there, but
     there is a great deal of activity in these areas, and they soon will
     be there.  Similarly, cut and paste isn't as universal as one would
     like, but there is a lot of functionality there.  The real problem,
     as we all know, is that you have to install extra stuff into IE,
     Opera and Firefox to get MathML rendering at all, and then you have
     to tune your page for a specific browser, so pages aren't
     cross-platform.

Other native math support in browsers:

     Good luck.  There is a school of thought that believes that if there
     were a "good" enough math encoding, browser vendors would all
     implement it.  I personally reject that view, having put in a lot of
     effort lobbying browser vendors over the years.

Math support via CSS + JavaScript:

     The main examples in this category that I'm thinking of are your
     work, David Carlisle's work, and Davide Cervone's JSMath package.
     There are probably others too.  And we should allow SVG and any
     other widely deployed browser rendering technology in this category
     too.

     These solutions do better on ubiquitous rendering, in exchange for
     somewhat degraded functionality.  The main problem with ubiquitous
     rendering is having adequate fonts on the client machine, and the
     differences between major browsers in terms of XML, CSS and
     JavaScript support.  The main tradeoffs in functionality are in
     accessibility, searchability and cut and paste.  There are some
     theoretical limitations, but mostly it is a function of getting a
     lot of software implemented and deployed.  The rendering quality is
     also a little weak.  JSMath, for example, does nice rendering, but it
     comes at the expense of being a bit slow.

Math support via server-side content negotiation:

     If one is willing to set up a smart enough server, one can encode
     the math however one wants, and serve the highest functionality page
     content a given browser can handle.  The actual display format could
     be either of the previous two, or PDF or whatever.  This approach
     clobbers the ubiquitous reader access problem, and provides high
     functionality for readers that care.  The trade-off here is that
     this is much harder for content publishers.  To make this really
     work requires the development and deployment of free, ubiquitous
     *server* software.  If something like this were built into Apache,
     say, we wouldn't be having this conversation.
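
To make the negotiation idea a little more concrete, here is a minimal
sketch in Python of how a server might pick a delivery format from the
HTTP Accept header.  The function names, the format labels and the
fallback policy are all assumptions made for illustration; this is not
the PHP solution mentioned in this message, and a real deployment
would also have to check whether a browser that accepts XHTML can
actually render MathML.

```python
def parse_accept(header):
    """Return {media_type: q} for a simple HTTP Accept header."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # HTTP default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_math_format(accept_header):
    """Pick the richest format the client claims to handle.

    Hypothetical policy: XHTML+MathML only when the client lists
    application/xhtml+xml explicitly, then HTML rendered with
    CSS/JavaScript, and GIF images as the last resort.
    """
    prefs = parse_accept(accept_header)
    if prefs.get("application/xhtml+xml", 0.0) > 0.0:
        return "xhtml+mathml"
    # A bare "*/*" says nothing about MathML, so it only counts for HTML.
    html_q = prefs.get("text/html", prefs.get("*/*", 0.0))
    if html_q > 0.0:
        return "html+css"
    return "gif"
```

With this policy, a request carrying
"application/xhtml+xml,text/html;q=0.9" would be served the
XHTML+MathML version, while a request that only sends "*/*" would get
the HTML + CSS version.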


My personal view at present is that native math
support in browsers isn't going to change much any time soon.  The usual
suspects will keep working away at it, and there will likely be some
incremental improvement, and there is a long-shot chance that Safari
will add MathML support, which would make a difference.  But I don't see
the HTML/XHTML problem going away, since IE7 doesn't seem to be any
different than IE6 in this regard, so that basic, highly-troublesome
problem will likely be with us for the next 5 or more years.

I really like the server-side approach, and this is where I'm putting my
time and resources.  For example, I'm hoping to release an open source
PHP-based solution developed under a research grant soon. Given the
dominance of Apache in the server space, this seems most achievable to
me, since it requires writing and deploying the least amount of
software, and persuading the smallest group of people to install it.
But, it is in some sense orthogonal to the client-side approaches.
Unless we are all happy with GIFs of formulas (which we aren't), one
still needs client-side targets for the server-side solution to hit.

So that leaves CSS/JavaScript/SVG/whatever client-side solutions. As I
have argued before, the place where I see MathML taking hold and
thriving today is in back-end production.  So if there is some better
client-side display technology that I can transform my XHTML+MathML
source into, I'm fine with that.  If it happens to serve the needs of
folks that like to author web pages in Emacs, so much the better.  I'll
still advocate for XML+MathML as the best all-around format.  But I'm
not a nut -- it would clearly be a good thing if there were an easy way
to hand author web pages that "just worked" when uploaded (to a properly
configured server anyway...)  I know I've certainly authored many more
"throw away" web pages in my life than pages where the content merited
putting real effort into it.

Over the last couple of years, this Math IG, and Bruce Miller in
particular, spent a lot of time and effort talking to the CSS working
group about possible enhancements to CSS to help with math rendering.
That was done with MathML in mind, but the bulk of the discussion
focused on abstract rendering properties.  Moreover, Bert Bos, who has
long been involved with CSS work at W3C, is now the staff contact for the
Math activity.  In the end, the CSS group concluded it didn't want to
add anything special to CSS 3 for math support.  To really do math
typography "right" would require too much work, and none of the more
feasible half-measures seemed worth it.  That is an over-simplification,
and CSS is still open to hearing proposals.  But basically, we should
think in terms of doing math with what is there now, and possibly with
the generic enhancements proposed for CSS 3.  However, unlike the native
browser situation, I think there really is still room for progress with
this approach.

<END BACKGROUND>


Given that background, my questions to you are:

1)  If you focus solely on doing the best cross-platform CSS+JavaScript
rendering of math possible, what should the markup for the math in the
page look like?  Divs and spans and tables?  Some other collection of
math tags?

2)  If you focus solely on changes to MathML that would improve
CSS+JavaScript rendering of it, what would those changes be?  Redoing
the table model and eliminating mfenced?  Replacing mmultiscripts with
something else?  In other words, what is the short list of things that
were most problematic about MathML for CSS rendering?

--Robert 


Robert Miner
Director, New Product Development

- our address has changed -
Design Science, Inc.
140 Pine Avenue, 4th Floor
Long Beach, California  90802
USA
Tel:  (651) 223-2883
Fax:  (651) 292-0014
robertm@dessci.com
www.dessci.com
~ Makers of MathType, MathFlow, MathPlayer, WebEQ, Equation Editor,
TexAide ~


-----Original Message-----
From: www-math-request@w3.org [mailto:www-math-request@w3.org] On Behalf Of White Lynx
Sent: Thursday, April 13, 2006 10:34 AM
To: www-math@w3.org
Subject: Re: Technical reasons for some options taken on design of MathML


Bruce Miller wrote:
> Of course, you'd be kinda
> obligated to demonstrate that these new principles
> actually can solve the _whole_ problem better,
> or whatever you think is the important part of the problem.
> And in fact, it would need to be dramatically better:
The irony is that I don't see the space to make things drastically
better.
Having an isolated markup language without some extensible publishing
framework that can handle maths (like SGML+DSSSL) can hardly be
considered a step forward. In this respect MathML stands conceptually
behind both SGML/DSSSL/EMS and TeX/LaTeX.
Achieving drastic improvement in the XML + CSS approach is also
difficult: conceptually it provides a good publishing framework, but
technically it stands behind SGML (usually XML syntax is more verbose
due to stricter rules) with DSSSL (no math-oriented properties in CSS).
However, XML 1.0/CSS 2.1 provides sufficient functionality to pick the
right direction and add functionality incrementally as the capabilities
of the basic XML + CSS framework grow stronger.

> Were browser developers slow to implement MathML because MathML
> was so poorly designed?  Or because "Who cares about math?"

If non-browser implementations of MathML flourish (I did not try myself,
but according to comments posted on this list MathML is widely adopted),
then the "who cares" argument should not be the key problem, especially
when some browser developers are former mathematicians themselves.
In the case of authoring tools, converters and computer algebra systems
one deals with MathML as isolated markup, while browsers deal with a
bunch of complex standards that should work together.
So both requirements and design affect the quality of implementation.
If the design were suitable for implementation in the browser's core,
the quality would be different compared to what we have today.


> Nevertheless, if you find that MathML is truly unredeemable,
> then you really should propose an alternative, and work
> to get it adopted.

An alternative exists canonically, as a niche in XML + CSS. In this
respect there is no need to propose anything, especially when any
proposal is considered proprietary stuff intended to undermine existing
standards.

> As you know, MathML also has a goal of representing the "meaning"
> of math, or at the least its structure, beyond mere presentation.
> You might reasonably debate whether it _should_ have that goal,
> or whether it meets it, but it's there.

Semantics and structure could be orthogonal layers of a single markup.
I don't see the problem here (apart from documenting and updating a huge
content vocabulary).
<formula>E = mc<sup>2</sup></formula>
could be enhanced with content without affecting the structural layer:
<formula>E = mc<sup role="power">2</sup></formula>
or, in more detail,
<formula>
<group role="Energy">E</group> =
<group role="mass">m</group><group role="speed of light">c</group><sup role="power">2</sup>
</formula>
or
<formula pronounce="Energy is mass multiplied by square of speed of light">
<group role="Energy">E</group> =
<group role="mass">m</group><group role="speed of light">c</group><sup role="power">2</sup>
</formula>
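
As a rough check of the "orthogonal layers" claim for the attribute-only
variant, the following Python sketch strips the semantic attributes and
compares the result with the plain structural markup. The helper name
strip_semantics and the choice of "role" and "pronounce" as the semantic
attributes are assumptions taken from the examples above, not part of
any standard.

```python
import copy
import xml.etree.ElementTree as ET

PLAIN = '<formula>E = mc<sup>2</sup></formula>'
ANNOTATED = '<formula>E = mc<sup role="power">2</sup></formula>'


def strip_semantics(elem, attrs=("role", "pronounce")):
    """Return a copy of elem with the (hypothetical) semantic attributes removed."""
    clone = copy.deepcopy(elem)
    for node in clone.iter():
        for name in attrs:
            node.attrib.pop(name, None)
    return clone


plain = ET.fromstring(PLAIN)
annotated = ET.fromstring(ANNOTATED)

# The semantic layer can be read off on its own...
roles = [(node.tag, node.get("role"))
         for node in annotated.iter() if "role" in node.attrib]

# ...and removing it leaves exactly the structural markup.
same = ET.tostring(strip_semantics(annotated)) == ET.tostring(plain)
```

Here roles is [("sup", "power")] and same is True. The <group> variants
are a different case, since they add wrapper elements and so do change
the element tree.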

> Thus, for example,
> encapsulating the base of sub & superscripts is important;
> a simple <sub> tag doesn't do this.

What if there are rules that allow you to identify base without
enclosing it explicitly?
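
One naive rule of this kind, sketched in Python purely as an
illustration: take the base of a script element to be the last
non-space character of the text right before it, or, if there is no
such text, the content of the preceding sibling element. The function
name implied_base and the rule itself are assumptions; a usable rule
would need to handle multi-character tokens, fences and so on.

```python
import xml.etree.ElementTree as ET


def implied_base(parent, index):
    """Guess the base of the script element parent[index].

    Hypothetical rule: last non-space character of the immediately
    preceding text, else the text content of the preceding sibling.
    """
    if index == 0:
        previous = None
        preceding_text = parent.text or ""
    else:
        previous = parent[index - 1]
        preceding_text = previous.tail or ""
    stripped = preceding_text.rstrip()
    if stripped:
        return stripped[-1]
    if previous is not None:
        return "".join(previous.itertext())
    return None  # nothing precedes the script, so no base can be inferred


flat = ET.fromstring('<formula>E = mc<sup>2</sup></formula>')
grouped = ET.fromstring(
    '<formula><group>m</group><group>c</group><sup>2</sup></formula>'
)
```

Under this rule both implied_base(flat, 0) and implied_base(grouped, 2)
give "c", without the author having to wrap the base explicitly.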

> So, where do we go from here?  Redesign MathML from the ground
> up?  Redesign CSS from the ground up?  

Redesigning CSS means developing completely new browsers and
changing the web. Redesigning MathML will not affect browsers and the web.
It depends on which is more important: browsers or tools.
For me the web is more important; there are other opinions, however.

> Lobby for enhancements,
> clarifications and even deprecations in both MathML and CSS?
I don't expect to get much in this way.

>> I am 100% sure that if Math WG would do nothing at all and HTML WG would just add couple of lines
>> <!ENTITY % Misc.class "| dformgrp | dformula | formula">
>> <!ENTITY % math PUBLIC "ISO 12083:1998//DTD Math//EN" "iso-12083-math.dtd">%math;
>> to XHTML DTD we would have much better web today. For me it is crystal clear that
>> math community would quickly embrace ISO 12083 approach.
Robert Miner wrote:
> I disagree.  I don't think this would accomplish much of anything.
> To embrace the ISO 12083 approach, someone would have to write all the
> software to make ISO 12083 useful.  Just adding those three lines won't
> mean that suddenly ISO 12083 is implemented in IE, Safari, Firefox and
> Opera, nor would it address the really intractable problem of XHTML /
> HTML incompatibility in browsers.  Nor would it mean suddenly you get
> ISO 12083 output from Open Office and Word + MathType.  Nor would there
> magically appear publishing workflow software based on ISO 12083.  ISO
> 12083 has been around nearly 20 years, and in the last decade, I know of
> more ISO 12083 publishing software that has been scrapped than has been
> implemented.  Nor would computer algebra systems support ISO 12083 (or
> could they) nor backend CMS and assessment systems or TeX translators
> nor commercial and open source rendering toolkits.  The list goes on and on.

a. Building a user community around a simple, human-processable markup
language is much easier; it forms quite naturally.
b. Transforming documents from LaTeX and SGML DTDs (AAP Math, Elsevier's
DTD etc.) to ISO 12083 is easier than transforming them to MathML.
c. Implementing ISO 12083 basically means implementing the "embellishments"
model and fence resizing; the rest can be achieved by applying a CSS style
sheet.
d. If I am not mistaken (I never tried to use them), there were some
WYSIWYG toys with ISO 12083 support.
e. If computer algebra systems are happy with presentational MathML,
then what is wrong with ISO 12083? Could it be worse?

> But the obstacles to putting in place a functionally
> equivalent ISO 12083-based system are practically insurmountable at this
> point.  You would have to write a mountain of software for a market that
> doesn't really exist.

Today it is probably too late to talk about a revival of ISO 12083. I just
meant that if in 1998 W3C had not replaced it with MathML, today we would
have several fewer problems.

> As for the "math community" embracing any XML/SGML standard, I think you
> are overly optimistic there too.
Probably. But it does not mean that usability of XML markup does not
matter, or that social factors like building a creative user community
do not matter either. Verbose low-level markup is an obstacle.

> The problem comes when someone says, gee, wouldn't it be nice if we
> could have the same kinds of knowledge management capabilities for
> scientific documents that we have for text?  Wouldn't it be nice if we
> could build the same kinds of digital libraries for science that we do
> for other disciplines?  Wouldn't it be nice if we could build that
> distributed authoring system, or validate our documents, or
> automatically extract metadata, or publish to multiple media, or
> effective search, or make accessible versions, or interact with
> computational back ends.

And would not it be nice indeed? Why not to make things better?

> Oh, wait, there is such a standard and it's called MathML...

WOW. What's the problem then? 

> But what I see is the other side of the coin.  I don't really see any
> space for progress with browser vendors, for example, or many of the now
> largely frozen technologies with which math solutions must integrate.

Maybe boundary conditions are not perfect for integration, but they are
part of the problem, and if you don't take them into account it means
that you simply don't solve the problem.
Ignoring these conditions you can write an arbitrary number of math DTDs,
then make a "political decision" inside the WG to pick one of them, and
then hope that some day the boundary conditions will be relaxed
and your solution will be integrated with the rest.

> After struggling for years and years, we do have at least *some* browser
> support for MathML.
The point is that one could simply get a certain level of support for
free, without any struggle, if the capabilities of existing standards
were reused wisely.

> To get even that required people like Roger Sidje
> spending years working for free on the Mozilla engine, and Design
> Science writing off a couple hundred thousand dollars in MathPlayer
> development to make a free IE plug-in.
> And even then it required several
> letter-writing campaigns from publishing executives and prominent
> scientists to persuade the Mozilla folks to leave on the math support,
> since at one point they told us they doubted math support was even worth
> the added download size.

Now imagine if someone would implement inline-blocks and inline-tables
in Mozilla.
a. No one would ever try to remove support for this part of CSS 
b. Mozilla would gain de facto math support in XML + CSS framework
c. Whole web community (not just mathematicians) would benefit  from new
features
d. You would have much more freedom in improving/changing DTD as changes
in DTD would not require changes in rendering engine.

> My own view is
> that sooner or later, the basic HTML paradigm that rules the web now
> will be replaced by something else.

I expected to hear this. But if the transition from HTML to XHTML has
already taken ages and is still far from being finished (MSIE does not
support XHTML unless it is served as text/html in a backward-compatible
way, Mozilla cannot render XHTML pages incrementally), how long will it
take to change the whole web architecture?
What will you do with the huge number of existing web pages?

> To me, that next paradigm shift, when it comes, will be the next
> real opportunity to revamp the entire spectrum of electronic scientific
> communication applications.

Will I live that long?

David Carlisle wrote:
> The implication that the Working Group (Currently Interest Group) is
> dominated by makers of Commercial WYSIWYG systems is simply false.
In fact I don't really like conspiracy theories. So let's hope it is
false.

> You typically (but probably not always) _save_ space by switching from bitmap images to
> MathML.

MathML is a step forward compared to bitmaps and a step backwards compared
to either TeX or old SGML DTDs.

> Do you really think making incompatible
> changes to any language after 8 years public use is something to be
> considered lightly?
It is difficult, it is unlikely to happen and this is the reason why I
rarely post anything on this discussion list.


Received on Friday, 14 April 2006 16:59:04 UTC