
RE: Technical reasons for some options taken on design of MathML

From: Robert Miner <robertm@dessci.com>
Date: Wed, 12 Apr 2006 09:31:17 -0700
Message-ID: <D1EFB337111B674B8F1BE155B01C6DD6D42661@franklin.corp.dessci>
To: "David Carlisle" <davidc@nag.co.uk>, <whitelynx@operamail.com>
Cc: <www-math@w3.org>

Hi.
 
> > If someone could tell me where these millions of pages using MathML
> > reside, it would simplify testing process a lot.
>
> I don't know where they all are, but there are several hundred
> documents here
> http://www.nag.com/numeric/CL/CLdocumentation.asp

Most of the MathML content I know of is not currently published on the
Web.  It exists in backend production processes, which is why I
clarified I really meant pages and not documents.  For example, I think
four or five journals of the American Physical Society have been
produced using MathML for the last 2 or 3 years.  As a very rough
estimate, in one issue of one of those journals in a month, I see about
50 articles with a length of around 15 pages on average.  That would
amount to something like 90,000 pages after 3 years.  

As David pointed out, the US Patent Office has been churning out 1000
equations a week in patent applications for 6 or 7 years.  If there are
10 equations per page on average (a wild guess, hopefully on the
conservative side since I dimly remember from Karleen's talk he was only
talking about display equations) that is 100 pages of MathML /wk for a
total of something like another 30,000 pages. Similarly with the very
substantial enterprise publishing operations run internally by companies
like Airbus and others.
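
The arithmetic behind the two estimates above can be rechecked with a minimal Python sketch. It uses only the figures quoted in this message; the function names and the assumption of one issue per journal per month are illustrative and not part of the original estimate.

```python
# Back-of-the-envelope page counts, using only the figures quoted above.
# Assumption (not stated in the message): one issue per journal per month.
MONTHS_PER_YEAR = 12
WEEKS_PER_YEAR = 52


def journal_pages(articles_per_issue=50, pages_per_article=15,
                  issues_per_month=1, years=3):
    """Pages produced by a single journal over the given number of years."""
    pages_per_issue = articles_per_issue * pages_per_article
    return pages_per_issue * issues_per_month * MONTHS_PER_YEAR * years


def patent_pages(equations_per_week=1000, equations_per_page=10, years=6):
    """Pages of patent mathematics over the given number of years."""
    pages_per_week = equations_per_week // equations_per_page
    return pages_per_week * WEEKS_PER_YEAR * years


if __name__ == "__main__":
    for journals in (4, 5):
        print(journals, "journals:", journals * journal_pages(), "pages")
    for years in (6, 7):
        print(years, "years of patents:", patent_pages(years=years), "pages")
```

Under those assumptions the journal total comes out on the order of 100,000 pages and the patent total a little over 30,000, which is the same order of magnitude as the rough figures given above.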

I won't run through my whole back-of-the-envelope calculation that led
me to conclude there are millions of pages of MathML in existence, but
the above examples ought to suffice to show that even if I'm wrong, I'm
not wildly wrong.

That said, it is clear that what people in this discussion are
interested in are examples of MathML published on the web.  In order to
have a ready list of examples, a couple colleagues and I started a list
for our own use at

	http://www.listible.com/list/web-sites-that-serve-mathml

I can't vouch for the accuracy of what is there now, but it might be
enough of a start to be worth sharing. I would be very interested in
seeing other people add their own sites using MathML to the list.  Also,
I know several groups working on math search are gearing up to
systematically crawl the web for MathML content, and they may have
interesting stats to report about MathML published to the web soon.  If
the idea of maintaining a list of sites that publish MathML catches on,
we could easily add a link to the list from the MathML homepage.

--Robert



Robert Miner
Director, New Product Development

- our address has changed -
Design Science, Inc.
140 Pine Avenue, 4th Floor
Long Beach, California  90802
USA
Tel:  (651) 223-2883
Fax:  (651) 292-0014
robertm@dessci.com
www.dessci.com
~ Makers of MathType, MathFlow, MathPlayer, WebEQ, Equation Editor,
TexAide ~


-----Original Message-----
From: www-math-request@w3.org [mailto:www-math-request@w3.org] On Behalf
Of David Carlisle
Sent: Wednesday, April 12, 2006 8:24 AM
To: whitelynx@operamail.com
Cc: www-math@w3.org
Subject: Re: Technical reasons for some options taken on design of
MathML



> If someone could tell me where these millions of pages using MathML
> reside, it would simplify testing process a lot.

I don't know where they all are, but there are several hundred documents
here
http://www.nag.com/numeric/CL/CLdocumentation.asp


> Authoring tool makers constitute the only part of MathML community
> that could be happy with artificially bloated syntax. Markup that
> being unreadable and unprocessable by humans, forces people to buy
> WYSIWYG toys, is perfect solution for commercial software makers,
> who if I am not mistaken played crucial role in making "political
> decision" that resulted current MathML syntax.

For what it's worth, none of the MathML expressions in the documents on
the NAG site have been processed by a WYSIWYG editor. Most of the older
ones were converted from TeX, and current maintenance and new authoring
is done in a more or less MathML syntax directly in Emacs.
The extensions to MathML used internally mainly relate to content
MathML rather than presentation, as the set of empty elements designed
for common "K-12 functions" doesn't really apply to the functions in our
library, and it's just more convenient to use <apply><Ai/> than
<apply><csymbol>Ai</csymbol>, but this shorthand is easily expanded as
part of the general transformation from our in-house DTD to
XHTML+MathML.
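
As an illustration of how small such a shorthand expansion can be, here is a minimal Python sketch. The actual in-house transformation is not shown in this message, so everything here is an assumption: the function name, the set of shorthand names (only Ai is mentioned above), and the use of un-namespaced elements to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of in-house shorthand elements; only "Ai" appears above.
SHORTHAND_NAMES = {"Ai"}


def expand_shorthand(root, names=SHORTHAND_NAMES):
    """Replace empty shorthand children such as <Ai/> by <csymbol>Ai</csymbol>.

    Only children are rewritten; the root element itself is left alone.
    """
    # Snapshot the elements first so the tree can be edited while looping.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            is_empty = len(child) == 0 and not (child.text or "").strip()
            if child.tag in names and is_empty:
                symbol = ET.Element("csymbol")
                symbol.text = child.tag
                symbol.tail = child.tail  # keep any text after the element
                parent[index] = symbol
    return root


if __name__ == "__main__":
    tree = ET.fromstring("<apply><Ai/><ci>x</ci></apply>")
    print(ET.tostring(expand_shorthand(tree), encoding="unicode"))
```

A real pipeline would also have to deal with the MathML namespace and would more likely be part of a larger document transformation; the sketch only shows the element rewrite itself.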

Converting our in-house documents from SGML-with-TeX-math-fragments to
XML-with-MathML-math-fragments was of course a lot of work, but has
shown a lot of benefit: the mathematics is far more consistently marked
up now (TeX is so forgiving to authors:-) and the documents can be far
more easily re-purposed. Mathematical expressions originally just
intended for documentation are now used in code generation. Rather than
just documenting constraints on some parameter, we can generate the code
that checks the constraint. Note this is far easier in MathML, where
every operator is explicitly tagged, than in some suggested alternatives
that make far more use of inline untagged text.
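
To make the point about explicit tagging concrete, here is a minimal Python sketch of generating a constraint check from content MathML. It is not the generator described above, which is not shown in this message; the function names, the small set of supported relations, the un-namespaced input, and the choice of Python as the output language are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Content MathML relation elements mapped to Python operators.
RELATIONS = {"lt": "<", "gt": ">", "leq": "<=", "geq": ">=",
             "eq": "==", "neq": "!="}


def to_python(node):
    """Translate a small subset of content MathML into a Python expression."""
    if node.tag in ("ci", "cn"):
        return (node.text or "").strip()
    if node.tag == "apply" and len(node) > 0:
        operator, *args = list(node)
        if operator.tag in RELATIONS and len(args) == 2:
            left, right = (to_python(arg) for arg in args)
            return f"({left} {RELATIONS[operator.tag]} {right})"
        if operator.tag == "and" and args:
            return "(" + " and ".join(to_python(arg) for arg in args) + ")"
    raise ValueError(f"unsupported element: {node.tag}")


def constraint_check(mathml):
    """Return Python source that raises if the constraint does not hold."""
    condition = to_python(ET.fromstring(mathml))
    return f"if not {condition}:\n    raise ValueError({condition!r})"


if __name__ == "__main__":
    print(constraint_check("<apply><geq/><ci>n</ci><cn>1</cn></apply>"))
```

Because the operator is the first child of each apply element, the translation is a plain tree walk with no guessing about what a run of untagged text means.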

The implication that the Working Group (currently Interest Group) is
dominated by makers of commercial WYSIWYG systems is simply false.  I've
hardly ever used a WYSIWYG system and certainly have never written
one. The Working Group has always had strong representation from
universities, math societies and standards bodies, potential users of
MathML documents, as well as computer algebra systems and, yes, makers of
commercial math editors. One of the main reasons that I originally got
involved with this project was that I was interested in the possibilities
that could be achieved by getting old TeX hackers like myself in the
same room as people from Microsoft, Maple, Mathematica, AMS, Design
Science (MathType), and many other interested organisations and
companies and coming up with a syntax for mathematical expressions on
which everyone could agree.  (Which isn't of course the same as saying
every member of the committee thought every aspect of MathML was
perfect.)

> From the first glance it looks like I have to pay more for bandwidth
> (MathML markup is several times heavier), 

Experience shows (despite your impressive efforts with CSS-only
rendering) that people are and were not happy with the typesetting
quality of such a mechanism and in practice if they don't use MathML
they tend to use TeX generated bitmap images. You typically (but
probably not always) _save_ space by switching from bitmap images to
MathML. You certainly gain a lot in typesetting quality for both screen
and print rendering from the browser.



> If number of MathML content on web
> will grow significantly it will be impossible to make drastical
> changes in MathML

Mathematics in US Patent documents has been coded in MathML for
some years (this was a very large number of documents growing at a very
fast rate when I last heard the details some years back)
http://www.mathmlconference.org/2000/Talks/karleen/
and there are other similar large organisations (including NAG) with large
numbers of MathML documents. Do you really think making incompatible
changes to any language after 8 years of public use is something to be
considered lightly?  Removing elements isn't really an option. If
necessary, new features could be added and old ones "deprecated", but
look at HTML: that's had deprecated elements for years, and it's not
clear whether the experiment in HTML 4 (copied into XHTML 1.0) of having
parallel strict and transitional versions with and without the
deprecated features was a success, or whether it just spread confusion.

David


Received on Wednesday, 12 April 2006 16:31:31 GMT
