Re: The disappointment and embarrassment of MathML (update)

Walter,

"MathML and Mathematica (or Maple, MathCAD, REDUCE, etc., etc.) are not
similar items that can be compared."

Why not? The objects of comparison are functionality (what they can do),
suitability (of the underlying format for transmission over the Web), 
authoring (ease of), legality (open, proprietary), interoperability (with existing 
formats), archivability (appropriateness for maintenance of a large number of 
documents), searchability (how accurate), and so on.

"The proper discussion on MathML should follow a similar argument from 
many years ago when the WWW first became popular with Mosaic.  The 
basic discussion was to decide how a user would view a web page."  

1. 'How a user would view a web page' is a usability problem (one that is also
related to accessibility and localization) and has little to do with the language.
2. The MathML Specification defines, and provides guidelines on, what constitutes
a MathML-conforming (a) document (for authors) and (b) processor (for
implementors).

"The question was between using a text editing/rendering program, such as 
MSWord or WordPerfect, with their proprietary data format or to have a 
separate but new Web renderer, such as Mosaic, and the then new text 
format HTML."

1. MSWord and WordPerfect are not text editors; they are word processors.
2. It is unclear whether that was ever the question, as the comparison is between
environments for the desktop vs. the Web. Both have their share of trade-offs,
so people haven't given up (yet) on using either and they all co-exist.

"The conclusion was that using proprietary software and their data format 
would require all users to purchase and communicate with commercial level 
software that was not easily edited or rendered by other programs in the 
future."

Yes, but see next response.

"HTML defined a text based standard that all current and future text editors 
and renders could choose to support but would free all users from buying any 
additional software. The entire WWW population could view HTML web pages 
with simple free browsers, edit the page in simple ASCII text editor, and have 
support in all current and future software."

1. Whether the software for authoring/rendering is free or commercial is
outside the scope of the HTML Specification. The specification has, for example,
no bearing on whether it 'would free all users from buying any additional
software' or whether users 'could view HTML web pages with simple free browsers.'
Credit for freely/commercially available implementations of any format,
open or proprietary, goes to those who implemented it. Open standards (such
as HTML) have free/commercial authoring/rendering environments (HoTMetaL,
Opera). Similarly, proprietary formats (such as PDF) have
free/commercial authoring/rendering environments. Some browsers that are
freely available for personal use are not free if used in a commercial setting.
Whether implementors make a piece of software freely available depends on their
internal strategy and the decisions based on it. Thus, the expectations of HTML
designers and those of the implementors do not have a direct (or even
indirect) correspondence.
2. It is quite likely that freely available authoring/rendering environments for
HTML popularized it.

"Not everyone wants to purchase [...] commercial package just to read 
someone's report and there are other users who like their text editing 
software."

Indeed.

"As a short term solution, many people render mathematics as GIF or JPEG
images on their web pages."

1. GIF, maybe; although its compression algorithm (LZW) is proprietary.
A better solution (nonproprietary, with a better compression algorithm than GIF)
is PNG. For example, WebEQ supports that.
2. JPEG, maybe not; it is for photographs (or graphics with that level of
complexity).

"Translators from TeX and other presentation based formats into HTML or 
DHTML greatly increase the degree data access."

1. Translators from TeX [...] into HTML. This is a classical issue. Mathematical
objects get translated into images (LaTeX2HTML), or symbol fonts (TtH), or a
specialized notation in a tag to be processed by a Java applet/plug-in, or ...
or MathML Presentation Markup (TtM). The prose surrounding mathematical
objects gets translated into HTML.

The degree of data access from the viewpoint of making some mathematical
content available on the Web remains neutral: before, that data was on a
desktop; now it is on the Web.

The degree of data access from the viewpoint of global availability on the Web 
increases if links are provided to it.

The degree of data access from an accurate searching viewpoint does not 
increase unless:
a. Some prose about the mathematical object being marked up, say in [La]TeX, is
included within the $ ... $ or $$ ... $$ or \[ ... \] (as applicable). The
burden of translating it to, say, HTML comments <!-- ... --> and/or
embedding it in the content of a <meta name="description"> tag remains on the
translator.
b. Extra semantically-oriented markup, to be processed by external
packages (xmltex), is included with the markup of the mathematical object in
the original (source) TeX document.

The above is mentioned with some reservations. 

The issue of semantics is tricky, and reliable resource discovery of MathML on
the Web remains questionable as it depends on several factors, including the
query term (will "Newton" match only Newton's Method or also Wayne
Newton?), the search engine's pattern matching language (is it designed to match
element or attribute names, and/or the content they encapsulate?), and so on.
In a closed environment, such as an intranet, the mapping of the domain (of 
search terms) and the range (of search objects) can be a lot more precise 
than on the Web as it is now. This is already being done successfully in some 
pilot initiatives with other markup languages. The situation is also expected to 
improve with metadata for mathematics (specifically, RDF Schema for 
MathML).
2. What does the JavaScript/JScript/VBScript part of DHTML (assuming 
DHTML := HTML + CSS + JavaScript/JScript/VBScript) have to do with 
translation?
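
To make the earlier point about search engine pattern matching concrete, here
is a minimal sketch in Python (standard library only). The sample fragment and
the names SAMPLE, local_name and search are hypothetical, chosen for
illustration; no real search engine is claimed to work this way. It only shows
that matching a term against element names and matching it against element
content are different questions with different answers.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample: a MathML Presentation fragment whose text mentions "Newton".
SAMPLE = f"""<math xmlns="{MATHML_NS}">
  <mtext>Newton iteration: </mtext>
  <msub><mi>x</mi><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></msub>
  <mo>=</mo>
  <msub><mi>x</mi><mi>n</mi></msub>
</math>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def search(xml_text, term):
    """Return (name_hits, content_hits) for a case-insensitive term.

    name_hits: element names containing the term.
    content_hits: (element name, text) pairs whose leading text contains it.
    Attribute names/values and tail text are deliberately ignored here.
    """
    root = ET.fromstring(xml_text)
    term = term.lower()
    name_hits, content_hits = [], []
    for elem in root.iter():
        name = local_name(elem.tag)
        if term in name.lower():
            name_hits.append(name)
        if elem.text and term in elem.text.lower():
            content_hits.append((name, elem.text.strip()))
    return name_hits, content_hits


if __name__ == "__main__":
    for query in ("newton", "msub"):
        print(query, search(SAMPLE, query))
```

A query for "newton" is found only in content, and a query for "msub" only in
element names. Neither result says anything about whether "Newton" refers to
the method or to a person; that still needs metadata or context.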

"No one would want to type a Mathematica file by hand and such is the same 
for MathML. The user does not necessarily need to understand or be aware of
the data format since authoring and rendering environments will drive toward
compliant user interfaces and away from cumbersome techniques as the user
base increases."

1. Users of rendering environments (browsers with the techexplorer plug-in) that
do not involve any form-based interaction for MathML processing do not
have to be concerned with the underlying data format.
2. WYSIWYG oriented authoring environments may (and do):
a. Export MathML fragments that lack an XML declaration and a DOCTYPE
declaration, and so may not work with other environments that expect a
complete, valid XML document.
b. Include presentation semantics (CSS) that may not be acceptable to the
author.
c. Export MathML documents with a specific version of the DTD, if any, that
may be deprecated.
d. Not support all the desired MathML elements and/or attributes. The file will
then need to be edited by hand.
e. In general may (some already do) lack flexibility to control
Authors therefore do need to understand the data format. That is essential for 
ultimate authoring control, among other things.
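
As an illustration of why an author may need to look at the exported data, here
is a minimal sketch in Python (standard library only) of the kind of check one
could run on a WYSIWYG export before handing it to another environment. The
function name inspect_export and the DTD file name "mathml.dtd" are
placeholders assumed for the example, not taken from any particular tool or
from the MathML Specification. The check only reports what is present and
whether the text parses; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def inspect_export(text):
    """Report what a MathML export contains, without validating it."""
    stripped = text.lstrip()
    report = {
        # An XML declaration, if present, must open the document.
        "xml_declaration": stripped.startswith("<?xml"),
        # A crude textual test; good enough for a quick look at an export.
        "doctype": "<!DOCTYPE" in stripped,
        "well_formed": True,
    }
    try:
        ET.fromstring(text)
    except ET.ParseError:
        report["well_formed"] = False
    return report


# Hypothetical exports: a bare fragment, a fuller document, a damaged one.
BARE = "<math><mi>x</mi></math>"
FULL = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE math SYSTEM "mathml.dtd">\n'  # placeholder system identifier
    "<math><mi>x</mi></math>"
)
BROKEN = "<math><mi>x</math>"

if __name__ == "__main__":
    for label, doc in (("bare", BARE), ("full", FULL), ("broken", BROKEN)):
        print(label, inspect_export(doc))
```

An author who gets a report with missing pieces would then have to add them by
hand, which is the point being made: some familiarity with the format is needed.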

"With regards to the verbosity of MathML and more generally XML/SGML, I 
would state that new advanced markup languages will be coming later with 
more features and enhanced data constructs.  But let's just use XML/SGML 
for now since it's a good solution in the short run."

1. Which 'new advanced markup languages will be coming later with more 
features and enhanced data constructs?' How will they circumvent the 
verbosity problem?
2. Why is 'XML/SGML a good solution in the short run?' Why not in the long
run?
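
For a sense of the verbosity in question, here is a minimal sketch in Python
(standard library only) that builds MathML Presentation Markup for "x squared"
and compares its length with the equivalent TeX source. The function name
superscript_mathml is hypothetical, and character count is only a rough proxy
for verbosity; namespace declarations, which a real document would carry, are
left out.

```python
import xml.etree.ElementTree as ET


def superscript_mathml(base, exponent):
    """Build <math><msup><mi>base</mi><mn>exponent</mn></msup></math>."""
    math = ET.Element("math")
    msup = ET.SubElement(math, "msup")
    ET.SubElement(msup, "mi").text = base       # identifier
    ET.SubElement(msup, "mn").text = exponent   # number
    return ET.tostring(math, encoding="unicode")


tex_source = "$x^2$"
mathml_source = superscript_mathml("x", "2")

if __name__ == "__main__":
    print(tex_source, len(tex_source))
    print(mathml_source, len(mathml_source))
```

Any successor language that keeps explicit start and end tags would face the
same ratio, which is why the question of how it would circumvent the problem
matters.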

Pankaj Kamthan

Received on Tuesday, 18 April 2000 18:11:15 UTC