- From: William F. Hammond <hammond@csc.albany.edu>
- Date: Mon, 17 Aug 1998 12:34:58 -0400 (EDT)
- To: www-math@w3.org
- Cc: emj@cnsibm.albany.edu
This is close to the draft of an article. I hope that I have purged
all of the typographical errors.
I now notice that I overlooked W3C's "Arena". There are probably
other significant things in the history of math on the web since the
dawn of external applications that I should mention. If so, please
write me.
It is a comment to the list "www-math" with a copy to the list
"emj", and there are a few blind copies.
------------------------------------------------------------------------------
Math on the Web
By William F. Hammond
[1]Michael Hamm <msh210@nyu.edu> writes to www-math@w3.org:
Do any browsers (esp. any versions of Mozilla or MSIE) read the
HTML3 MATH tag and the tags that go in it? Which? Thanks.
In a single word, the answer is no.
HTML 3.0 was a [2]1994 W3C draft that never got beyond draft stage and
was quickly superseded by [3]HTML 3.2 and, later, [4]HTML 4, which
contain no provision for mathematics. (Well, one may use "<applet>"
or, better, "<object>"; but that does not really give mathematics
fully reasonable access to the web.)
Subsequent to the demise of math-in-html, W3C formed an HTML Math
Working Group whose work led to the creation of MathML, which is now a
[5]W3C recommendation with principal rendering implementations
available currently through (1) [6]WebEq applets under mass market
browsers, (2) the W3C testbed browser (and point-and-click authoring
tool) [7]Amaya, and maybe (I am not up to date) [8]IBM's TechExplorer.
I believe that the source code for Amaya is available for those who
wish to amend it. (For that matter I believe that all of the relevant
source code of [9]Mozilla, the public version of [10]Netscape, is
available now, too. I believe that WebEq and TechExplorer are
proprietary with temporary free trials.)
While I understand and accept the reason for the exclusion of the HTML
3.0 math tags from HTML, we have been left with a situation that still
presents a serious barrier to the efficient flow of (unstyled)
content-level mathematical information through the web to robots,
small-screen displays, audio streams, and Braille streams.
For mathematics on the web, there is a sense in which one can say that
there has been very little progress in the last 5 years since it
became possible to have network browsing tools, both under "http" and
"gopher", quickly spawn external applications based on ``mimetype''.
It is unclear how much improvement will arise as things evolve from
the dawn of MathML. My guess is that MathML will serve the needs of
the mathematical, scientific, and engineering communities, while still
permitting the loss of much of what we understand as ``content'' from
many resources on the web when that ``content'' is mathematical in
nature. Of course, provision for these considerations exists in
MathML. The question is how much attention will be paid to it, given
that it is more expensive to handle.
For example, I think that it could very well be at least 10
years before mathematical content can be searched through major web
indexing and cataloging sites in any remotely robust way, while a
great deal more would be possible more cheaply if a few additional
arrangements were made for dealing crudely but faithfully with
mathematical content in basic HTML.
The arrival of the ``bazaar'' model of development in the [11]Mozilla
Project gives one hope that this will happen.
The early long-term plan, as I have understood it, of the MathML group
was to rely on the implementation in mass market browsers of the type
of client-side processing that is associated with [12]eXtensible
Markup Language (XML), and, in particular, a type of XML that might be
called ``HTML extended by MathML (presentation tags)''.
The idea of XML is to make up your own HTML. The author or publishing
house makes up a set of tags. Then he, she, or they work very hard to
create ``rendering information'' about these tags in a ``style sheet''
language. A web-served XML document contains a reference to the
corresponding style sheet, which is also available, under a style
sheet mimetype, on the web. Browsers are supposed to be able quickly
to digest the style sheet information and then quickly render the XML
document. (The style sheet information may already be cached.) This is
the XML dream.
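
As a rough illustration of the lookup step just described, here is a
small Python sketch that pulls the style sheet reference out of an XML
document. Nothing above says how the reference is written; the
"xml-stylesheet" processing instruction, the made-up tag names, and the
example.org URL are assumptions chosen only for illustration.

```python
import re
from xml.dom import minidom

# Hypothetical document: the tag names, the style sheet URL, and the use of
# an "xml-stylesheet" processing instruction are assumptions for illustration.
SAMPLE = """<?xml version="1.0"?>
<?xml-stylesheet href="http://example.org/article.css" type="text/css"?>
<article><claim>For all x, x + 0 = x.</claim></article>
"""


def stylesheet_refs(xml_text):
    """Return (href, type) pairs for each style sheet reference in the prolog."""
    doc = minidom.parseString(xml_text)
    refs = []
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The instruction's data looks like attributes but is plain text,
            # so it is picked apart with a simple pattern.
            pseudo = dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data))
            refs.append((pseudo.get("href"), pseudo.get("type")))
    return refs


if __name__ == "__main__":
    print(stylesheet_refs(SAMPLE))
```

A browser would then fetch (or find in its cache) whatever the "href"
names before rendering the made-up tags; a document with no such
reference yields an empty list.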
The first rendering efforts with MathML were applet-based and, I
believe, early MathML planning envisioned the creation of a mimetype
for ``HTML extended by MathML'' and the creation of an independent
rendering application (whether plugin or external) with specific
knowledge of this markup language. W3C's Amaya appears to have ``HTML
extended by MathML'' as its default language. (I don't know the
details of Amaya.)
The "<object>" tag approach to MathML probably is more sensible for
the long run than ``HTML extended by MathML'' if only because MathML
is so much more granular than HTML. If I think about type-setting
MathML, I tend to perceive that task as not any easier than that of
local direct setting of [13]Geoffrey Tobin's DTL (printable ascii
equivalent of DVI). The point here is that setting MathML is probably
too much to ask of native rendering by mass market browsers though it
is certainly in scale for plugins and external apps.
There is still an open question in the eyes of some, on which I am
neutral: whether there is, or will be, a widely used style sheet
language that is rich enough to provide the desired level of rendering
of MathML presentation tags.
We need all of the good relevant plugins and external apps that the
community has the energy to provide. Still, because these make more
demands on the client side (than do ordinary browsers) -- demands that
are not reasonable in some places and situations that are and will
continue to be important -- we need to have a way to handle math on
the web in formats that are very different from paper or "windowing"
terminal displays without loss of ``content''. This is possible and
really not that difficult.
Even if one wishes to set aside the need for audio, Braille, indexing,
and searching streams, envision, for example, going as a visitor to
look up something on the web in the San Francisco public library. All
of the windowing stations are tied up. But you find simple terminal
(vt100) access to the network via the browser "lynx" at a station that
is available. It may be that the savvy library administrator has that
station there because he knows that it will give you a way to avoid
waiting. (In fact, if its processor is fast, that is almost certainly
true.)
In ``windowing'' situations it is not too much to ask for the
``mathematical typewriter emulation'' (MTE) standard in mass market
browser native rendering as part of native HTML. MTE is just emulation
of the mathematical typewriter prevalent in all mathematics
departments during the period 1960-1980. One had lots of symbols (in a
fixed font), one could underline, one could move the paper for crude
cursor positioning, one could make something bold by re-striking
after a slight horizontal displacement. It was crude, but it preserved
content. Photocopy images of MTE documents were widely circulated as
informal publications.
MTE is more ``in scale'' with ordinary HTML than is MathML, which is
much closer to fussy typesetting.
All that needs to be added to basic HTML is:
1. the horde of character entities that we need (in scalable fonts
with algorithmic styling for bold, emphasis, and perhaps also
several forms of alternate-emphasis). Algorithmic styling is
desirable for efficiency even though it is less beautiful than
separate fonts; but, for that matter, rendered HTML is already
less beautiful than TeX rendered by "xdvi".
2. a simple element "<lg> ... </lg>" (logical group) with attributes
for horizontal and/or vertical cursor motion, described by a
numerical multiplier relative to the size of the current font,
prior to the display of the contents of the element and also with
attributes for horizontal or vertical stretching, again described
by a numerical multiplier relative to the size of the current
font. Client rendering support for stretching should be optional.
Client rendering support for positioning should be mandatory in
windowed displays and where that is not appropriate the protocol
should be to replace the opentag "<lg>" by the ascii character "{"
and the closetag "</lg>" by the ``balancing'' character "}". (An
attribute of the "lg" tag could be used to change the crude
rendering strings "{" and "}" to other ordinary string values
including empty ones. Attributes could also be used to furnish
hints to computer-algebra systems or to furnish the identity of a
MathML tag from which the current "lg" was fabricated. So MathML
could be reconstructed. Of course, all of this would be authored
in generalized LaTeX. :-))
3. elements "<math>" (paragraph level) and "<displaymath>" (block
level) in which
+ the new "lg" tag is permitted.
+ all character level things are rendered one at a time with
inter-word spacing except for the case of strictly
alphanumeric character level things inside "lg" tags
containing no whitespace, which will be assumed to be symbols
that might be given "\mbox" treatment in LaTeX.
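
The non-windowed fallback in item 2 can be sketched in a few lines of
Python. This is only a sketch under assumptions: the attribute names
"open", "close", and "dy" are hypothetical (the proposal says only that
some attribute could override "{" and "}" and that there would be
attributes for cursor motion), and every tag other than "lg" is simply
dropped.

```python
from html.parser import HTMLParser


class LgFallback(HTMLParser):
    """Flatten markup to plain text, turning <lg> ... </lg> into { ... }."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.closers = []  # closing strings for the currently open <lg> tags

    def handle_starttag(self, tag, attrs):
        if tag != "lg":
            return  # other tags are ignored in this sketch
        a = dict(attrs)
        # "open" and "close" are hypothetical attribute names that override
        # the default crude rendering strings; empty values are allowed.
        opener = a["open"] if "open" in a else "{"
        closer = a["close"] if "close" in a else "}"
        self.out.append(opener or "")
        self.closers.append(closer or "")

    def handle_endtag(self, tag):
        if tag == "lg" and self.closers:
            self.out.append(self.closers.pop())

    def handle_data(self, data):
        self.out.append(data)


def crude_render(markup):
    """Return the plain-terminal rendering of a fragment that uses <lg>."""
    parser = LgFallback()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    # "dy" stands in for a vertical cursor-motion attribute (hypothetical).
    print(crude_render('x<lg dy="0.5">2</lg> + y<lg open="_(" close=")">i</lg>'))
```

The positioning attributes are discarded on purpose: on a terminal such
as one running "lynx" only the grouping survives, which is the part that
carries the content.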
My understanding is that eventually the horde of characters and cursor
movement will be possible with "w3-mode" in [14]Gnu-Emacs under a
windowing display. (I do not know about algorithmic styling.)
Inasmuch as there are very few "vt100" terminals extant that are not
running in displays under local platform windowing systems, it is
reasonable that the scientific and text-processing communities join in
an effort to promote a broader collection of characters, cursor
positioning, and algorithmic styling in enhanced "vt100" terminals.
_________________________________________________________________
This document was marked up in [15]GELLMU
_________________________________________________________________
[16]AUTHOR | [17]COMMENT -- Auto-flowed to HTML: Mon Aug 17
11:03:07 EDT 1998
References
1. http://pages.nyu.edu/%7Emsh210/
2. http://www.w3.org/MarkUp/html3/CoverPage.html
3. http://www.w3.org/TR/REC-html32.html
4. http://www.w3.org/TR/REC-html40/
5. http://www.w3.org/Math/
6. http://www.webeq.com/
7. http://www.w3.org/Amaya/
8. http://www.alphaworks.ibm.com/formula/techexplorer
9. http://www.mozilla.org/
10. http://www.netscape.com/
11. http://www.mozilla.org/
12. http://www.w3.org/XML/
13. http://www.ee.latrobe.edu.au/%7Egt/tex-soft.html
14. http://www.gnu.org/
15. http://math.albany.edu:8000/math/pers/hammond/igl.html
16. http://math.albany.edu:8000/math/pers/hammond
17. mailto:hammond@math.albany.edu
------------------------------------------------------------------------------
I would be grateful for corrections and comments.
This text form was auto-flowed from HTML using "lynx -dump".
Other forms of the draft document are available at the URL
http://www.albany.edu/~hammond/gellmu/webm.html .
-- Bill Hammond [17]
Received on Monday, 17 August 1998 12:34:40 UTC