- From: William F Hammond <hammond@csc.albany.edu>
- Date: Wed, 17 Sep 2003 07:53:29 -0400
- To: W3C MathML Discussion <www-math@w3.org>
Brian Osserman <osserman@math.mit.edu> writes in reply to
Robert Miner <RobertM@dessci.com>:

> >Many projects involving MathML involve converting legacy data into XML
> >format, and one of the most important legacy formats in this context
> >is TeX.
>
> Does the phrase 'legacy data' here imply that you expect that MathML will
> eventually replace tex as a primary data format?  If so, how do you envision
> this happening, given MathML unsuitability for direct authoring?

In my observation almost all math article authoring is presently done
with TeX, and the largest portion of that is, in fact, LaTeX.  To see
this, download some article sources from arXiv (http://www.arxiv.org/).
(Somebody at arXiv must, in fact, have actual statistics, but I don't.)

Robert Miner replies:

> As to direct authoring, I presume you mean writing code with a text
> editor.  While most folks working with MathML have been primarily
> interested in graphical authoring, it is a simple matter to define a
> terse language and compile it into MathML.  After all, that is the
> model of TeX itself, compiling a various macro languages into DVI.
> It's merely a shame that TeX syntax is not normally regular enough to
> be particularly well suited to going to XML + MathML, as witnessed by
> the weakness of current TeX -> XML + MathML converters.

Yes, although an author who develops a consistent, well-structured way
of using LaTeX can expect fairly good results.  Trying to characterize
rigorously and definitively what counts as well-structured LaTeX is,
I think, a blind alley.

> But it really is a triviality to come up with a language as terse as
> TeX that maps directly and unambiguously to some XML + MathML doc
> type.  For example, just changing <foo>...</foo> to \foo{...} and
> adding some default tokenization rules (that can be easily overridden)
> makes authoring MathML comparable to authoring TeX.
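The \foo{...} <-> <foo>...</foo> correspondence quoted above can be sketched in a few lines of Python. This is only an illustration of the idea, not Robert's syntax or anything from GELLMU: the function name terse_to_xml is hypothetical, the name pattern is an assumed simplification, and the "default tokenization rules" he mentions are not attempted (every token must be wrapped explicitly).

```python
import re
from xml.sax.saxutils import escape

# Assumed simplification: element names are ASCII letters followed by letters/digits.
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def terse_to_xml(src: str) -> str:
    """Rewrite every \\name{...} group, however deeply nested, as <name>...</name>."""
    out, _ = _parse(src, 0, top=True)
    return out


def _parse(src: str, pos: int, top: bool):
    parts = []
    while pos < len(src):
        ch = src[pos]
        if ch == "}":
            if top:
                raise ValueError(f"unmatched '}}' at offset {pos}")
            # End of the group we were called for; the caller skips the brace.
            return "".join(parts), pos
        if ch == "\\":
            m = _NAME.match(src, pos + 1)
            if m and m.end() < len(src) and src[m.end()] == "{":
                name = m.group(0)
                inner, close = _parse(src, m.end() + 1, top=False)
                parts.append(f"<{name}>{inner}</{name}>")
                pos = close + 1
                continue
        # Ordinary character: copy it through, escaping &, < and > for XML.
        parts.append(escape(ch))
        pos += 1
    if not top:
        raise ValueError("missing closing '}'")
    return "".join(parts), pos


if __name__ == "__main__":
    print(terse_to_xml(r"\mfrac{\mi{a}\mn{2}}"))
```

With this input the sketch emits <mfrac><mi>a</mi><mn>2</mn></mfrac>. Attributes, namespaces, and the tokenization defaults would all need further design, and that is where a real terse language would have to make its choices.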

Someone used to LaTeX would want more than just a language for math
"islands" in a larger document.  For a smooth interface with the
existing practice of authors, one wants something reasonably like
LaTeX with math that is reasonably like math in LaTeX.

For new documents this is what the GELLMU project,
http://www.albany.edu/~hammond/gellmu/, is about.  It seeks to provide
an XML document type that is as close to LaTeX as reasonably possible,
given the goal of having, in addition to standard LaTeX translation, a
well-defined bullet-proof translation to (1) ordinary HTML using
stripped TeX-like notation for math such as one might see in email and
(2) XHTML + content MathML for perusal in new browsers.

As explained in my 2001 TUGboat article, which appeared early this
year, an author may use LaTeX-like notation, including \newcommand
definitions, to generate articles in the XML document type.

At this point everything is in place except (2).  Toward (2) I have a
variant of the HTML 4.01 translator (written in Perl for SGMLS.pm)
that writes XHTML 1.1 with UTF-8 encoding.  I estimate that (2) will
take me between 20 and 40 weeks, in blocks of 5 or more weeks of
undivided attention.  Along the way it may be necessary to provide a
few more attributes for math elements in order to control non-default
cases in translation to MathML.  Some such attributes are already in
place.

Unfortunately, the next window of time I can presently foresee does
not open until May 20, 2004.  Meanwhile, if someone else is interested
in undertaking a translation from the XML form of the GELLMU article
document type to XHTML + MathML, I'll try to assemble an up-to-date
tarball.

Writing a translation to XHTML + presentation MathML would be easier
than writing one to content MathML.  If browser fodder is all one
wants, that would be the way to go.

                                    -- Bill

Received on Wednesday, 17 September 2003 07:53:31 UTC