Re: SGML version?

Richard Obeirne <richard.obeirne@blacksci.co.uk> wrote on
Mon, 22 Jun 1998 14:34:01 +0000:

> Anyway, after all this, I'm still left with one problem, although it's 
> probably not for this newsgroup. I'd like to call in the MathML DTD into an 
> existing SGML DTD (for science journal articles) so that we can tag maths 
> using a standard and not a proprietary ad-hoc tagging scheme, but as it uses 
> a different declaration it causes a conflict with the parent DTD.

MathML is rather granular.  One might say that it is near the bottom
of the markup chain.  My guess is that the SGML language for science
journal articles is near the top of the markup chain (suitable for
human authoring and with many options for entirely automatic
conversion to various presentation formats).

Whether or not the particular SGML language for science journal
articles is near the top of the markup chain, we DO want to have
markup languages near the top of the markup chain that include math.
For now it seems reasonable to look for such markup under the SGML
umbrella, although it should be possible for it to have a LaTeX-like
look and feel.  (Most legacy LaTeX documents will not be
auto-convertible into such a markup.  On the other hand, between the
two editions of Leslie Lamport's book on LaTeX, there was a migration
of recommended use toward the top of the markup chain.)

An example (that does not, however, claim to be part of a working
system) is the LaTeX document (for the custom documentclass "tdsguide")
"ftp://ctan.tug.org/tex-archive/draft-standard/tds-0.9995/tds.tex".

This document can be auto-converted to a Texinfo document using the
author's ad hoc "tds2texi.el" in the same directory.  (Certainly,
Texinfo documents are near the top of the markup chain.)  Easily
available presentation formats for the document include (1) DVI,
(2) HTML, and (3) Info.

Technical:  MathML defines thousands of character entities (HTML 4
defines a set as well, though only a few hundred).  With an SGML
markup near the top of the markup chain, by contrast, one probably
wants to deal with non-ASCII characters at a more abstract level so
that one is not bound to one particular character scheme for all
presentation formats.  That is, instead of the
character entity "&gamma;", one probably wants to use an empty element
"<gamma/>".  If this is the case for a "near the top" markup under
SGML, then a DTD for that language will be happy with almost any
standard SGML declaration.
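
To make the "more abstract level" concrete, here is a minimal sketch in
Python (not part of any working system).  The table SYMBOLS, the
function render(), and the three target names are all hypothetical;
the only thing taken from the discussion above is that an empty element
such as "<gamma/>" stands for the character and each presentation
format picks its own representation.

```python
import re

# Hypothetical table: one abstract symbol, one rendering per target.
# The LaTeX form assumes the symbol appears in math mode.
SYMBOLS = {
    "gamma": {"html": "&gamma;", "latex": r"\gamma", "text": "\u03b3"},
}

# Matches empty elements written as <name/> with a lowercase name.
_EMPTY_ELEMENT = re.compile(r"<([a-z][a-z0-9]*)/>")


def render(source, target):
    """Replace known symbol elements with their form for `target`.

    Elements that are not in SYMBOLS are left untouched.
    """
    def replace(match):
        forms = SYMBOLS.get(match.group(1))
        return forms[target] if forms else match.group(0)

    return _EMPTY_ELEMENT.sub(replace, source)


for target in ("html", "latex", "text"):
    print(target, render("angle <gamma/> of <triangle/>", target))
```

The source document never mentions a character scheme; only the
per-target table does.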

(Technical)^2:  With SGML processing toward a presentation target
these empty elements would probably become character entities.
A disadvantage to this approach is that one may want monocase element
names; hence, "&Gamma;" becomes "<ucgamma/>".  (Comment, folks?)
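
A rough sketch of the monocase naming in Python follows.  The rule used
here (a capitalized entity name gets the prefix "uc" and is then
lowercased) is only an assumption generalized from the single
"&Gamma;" / "<ucgamma/>" example; element_for_entity() is a
hypothetical helper, and mixed-case entity names would need more care
than this.

```python
def element_for_entity(entity_name):
    """Return a monocase empty element for a character entity name.

    Assumed convention: names starting with a capital letter get the
    prefix "uc"; everything is then lowercased.
    """
    prefix = "uc" if entity_name[:1].isupper() else ""
    return "<" + prefix + entity_name.lower() + "/>"


for name in ("gamma", "Gamma"):
    print("&" + name + ";", "->", element_for_entity(name))
```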

                                   -- Bill Hammond

Received on Saturday, 27 June 1998 15:01:37 UTC