Re: semantic markup for math

Ka-Ping Yee (kpyee@aw.sgi.com)
Wed, 24 Jul 1996 14:51:58 +0900


Message-Id: <31F5B9FE.2C67@aw.sgi.com>
Date: Wed, 24 Jul 1996 14:51:58 +0900
From: Ka-Ping Yee <kpyee@aw.sgi.com>
To: Thomas Breuel <tmb@best.com>
Cc: www-html@w3.org
Subject: Re: semantic markup for math

Thomas Breuel wrote:
> 
> That's the point: using semantic markup for math requires the semantics
> of formulas to be defined for each and every field of mathematics that
> wants to publish on the web.  You seem to think that that is easy.
> I think it's exceptionally difficult.

I didn't say it was trivial; i merely think that it is reasonable to be
able to put something like that together for most general use, and that
it is reasonable to expect people to be able to gather and make up what
they need and what is useful.  The only premise i have taken is that,
if you are the one communicating, you know best what it is you want to
say, and you will be capable of pointing that out.
 
> The argument that MINSE is extensible doesn't help: if "semantic markup"
> for math has any utility at all, it requires that people agree on
> the semantic markup, not make it up on the fly.

These are two separate issues.  Yes, of course it is better for the
concepts common to lots of people working in a certain field to be
given agreed-upon bindings; but the extensibility doesn't *prevent*
that.  It allows this to happen.

(The on-the-fly case is still sometimes useful when an author is 
making up an ad-hoc notation for individual uses.)

> By adding semantic markup, you are adding a completely new set of
> requirements and burdens to authors.

Possibly some.  Yes, it is not identical to TeX.  But i think what
you're missing is that with MINSE people get a *choice*.  You get
to decide how much meaning you want to convey.  I am trying to
encourage people to raise the bar a little bit by starting off
with the kind of context definition i have now, but it doesn't
mean that you don't have the option to go with something that has
more meaning or less meaning.  You wrote:

> I just don't see how you can say that.  LaTeX formulas certainly
> contain enough information for "deployment on the web".

MINSE doesn't prevent you from presentation-based expression,
if that's good enough for your purposes -- if, for instance,
the renderer covered all the TeX rendering schemas (and i think
it's not particularly far off as it stands), you could make a 
context that was practically isomorphic to TeX, if you wanted.

But a presentation-based language gives you *no choice*.  You
have no way to put more meaning in your document.

The huge advantage to using an appropriate context definition
is confidence.  If you write something in TeX, you never know
for sure that it is going to convert to the right thing in a
symbolic math package, or display correctly in a textual browser,
or sound right over the phone or to a blind person.  But if you
use MINSE, all you have to do is choose a context for which a
matching style exists for the target medium.  Then you *know*
it will come out correctly.
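
As a rough illustration of that idea (a Python sketch, not MINSE
syntax; the tuple tree format and the names render, text_style and
speech_style are made up for this example), one semantic tree can be
handed to whichever style matches the target medium:

```python
# Hypothetical sketch: a semantic tree is a (compound, args...) tuple,
# and a style maps each compound name to a rendering function.
# None of these names come from MINSE itself.

def render(tree, style):
    """Render a semantic tree bottom-up using the given style."""
    if isinstance(tree, str):          # a leaf: a variable or a number
        return tree
    name, *args = tree
    rendered = [render(arg, style) for arg in args]
    return style[name](*rendered)

# Two styles for the same small context ("sum" and "power").
# Operator precedence and parenthesization are ignored to keep this short.
text_style = {
    "sum":   lambda a, b: f"{a} + {b}",
    "power": lambda base, exp: f"{base}^{exp}",
}
speech_style = {
    "sum":   lambda a, b: f"{a} plus {b}",
    "power": lambda base, exp: f"{base} to the power {exp}",
}

tree = ("sum", ("power", "x", "2"), "1")
print(render(tree, text_style))    # x^2 + 1
print(render(tree, speech_style))  # x to the power 2 plus 1
```

The author writes the tree once; only the style changes per medium.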

Say you choose a context, and the context has a style definition
for Maple.  Then everything defined in the context will work in
Maple.  If you add something that could be a problem for Maple,
you'll know, because it isn't part of the context; or you can
add your own style rendering for it if you know what you want,
and then you're okay.  Not so for a typesetting-based language:
the conversion is all *guessing* in that case.
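
The check described above could look something like this. It is a
Python sketch under assumed names (missing_renderings, maple_style),
not the actual MINSE code, and the output strings are only Maple-like
placeholders:

```python
# Hypothetical sketch: a context is the set of compound names it
# defines, and a style maps compound names to rendering functions.

def missing_renderings(context, style):
    """Return the compounds in the context that the style cannot render."""
    return sorted(set(context) - set(style))

context = {"sum", "product", "power"}
maple_style = {
    "sum":     lambda a, b: f"({a} + {b})",
    "product": lambda a, b: f"({a} * {b})",
    "power":   lambda a, b: f"({a} ^ {b})",
}

# The style covers the whole context, so everything in it will render.
print(missing_renderings(context, maple_style))    # []

# Adding a compound from outside the context is flagged immediately.
extended = context | {"cross"}
print(missing_renderings(extended, maple_style))   # ['cross']

# The author supplies a rendering (placeholder output, not checked
# against real Maple syntax), and the extended context is covered again.
maple_style["cross"] = lambda a, b: f"cross({a}, {b})"
print(missing_renderings(extended, maple_style))   # []
```

The point is that the gap is detected from the definitions alone,
before anything is sent to the target program.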

> By "can't" I'm referring to the fact that the semantic primitives for
> the user's mathematical specialty are missing, not that the
> user is too stupid to figure out how to.

Then they can get together and define some.  Yes, this is a task.
But the problem is no different from -- in fact, i think it will
be *easier* than -- trying to figure out how to convert the TeX
that represents their notation into something which can be
understood by (e.g.) Mathematica.

And if they didn't want any semantic understanding?  They could
choose a context that wasn't limited by the restriction that it
has to have a rendering to (e.g.) Mathematica.

Moreover, the flexibility of a typesetting notation language
is limited by the size of the libraries included with it.
You are stuck with a fixed set of internal symbols, constructs,
and macros; if you try to add external things, you have to
trust that so-and-so's particular version of the typesetter
also has them.  And so you end up trying to distribute something
that is quite large, and you are still forced to draw a line at
what gets in and what stays out.

> That's an interesting example, for several reasons.  First of all, it
> isn't really semantic markup, but a strange mix between semantic markup
> and layout.

What makes you say that?

> For example, you couldn't automatically tell from the
> notation which variables are scalars and which ones are vectors,

I think there is sufficient information to tell.

It depends on whether you know what the notation means.  The point of
the context definition is to tell you what the compounds mean, and that
includes which compounds imply things about their arguments (e.g. function
application implies that its left argument is a function, dot product
implies that its arguments are vectors).
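
A small Python sketch of how such implications could be read off
mechanically. The CONTEXT table, the kind names and infer_kinds are
assumptions for illustration; a real context definition would carry
this information in its own format:

```python
# Hypothetical sketch: each compound declares the kind it expects for
# each of its arguments. Names and kinds here are illustrative only.
CONTEXT = {
    "apply": ("function", "any"),   # left argument must be a function
    "dot":   ("vector", "vector"),  # both arguments must be vectors
    "sum":   ("any", "any"),
}

def infer_kinds(tree, kinds=None):
    """Walk a (compound, args...) tree and record what each compound
    implies about the variables appearing directly as its arguments."""
    if kinds is None:
        kinds = {}
    if isinstance(tree, str):
        return kinds
    name, *args = tree
    for expected, arg in zip(CONTEXT[name], args):
        if isinstance(arg, str):
            if expected != "any":
                kinds[arg] = expected
        else:
            infer_kinds(arg, kinds)
    return kinds

tree = ("sum", ("apply", "f", "x"), ("dot", "u", "v"))
print(infer_kinds(tree))
# {'f': 'function', 'u': 'vector', 'v': 'vector'}
```

Nothing is recorded for x, because function application as declared
here says nothing about the kind of its right argument.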

> you couldn't automatically translate the integrals into some other
> common notations,

Why?

> I can only say
> that your semantic markup isn't very semantic after all but a kind of
> variant of LaTeX notation that uses different quote characters and
> function call notation instead of infix notation in some places.

Uh-uh.  LaTeX doesn't come with definitions of what each macro *means*.
That's the whole point here.

If the context definition is bad, we can work together to make a good
one.  It's the *concept* of applying a context-style pair to a semantic
tree that i am proposing here, and i think it is sound.

> defining and deploying semantic markup would be a huge
> undertaking, and I fear going down that path would put standardization
> of any kind of mathematical markup for web documents on indefinite hold.

I think i've made a reasonable start already.  The system i have,
though admittedly not "all together" in the sense that all the
applications and conversions have been tried, *works*.  Of course
progress is slower than it could be -- no one is helping me.  I
can't claim to be able to define contexts for all of math; that
is for other mathematicians.

But i don't call that putting anything on hold!  It can render
equations now, for multiple browsers, which is more than anything on
the Web has been able to do in all the years the WWW has been around.
(Except for WebEQ, and that's if you are willing to wait for
about a minute for each formula, and write about five times
as much source, and force Java on your viewers, and if you want
zero extensibility...)


Ping