
preliminary response to Hammond



Following is a response to Bill Hammond of the EMJ list.  I'm using
this primarily to elicit reaction from ERB members, and doubt that
such extensive detail is appropriate for a real posting to Bill.  I do
think I'll use this as a starting point to a letter to Bill, though.
Please post reactions if you want or we can discuss it all Monday.

-Ron

AMS is also experiencing some mail difficulties.  I suspect my postings
go out, but I haven't received mail from the erb list today.


=========================================================================

Bill,

I'll make some general remarks on the HTML-Math ERB work, then address
some of your points.  It's unclear to me what has been said to you in
regard to HTML-Math (perhaps someone has forwarded to you some of my very
own statements!).  You're clearly concerned about possible misdirection
of effort toward functionality beyond that of viewing simple
mathematical text, and, in connection with this, possible delays in
producing a specification.  I'll go further into these issues below.



The current goals we've listed for HTML-Math are:

<UL>
<LI>Is suitable for teaching and scientific publishing
<LI>Works with symbolic and numerical math applications
<LI>Supports filters to/from other math formats, e.g. TeX
<LI>Is easy to learn and to edit by hand
<LI>Is well suited to template and other math editing techniques
<LI>Can be rendered to:
    <UL>
    <LI>graphical displays
    <LI>speech synthesisers
    <LI>plain text displays e.g. VT100 emulators
    <LI>print media, including braille
    </UL>
<LI>Supports lengthy expressions via fold/unfold and line breaking
    with author control
</UL>


Following is a brief overview of my sense of the committee's work.
I'll place some distance between my own views and the final
consensus-to-be of the committee, just because consensus statements
can be rather vague and they take a long time to produce.  I think I'm
not far off the mark, and welcome comment and criticism, but don't
want to claim that all committee members have agreed to these
statements.  We're working at a great distance from one another.

Currently our preference is to use something akin to "traditional
notation" for source input.  This view may only be of concern to those
who use a vanilla text editor, but it also suggests that source text
can be much like plain TeX insofar as the latter is also what one
might term traditional notation.  We don't have plans to support TeX
input directly, but we do believe it's crucial to provide display
access to those who wish to continue using TeX.  The TeX community is
large and an important part of the mathematical community.  We believe
that filters mapping "simple" TeX to HTML-Math will be easily written.

The HTML-Math parser will transform the source text to an "expression
tree" based on operator-precedence values and grouping (much as with
TeX's {...} structure).  The expression tree is the fundamental
underlying structure from which all other HTML structures are derived.
One of these structures is a display-list format from which the text
may be viewed.  We have also talked about direct access to
display-list format so that those who wish to avoid the
operator-precedence parsing may do so and still achieve display of
formulas.
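
To make the parsing step concrete, here is a minimal sketch in Python
of an operator-precedence parser that builds an expression tree from
source text, with {...} used for grouping.  The precedence values,
the token set, and every function name are assumptions made up for
illustration; none of them comes from an HTML-Math draft.

```python
import re

# Hypothetical precedence table: a higher number binds tighter.
# These values are illustrative only, not taken from any specification.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOC = {"^"}

# A token is an identifier, a number, an operator, or a grouping brace.
TOKEN = re.compile(r"\s*([A-Za-z]+|\d+|[-+*/^{}])")


def tokenize(source):
    """Split source text into a list of token strings."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_primary(tokens, i):
    """Parse an identifier, a number, or a {...} group starting at index i."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if tok == "{":
        tree, i = parse_expression(tokens, i + 1, 0)
        if i >= len(tokens) or tokens[i] != "}":
            raise ValueError("missing closing brace")
        return tree, i + 1
    if tok in PRECEDENCE or tok == "}":
        raise ValueError("unexpected token: " + tok)
    return tok, i + 1


def parse_expression(tokens, i, min_prec):
    """Precedence climbing: fold in operators that bind at least min_prec."""
    left, i = parse_primary(tokens, i)
    while (i < len(tokens) and tokens[i] in PRECEDENCE
           and PRECEDENCE[tokens[i]] >= min_prec):
        op = tokens[i]
        prec = PRECEDENCE[op]
        # A right-associative operator lets the same precedence recur
        # on its right-hand side; a left-associative one does not.
        next_min = prec if op in RIGHT_ASSOC else prec + 1
        right, i = parse_expression(tokens, i + 1, next_min)
        left = (op, left, right)
    return left, i


def parse(source):
    """Return the expression tree as nested (operator, left, right) tuples."""
    tokens = tokenize(source)
    tree, end = parse_expression(tokens, 0, 0)
    if end != len(tokens):
        raise ValueError("unexpected token: " + tokens[end])
    return tree
```

With this table, parse("a + b * c") groups the product first, and
parse("{a + b} * c") lets the braces override precedence, much as
grouping does in TeX.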

Other "renderings" (e.g. to voice) and mappings (e.g. to the notation
of a computer algebra system) will also be achievable from the
expression tree.
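
As a rough illustration of several renderings coming from one tree,
the Python sketch below walks a tree of nested (operator, left,
right) tuples and produces a spoken form and a fully parenthesized
infix form of the sort a computer algebra system might read.  The
tuple layout, the word table, and the function names are all
assumptions for the example, not part of any proposal.

```python
# Hypothetical spoken words for each operator (illustrative only).
SPOKEN = {"+": "plus", "-": "minus", "*": "times",
          "/": "over", "^": "to the power"}


def to_speech(tree):
    """Render the tree as words for a speech synthesiser.

    This naive version drops grouping, so the spoken form can be
    ambiguous; a real renderer would need pauses or spoken brackets.
    """
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"{to_speech(left)} {SPOKEN[op]} {to_speech(right)}"


def to_infix(tree):
    """Render the tree as fully parenthesized infix text."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({to_infix(left)} {op} {to_infix(right)})"


tree = ("+", "a", ("*", "b", "c"))
spoken = to_speech(tree)
infix = to_infix(tree)
```

Both functions read the same tree; only the walk differs, which is
the sense in which the tree is the common source for each target.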

Generally, our approach is to cater to the traditional
notational base for written mathematics, but also to enable further
uses of the notation by (a) allowing for some more detail in
"expression" and (b) mapping of the notation after the expression has
been parsed.  We don't have in mind new requirements as to the degree
to which "structure" *must* be encoded, but only that some provision
be made for those who do wish to use traditional notation in a way
which may be connected to other machine-readable notations.  We see
our efforts as consistent with those of products such as Scientific
Workplace, techexplorer, and the objectives of OpenMath.

There's much yet to do, but we are still targeting the end of this year
to send a first draft to the W3C.

--------------------------------------------------------------------

Response to your points:

> I would like to see the reviews on MathSciNet available in HTML-Math
> without the need for other-than-standard browser processing.

We're concerned that math at this level also be easily viewable.  I
don't intend to speak for all AMS members, but my feeling is that the
AMS constituency agrees with this, and AMS staff representatives do
voice this point of view in our meetings.


> Markup isomorphic to a narrow dialect of pageless LaTeX is about
> right for that.  If the ERB agrees with this point of view, then
> part of the ERB's task might be to set that dialect at the same time
> that the details of HTML-Math are being set.

I don't know exactly what pageless LaTeX is, but I can imagine.  This
does leave me uncertain as to how much math capability you include in
the notion.  There's an amsmath package available with LaTeX2e.  Might
pageless LaTeX include this?  I gather *not* from your succeeding
remarks (since you appear to want to `keep it simple'), but I'm
uncertain as to the level of macro extensibility you're allowing.
None?

I believe, as I mentioned above, that the HTML-Math ERB is concerned
that simple notation be directly enterable, or that other "isomorphic"
forms be filterable to the standard.  Also, as mentioned above, we aren't
actually intending to specify an "HTML-TeX" acceptable set of
control-sequences ourselves.


> I think that the March '95 draft for HTML-3.0 was pointed in the right
> direction.

I think our current approach is a good, well-thought-out extension of
the capabilities of the draft you mention.


> I think that work on "structure" should continue, but it will easily
> go beyond what is reasonable for the mimetype "text/html", which is
> the basic web document.  So I hope that the incorporation of
> HTML-math into HTML is not held up by work on additional structure
> intended for processing by tools other than web browsers.

"Structure" is open-ended.  We have in mind something akin to
"notational structure" as opposed to "semantical structure", although
I can't claim a very clean boundary between the two.  I think we "see"
3 somewhat coherent levels of "structure": display, notational, and
semantical.  The first two are closer to one another than the third.


> We need to have exposure at the basic level of the web for the sake
> of the public relations of the entire mathematical and scientific
> community.  To have wide exposure one needs not only processing done
> by standard tools but also *fast* processing.  User impatience is an
> impediment to exposure.

I agree.  There's enough disagreement in how to treat mathematical
notation that I can also see it being developed in a quick and dirty
way.  We're trying to do something which achieves a useable end and is
consistent with other tools of those who employ mathematical notation.
I'm open to rational discussion.

We feel we've found a good point of attack in "notational structure".


> Remember that there are limits to the level of "structure" in HTML.
> For example there is no sentence container tag.  More rigorous
> content-based markup would provide for that.

I take it you're alluding to some SGML-like "structure".  It's true
that mathematics can be an infinite hole of semantical structure.
We're heading for notational structure.


> The needs of non-visual users are important because many of the most
> important users will be non-visual information-gathering robots.  I
> think that some of the criticism of LaTeX-level markup in this
> regard fails to take due notice of math-mode.  If the markup is not
> too complicated, one might hope that browsing tools will enable
> users to write search-strings in that language.

Are you speaking of search strings for mathematical notation?  I
suspect that if mathematics is to be searched, one will either want
semantical cues (so that commonly understood domains and functions are
named conventionally), or one will want a notation understood by some
engine (such as a CAS) which can "reduce" or "normalize" expressions.
We think the latter is achievable with the simple, "traditional"
notation we will propose.
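
A toy example of what "normalize" could mean for searching: the
Python sketch below puts the operands of commutative operators into
a fixed order, so that "a + b" and "b + a" compare equal.  It assumes
expressions are held as nested (operator, left, right) tuples and
that only + and * commute; both are assumptions for illustration, and
a real CAS does far more than this.

```python
# Operators assumed to be commutative for this sketch.
COMMUTATIVE = {"+", "*"}


def normalize(tree):
    """Return the tree with commutative operands in a canonical order."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    left, right = normalize(left), normalize(right)
    # Order operands by their printed form so equivalent trees match.
    if op in COMMUTATIVE and repr(right) < repr(left):
        left, right = right, left
    return (op, left, right)


def same_expression(first, second):
    """Compare two trees after normalization."""
    return normalize(first) == normalize(second)
```

A search tool could then index the normalized form and normalize each
query the same way before matching.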


> Published mathematics is never without ambiguity except for the
> context supplied by its audience.  Symbolic manipulation programs
> can supply context, but it is not easy.

True many times over, and I've even been known to say this myself.  I
prefer to encourage "late binding" of semantics and have it done by
those who know their targets.


> My suspicion is that the goal of having markup that can be fed into
> symbolic manipulation programs, if it is to accommodate all such
> programs, is very much more complicated than most imagine.
> Regardless, such markup is not appropriate for incorporation in
> documents under the web's basic mimetype "text/html" since those
> documents are to be processed only by web browsers.

It's hard to say who is doing the imagining here, but I do think our
committee members take the complications involved very seriously.
I'll admit to being a fuddy-duddy myself, and I'm not actually a CAS
user.  On the other pole, I think we're all aware that the realm of
computable things is very large, that there's a lot of work which can
be done, and that people will actually keep widening the world of what
is computed.  And with this in mind, it's also clear that
computational semantics develops only slowly.  Operations and
abstractions need implementation.

One fiction of some SGML proponents is that mathematicians should
"just write down what they mean".  I'm sure we're all aware that life
is much more complicated than this.  We do need a means of presenting
notation for interpretation to a reader or listener, but we can also
look for ways to enable more machine-readable connections.


-Ron