comments solicited on "semantics"

From: Ron Whitney <RFW@math.ams.org>
Date: Sun, 14 Apr 1996 13:59:25 -0400 (EDT)
To: w3c-math-erb@w3.org
Message-Id: <829504765.932602.RFW@MATH.AMS.ORG>

Patrick and I have had some discussions on various aspects of
html-math, and a point on which I think there may be some merit in
general discussion is that of "semantics", as one may understand the
term in relation to mathematical notation.  A full definition isn't
required, but we all anticipate that html-math will feed into various
processors which manipulate the html data in mathematical ways, and
I'd like a somewhat clearer notion of how others here understand the
"semantics" we're trying to capture.  I realize that there may be no
one in the group claiming to capture "full" semantics (whatever that
could mean), but there is clearly an effort to capture something
beyond surface-level notation, and I'm trying to understand more about
what that deeper level is.  And, as in ordinary conversation, I expect
that there will be levels of semantical binding between the "free"
level of abstract notation and the final level at which the notation
is bound sufficiently for the application at hand.

There is a meaning of semantics in connection with, say,
mathematical logic, wherein one develops a "modeling" relationship
between statements in some formal language and the mathematical
structures which carry the same signature as the language.  In a formal
theory of rings, one might say in this connection that the meaning
of "+" is that + corresponds to the addition operator in each of
the rings of the theory.  So we haven't captured one thing, but many
instances of what we might call the "same" thing.  The "meaning" of the + then lies
in the abstract similarity of the operation across the class of models,
i.e. in the position of + within a formal theory of rings.

     (Meanings in mathematicians' minds do tend to jump
     among the various models involved and the formal
     theories which provide abstract means of unifying many
     disparate instances.  Computer algebra systems, in some
     sense, always reside at the symbolic, formal level insofar
     as they deal "only" with symbols and no infinitary objects.
     Whether human minds truly grasp infinitary objects might be
     another question, but here I'll call the semantical binding
     to a single mathematical entity a lower level than the level
     at which binding is made to a term within a formal theory.
     It's the theory level which may concern us most insofar as
     we're attempting to serve computer processing.  To further
     indicate my meaning, I'll say that a computer system deals
     with complex numbers, for example, as a formal system and
     not as a full mathematical entity.)

More analysis in the case of + would suggest that its use in rings is
actually based on its use as the binary operator in a commutative
group.  By parsing operands and passing a string such as "plus(X,Y)",
we let the next system in sequence handle the next level of semantics
(e.g. to bind the "plus" to complex addition).  In contrast, reading a
+ at surface level within the notation "k < n+" and analyzing it as a
postfix operator to pass "successor(n)" will better prepare whatever's
down the road to handle that usage.  And generally, we may expect
that surface-level html is unbound (free notation) and that inner-html
(the 2nd parse level) is still unbound, but closer to binding.  (Am I
actually saying anything?)
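
As an illustration only, here is a minimal Python sketch of the two
readings of + described above.  The tokenizer, the function names, and
the extra term name "less" are hypothetical and belong to no html-math
proposal.  The output terms are still unbound: nothing here says which
addition or which successor is meant.

```python
import re

# Hypothetical token set: identifiers, integers, and the operators + and <.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[+<])")

def tokenize(text):
    """Split surface notation into a list of tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def is_operand(tok):
    return tok is not None and tok not in ("+", "<")

def parse_sum(tokens, i):
    """Parse an operand followed by any number of infix or postfix + uses."""
    if i >= len(tokens) or not is_operand(tokens[i]):
        raise ValueError("operand expected")
    term, i = tokens[i], i + 1
    while i < len(tokens) and tokens[i] == "+":
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if is_operand(nxt):
            # An operand follows, so this + is the binary operator.
            term, i = f"plus({term},{nxt})", i + 2
        else:
            # Nothing to add, so this + is read as a postfix operator.
            term, i = f"successor({term})", i + 1
    return term, i

def parse(text):
    """Return the second-level (still unbound) term for a small formula."""
    tokens = tokenize(text)
    left, i = parse_sum(tokens, 0)
    if i < len(tokens) and tokens[i] == "<":
        right, i = parse_sum(tokens, i + 1)
        left = f"less({left},{right})"
    if i != len(tokens):
        raise ValueError("trailing input")
    return left

print(parse("X + Y"))
print(parse("k < n+"))
```

The only decision taken at this level is syntactic (infix or postfix);
binding "plus" to, say, complex addition is left to whatever comes next.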

A more complicated case lies in the various uses of, say, ^ as a
"power" operation.  Here we have at least integer powers, rational
powers, the exponential function, iteration of maps, and topological
powers (i.e. cartesian products), each of which may use the
surface-level ^ operator.  More significantly, each of these uses
might be mapped to the same internal "power" operator for passage to
the next level of semantical binding.  The unifying force here is that
the same surface level notation is used and the later semantical
analyzers may cater to that (or, more rationally, may recognize the
surface uses as part of a single formal meaning).  The distinguishing
force is that these "really are" (say in a computational or other
formal sense) different uses of the term (e.g. one wants to say the
exponential function *truly* isn't a power at all) and should be
distinguished as such.  As much as we want to say that it does no harm
to distinguish at the first level since analyzers are free to map back
to an undistinguished state, there is a cost associated with making
distinctions (where does one stop?).
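
To make the trade-off concrete, a rough Python sketch follows.  It
assumes that an analyzer has some coarse "kind" for the base (the kind
names and the refined operator names are invented for this example).
The first function draws the distinctions; the second maps them back
to the single generic operator.

```python
from fractions import Fraction

# Hypothetical refined readings of the surface-level ^ operator.
REFINED = {
    "exp",              # the exponential function, e^x
    "iterate",          # iteration of a map, f^n
    "cartesian_power",  # topological power, X^n
    "integer_power",
    "rational_power",
}

def refine_power(base_kind, exponent):
    """Pick a refined reading of power(base, exponent), if one applies."""
    if base_kind == "e":
        return "exp"
    if base_kind == "map":
        return "iterate"
    if base_kind == "space":
        return "cartesian_power"
    if isinstance(exponent, int):
        return "integer_power"
    if isinstance(exponent, Fraction):
        return "rational_power"
    # Not enough information: keep the generic operator.
    return "power"

def forget(operator):
    """Map any refined reading back to the undistinguished operator."""
    return "power" if operator in REFINED else operator

print(refine_power("number", 3))
print(refine_power("number", Fraction(1, 2)))
print(refine_power("map", 2))
print(forget(refine_power("map", 2)))
```

Mapping back is trivial, as the text says; the cost sits in the first
function, whose list of cases has no obvious end.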

In the discussion to this point, if we consider a computer algebra
system, the semantical binding made by the CA system is done either on
the basis of some gross environment settings (e.g. the specific
commutative group in question may have been set to an additive group
of matrices as part of the import-environment) or on the basis of
"type" characteristics passed with the formula.  There are also
situations intermediate between these extremes (of global and
completely localized specifications) where the contextual theory may
require more involved specification before proper binding can be made.
(Let E be a certain elliptic curve and + its natural group operation,
and then discuss certain equations.)  When should binding be made
and how?  So in addition to the ambiguities of "power"-like
operations, there are situations in which one would in any case expect
to make the binding only at a late stage within the CA system and not
within any html analysis.  No?  (I'm generally worried about ad hoc or
parameterized operators within a paper and the transmission of their
proper characteristics to a computer algebra system.  The example
above regarding an elliptic curve may not be sufficiently complicated.
But then maybe I'm seeing complications where they don't exist.)
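
The two extremes described above (a global environment setting versus
"type" characteristics passed with the formula) can be sketched in
Python as follows.  The term representation, the "additive_group" key,
and the rule that local type tags take precedence are all assumptions
made for this illustration; they describe no particular CA system.

```python
def bind_plus(term, environment=None, types=None):
    """Resolve the operator of a ('plus', x, y) term, or leave it unbound."""
    op, x, y = term
    if op != "plus":
        raise ValueError("expected a plus term")
    environment = environment or {}
    types = types or {}

    # Localized specification: type tags passed along with the formula.
    tx, ty = types.get(x), types.get(y)
    if tx is not None and tx == ty:
        return (f"{tx}_add", x, y)

    # Global specification: a setting in the import environment.
    group = environment.get("additive_group")
    if group is not None:
        return (f"{group}_add", x, y)

    # Neither source decides; binding is deferred to a later stage.
    return term

term = ("plus", "A", "B")
print(bind_plus(term, environment={"additive_group": "matrix"}))
print(bind_plus(term, types={"A": "complex", "B": "complex"}))
print(bind_plus(term))
```

The intermediate situations, such as the elliptic curve example, are
exactly the ones this sketch does not handle: the operation there is
defined by the surrounding discourse rather than by a tag or a setting.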

I'm certainly not claiming that any of this is news to people on this
list.  I would simply like to hear some further discussion of such
matters so that I have a clearer idea of what we envision.  Perhaps
the Illinois workshop is a better venue than this list for such
discussion (with feedback to the list afterward).

Patrick has pointed out to me that the "extensibility" of html-math is
not entirely clear to either of us.  I think we both understand that
certain surface level distinctions will be possible for altering the
2nd-level parse state achieved by the html analyzer, but it's unclear
to us how varied the paths to binding may be.  I'd also be interested
to know whether we envision computer algebra systems providing an
interactive end to this process, wherein queries help a user get to a
state of sufficient semantical binding.

Comments will be appreciated.

-Ron