Re: scripted operators

From: Bruce Smith <bruce@wolfram.com> Date: Sun, 21 Apr 1996 17:38:44 -0700 Message-Id: <199604211738.6837@uvea.wolfram.com> To: w3c-math-erb@w3.org

> Re: the discussion of scripted operators
> 
> I'm not sure whether Bruce was speaking tongue-in-cheek or not
> on this:
> 
> > (2) the subscript operator _ applies to terms, not other operators.

I was speaking seriously, but about our present proposals for
specific (well-understood) syntaxes for HTML-Math, not about actual
practice in notation.

> Can we carry precedence values into the parse tree and
> then prune them after parsing is complete?  We'd have to categorize
> _, ^, font changes, etc. to indicate that they're absorbed into operators
> when bound there.

I think we should probably do this if we can come up with a way,
but I don't yet see a clean one, because the very structures through
which the precedences need to be passed are themselves built up
in a way that depends on precedence.

For example, if you've parsed, from the start of a line of HTML-Math,

	x_f

(and even if you know that "f" is a single token), you don't yet
know whether some operator after f will have higher left-precedence
than _ has right-precedence. Therefore, if we imagine replacing the
term x by an operator (perhaps with some additional notation
indicating we're doing this), we don't know whether the precedence
of x should propagate only through the _f (and apply as we continue
parsing to the right), or whether some larger thing (starting with
f) is the subscript, so that we should grab all of that before
using the right-precedence of x.

The point is that there is no well-defined (unambiguous) way to
use precedences, if we simply directly use operators in place of
terms.
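
To make the x_f example concrete, here is a minimal sketch of an
ordinary precedence-climbing parser. It is only an illustration: the
"." operator and all of the binding-power numbers are assumptions
invented for this sketch and are not part of any HTML-Math proposal.
It shows that the extent of the subscript depends entirely on the
token that follows f.

```python
# Hypothetical (left, right) binding powers, invented for illustration only.
BINDING = {"+": (10, 11), "_": (30, 31), ".": (40, 41)}

def parse(tokens, min_bp=0):
    """Precedence-climbing parse of a flat token list into nested tuples."""
    left = tokens.pop(0)                  # a term such as x or f
    while tokens:
        op = tokens[0]
        left_bp, right_bp = BINDING[op]
        if left_bp < min_bp:              # operator binds too loosely; stop here
            break
        tokens.pop(0)
        right = parse(tokens, right_bp)   # right operand is limited by right_bp
        left = (op, left, right)
    return left

# Same prefix "x _ f", two different subscripts:
print(parse("x _ f + b".split()))   # ('+', ('_', 'x', 'f'), 'b')
print(parse("x _ f . g".split()))   # ('_', 'x', ('.', 'f', 'g'))
```

In the first input the subscript is just f; in the second it is the
larger thing starting with f. A precedence attached to x could not be
passed along until that choice has been made.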

What might be possible is some additional notation which would
allow this propagation of precedences to be unambiguous. For example,
suppose an operator that is to be embellished, but is intended to
remain an operator with the same precedence, is first surrounded by
one kind of brackets, say <moe>...</moe> (for "math operator
embellished"), then embellished, and the result is then surrounded
by another kind of brackets, say <mo>...</mo> (for "math operator"),
to turn it back into an operator. Then it would probably be possible
for the parser to understand this soon enough to apply the precedence
further in the right way.

An example of this notation would be (with extra unnecessary spaces
for clarity)

	a <mo> <moe> + </moe> _ n </mo> b

to represent something that would be displayed roughly as

	a +  b
	   n

and where the entire source token sequence "<mo> <moe> + </moe> _
n </mo>" takes its precedence (left as well as right!) from that
of the "+" contained in "<moe> + </moe>".

I think even making this work for left precedences might be possible,
provided the notation could be detected and parsed after only
tokenization without requiring precedence parsing to have occurred
yet (as is possible when the brackets used are begin-end tags).
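
A rough sketch of such a pass, run on the token list before any
precedence parsing, might look like the following. Everything here is
an assumption made for illustration: the EmbellishedOp class, the
collapse function, the binding-power numbers, and the simplifications
that <mo> groups do not nest and that <moe> wraps exactly one operator
token.

```python
from dataclasses import dataclass

# Hypothetical (left, right) binding powers, invented for illustration only.
BINDING = {"+": (10, 11)}

@dataclass(frozen=True)
class EmbellishedOp:
    core: str       # the operator found inside <moe> ... </moe>
    tokens: tuple   # everything between <mo> and </mo>, kept for later rendering
    left_bp: int    # left precedence inherited from the core operator
    right_bp: int   # right precedence inherited from the core operator

def collapse(tokens):
    """Replace each <mo> ... </mo> group by a single operator token.

    Assumes groups are not nested and each contains one <moe> op </moe>.
    """
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "<mo>":
            out.append(tokens[i])
            i += 1
            continue
        end = tokens.index("</mo>", i)           # matching close tag
        body = tokens[i + 1:end]
        core = body[body.index("<moe>") + 1]     # token right after <moe>
        left_bp, right_bp = BINDING[core]
        out.append(EmbellishedOp(core, tuple(body), left_bp, right_bp))
        i = end + 1
    return out

result = collapse("a <mo> <moe> + </moe> _ n </mo> b".split())
```

Here result has three entries: the term a, one EmbellishedOp carrying
both precedences of "+", and the term b. A precedence parser run
afterwards could treat the middle entry like any other infix operator.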

BTW, I'm not saying that <mo> and <moe> are the best choices of tag names
for this; I just made them up for the purpose of illustrating the idea.