on parsing (repost)

The bracketed numbers are references to messages by number on the
mailing-list archive; the other bracketed references are given at
the end of this message.

-------------------------------------------------- superiors/inferiors
[020] Nico Poppelier writes:
> 5. Superiors/inferiors: I think we should add something about
> vertical position of inferiors in mathematics and chemistry (these
> differ, as some of you may know).

How do they differ?

---------------------------------------------- text within expressions
[020] Nico Poppelier writes:
> 7. In example 2 a lot of reserved words occur: from, to, of. How
> do you get these words as normal words in the text (in roman)?

I think text should be a separately notated element, because the
text itself can sometimes stand for a function, a variable, etc.
I thought about this problem when designing MINSE.  What if a
statistician wants to write

     discarded samples
     -----------------
     total sample size

for instance?  Here the meanings of the words form part of the
expression, and should be part of the semantic tree.  MINSE treats
such text as a generic identifier, quoted thus:

    ?(discarded samples)/?(total sample size)

Text can thus be given a type, inserted into equations, and used
just like an identifier [TEXT].  But quoted text clearly does not
mean the same thing as a multi-character identifier, and that's
why the notation distinguishes the two.
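
To make the distinction concrete, here is a minimal sketch in
Python of a tokenizer that keeps quoted text apart from ordinary
identifiers.  The token pattern and the names TOKEN and tokenize
are assumptions made up for this illustration; this is not how
MINSE itself is implemented.

```python
import re

# Hypothetical token pattern: ?(...) is quoted text, a bare word is an
# ordinary identifier, and any other non-blank character is an operator.
TOKEN = re.compile(r"\?\(([^()]*)\)|([A-Za-z_]\w*)|(\S)")

def tokenize(source):
    """Return (kind, value) pairs for a flat expression string."""
    tokens = []
    for match in TOKEN.finditer(source):
        text, ident, op = match.groups()
        if text is not None:
            tokens.append(("text", text))    # spaces inside are preserved
        elif ident is not None:
            tokens.append(("ident", ident))
        else:
            tokens.append(("op", op))
    return tokens
```

Under these assumptions "?(discarded samples)" comes out as a
single "text" token, while a bare word like "samples" would be an
"ident", so a later stage can type the two differently.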

---------------------------------------------------- replacement rules
About the MINSE notation definition:

So far the notation definition has included only an operator
precedence and associativity list.  I considered augmenting
this to a list of more general replacement rules, but so far
I have avoided it because rules make the definition hard to
extend.  Adding an operator to an existing notation definition,
or changing one, is okay, because you can just give its
precedence level and move the operator.  But when a new rule is
supplied, there is no way to know what to replace or remove, or
where to insert the rule in the ordering.  Still, if we find a
way around this problem, it should be straightforward to extend
the notation definition to include rules, and then we'd proceed
to implement rule matching and replacement in the parser.
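
The point about extension can be sketched in Python.  The table
below is hypothetical (the operators, levels, and the names
PRECEDENCE, define_operator and by_binding_strength are invented
for this illustration, not taken from the MINSE definition); it
only shows that a precedence level is enough to place an operator.

```python
# Hypothetical notation definition: operator -> (precedence, associativity).
# The operators and levels are illustrative, not MINSE's actual table.
PRECEDENCE = {
    "+": (10, "left"),
    "-": (10, "left"),
    "*": (20, "left"),
    "/": (20, "left"),
    "^": (30, "right"),
}

def define_operator(table, symbol, level, assoc="left"):
    """Add or move an operator; the level alone fixes where it belongs."""
    extended = dict(table)          # leave the original definition intact
    extended[symbol] = (level, assoc)
    return extended

def by_binding_strength(table):
    """List operators from loosest to tightest binding."""
    return sorted(table, key=lambda symbol: (table[symbol][0], symbol))
```

Calling define_operator(PRECEDENCE, "=", 5) puts "=" below every
other operator without touching the existing entries.  An ordered
list of replacement rules has no such number attached to each
entry, which is the difficulty described above.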

With template matching that takes "integral from A to B of C
wrt D" to "integral(A, B, C, D)", we can allow the more
readable entry method and still achieve the proper structure,
but I am afraid that in more complicated expressions there
may be enough guesswork involved in wording the English-like
phrase correctly that it's no longer worth the trouble.  It
may be better to keep things simple.
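
As a rough illustration of such a template, here is a sketch in
Python using a regular expression.  The pattern and the names
INTEGRAL and rewrite_integral are assumptions for this example
only; a real implementation would match on the parse tree rather
than on the raw string.

```python
import re

# Hypothetical template for one English-like phrase.  The non-greedy
# groups prefer the earliest " to ", " of " and " wrt ", so operands
# that contain those words themselves may be split in the wrong place.
INTEGRAL = re.compile(r"integral from (.+?) to (.+?) of (.+?) wrt (.+)")

def rewrite_integral(phrase):
    """Rewrite a matching phrase to functional form, else return it unchanged."""
    match = INTEGRAL.fullmatch(phrase.strip())
    if match is None:
        return phrase
    args = ", ".join(part.strip() for part in match.groups())
    return "integral({})".format(args)
```

This handles the simple case, but as soon as an operand itself
contains "to" or "of" the writer has to guess how the template
will split the phrase, which is the guesswork mentioned above.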

Why does everyone seem to be avoiding a notation for just
directly specifying the arguments to an operation, like
"Int[A,B,C,D]"?  Is the current intention to express
*everything* just using operators and tree-matching rules
as in Dave Raggett's Prolog work?

----------------------------------------------------------- references

[020]:  http://lists.w3.org/Archives/Public/w3c-math-erb/msg00020.html
[TEXT]: http://www.lfw.org/math/syntax.html#text

Thanks for your time,