I won't have time to fully peruse Ron's sample HTML-Markup, or Ping's comments on it, for at least the next few days (and then I'll be out of email touch for a week or two), but I thought I should mention a couple of points that caught my attention when I skimmed Ping's response. (I have not yet read all the later letters which may also touch on some of these issues.)

(I apologize that it's been taking me so long lately to read email on this list. I have been drawn away by other urgent work. I hope that I'll be able to correct this problem soon.)

First I want to mention again an important point, to avoid some possible misunderstandings: the letter outlining the main parts of my HTML-Math proposal for Wolfram,

http://www.w3.org/pub/WWW/MarkUp/Math/WG/Smith-960531.html

is intended to supersede all prior letters from me or anyone at Wolfram in the mailing list; also, our proposal does not necessarily include elements discussed by others in the list unless that letter (or some subsequent statement from me) specifically says it does. (It seems to me that some of these prior letters may be the cause of some misunderstandings. For example, there is nothing in our proposal about transformation rules which try to support "English-like syntax", or which work at any stage other than after the parser, on the expression tree generated by the parser. The only kind of transformation rules we propose are the kind discussed in that letter.)

Also I should say that I've finally found time to read most of Ping's web pages on MINSE, so now I can understand his comparisons with it. I have to say that it is an impressive and well-described system.
It does differ in several important ways from our (Wolfram's) proposal. I'll address most of them when I reply to the letters concerning those points; for now I should mention the most important distinction: his system is primarily "semantic" and ours is "notational", which means (among other things) that the information the two systems are trying to represent is quite different. Many of Ping's specific points of comparison are manifestations of this general difference (as I'm sure he understands).

At 11:16 AM 7/8/96, Ka-Ping Yee wrote [excerpted]:
>4. In many situations multiple comparisons are written in a chain,
>   which happens here under the "max" compounds with "0 <= i <= T".
>   How does your notation deal with this and the issue of operator
>   associativity?

This is described in my letter; here are the excerpts which pertain to the case of relational operators:

    The parser groups a term with the adjacent operator which has the higher precedence (assuming it is being used in a form which takes an operand on that side). If these precedences are equal, it groups the term with *both* operators;

    ...

    The same feature of grouping a term with both adjacent operators is used to allow certain operators to have "flat" or "n-ary" associativity, e.g. + and &InvisibleTimes;. This is what causes the source text "4ac" (in the example given far above) to parse to a single (mterm ...) subexpression containing three subterms (which are mn and mi tokens for 4, a, and c) separated by two (invisible) operator tokens.

    ...

    Sometimes, more than one operator has the same left and right precedence; this is true, for example, of relational operators, so that sequences of inequalities turn into single subexpressions even when (e.g.) both < and <= (or &LessEqual;) are used in the same sequence.
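The grouping rule in these excerpts can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the parser from the proposal: the numeric precedence values, the tuple representation of tokens, the function name `parse`, and the use of "mrow" as the label for every grouped node are all hypothetical choices made for the example, and the input is assumed to alternate strictly between terms and operators (with invisible operators already inserted).

```python
# Hypothetical operator dictionary: operator text -> precedence.
# The numeric values are illustrative assumptions, not taken from the proposal.
PRECEDENCE = {
    "<": 10, "<=": 10, "=": 10,   # relational operators share one level
    "+": 20,
    "&InvisibleTimes;": 30,
}


def parse(tokens, pos=0, min_prec=0):
    """Parse tokens[pos:], assumed to alternate term, operator, term, ...

    Returns (tree, next_pos). A run of operators of equal precedence
    becomes one flat ("mrow", ...) node rather than nested binary pairs.
    """
    left = tokens[pos]
    pos += 1
    while pos < len(tokens) and PRECEDENCE[tokens[pos][1]] >= min_prec:
        level = PRECEDENCE[tokens[pos][1]]
        row = [left]
        # Collect every operand joined at this same precedence level.
        while pos < len(tokens) and PRECEDENCE[tokens[pos][1]] == level:
            op = tokens[pos]
            # Tighter-binding operators to the right are grouped first.
            right, pos = parse(tokens, pos + 1, level + 1)
            row += [op, right]
        left = ("mrow", *row)
    return left, pos


# A chain mixing two relational operators stays one flat subexpression.
chain = [("mn", "0"), ("mo", "<="), ("mi", "i"), ("mo", "<"), ("mi", "T")]
chain_tree, _ = parse(chain)

# "4ac" with the two invisible multiplication operators made explicit.
product = [("mn", "4"), ("mo", "&InvisibleTimes;"), ("mi", "a"),
           ("mo", "&InvisibleTimes;"), ("mi", "c")]
product_tree, _ = parse(product)
```

In this sketch a term flanked by two operators of the same level ends up in the same node as both of them, while a term flanked by operators of different levels is pulled into the node of the higher-precedence one, so "a + b <= c" would nest the sum inside the relational row.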

Thus, according to the above (and to the proposal), "0 <= i <= T" parses to

    (mrow (mn "0") (mo "<=") (mi "i") (mo "<=") (mi "T"))

I'm sorry if this was not sufficiently clear from the proposal letter. I should probably add a specific example involving relational operator chains, since they are perceived as different from the examples I gave of the same parser feature, which were matching brackets and n-ary operators. (Of course, they are *semantically* different, but not in a way which our proposal, which is mainly notational, attempts to capture.)

>9. Ron marked up "absolute value" using &leftvert; and &rightvert;
>   (Ron's point 7).  How is the grouping ability of these symbols
>   declared? ....

By declaring (in the operator dictionary) these special characters to be (one-character-long) left and right bracket operators.

>13. Parens are also used for all sorts of meanings in this example
>    (Ron's point 6 in the TeX posting), and i think it's impossible
>    to tell the difference between the interval "(0,&infinity;)" and
>    the pair "(ν_1,ν_2)" the way Ron has it marked up.  It is
>    also very unclear when the parens indicate function application,
>    as in "ν^ε(x,t)".
>
>    This is all distinguished in the MINSE markup using different
>    compounds.  Function application is the only case implied just
>    by the parentheses; the compound "openopen" is used for writing
>    an interval open at both ends, alleviating this ambiguity.

This is an example of the general difference between a notational system, like ours, and a semantic one, like MINSE. The notational system doesn't attempt to distinguish between the various meanings of the same operators or identifiers, except in a few important cases which normally affect the rendering.
This has the advantages that the system's designers (or the authors of "contexts") need not make a list of all concepts to be discussed, and the authors need not look up the correct named concept to use; but it has the disadvantages that the renderer can't choose to render different concepts with the same conventional notation differently, nor is the very valuable semantic information represented (in an easily or unambiguously extractable form).

There has been quite a bit of discussion of the relative merit of each of these approaches, and the present consensus of the HTML-Math group is that the notational approach is better for HTML-Math. We do, however, hope to get the "best of both worlds" to some degree, eventually, by giving the proposal author-extensibility so that authors have the *option* of defining and/or using constructs which carry additional (possibly semantic) information. (I'll address the issue of why I'm sure we can do that well enough when I reply to Ping's letter about Extensibility. I hope to have time to do that before I go out of touch for 1-2 weeks, but I'm not sure whether I actually will.)

>------------------------------------------------------------- discussion
>
>On the whole, i think i'd have to say that the proliferation of
>homonyms in Ron's example makes me rather uncomfortable.  Parens,
>superscripts, and juxtaposition have so many different meanings
>in the HTML markup he posted that -- even if it were possible for
>mapping rules to choose which meaning is intended -- i don't think
>i would just trust the rules to pick the right one every time, and
>guessing exactly how to appease them by manipulating the notation
>would quickly get troublesome.  I would much prefer getting into
>the habit of consistently saying what i mean instead of hoping
>that it gets interpreted right.

Of course, per my comments above, in our proposal for HTML-Math, no attempt is made to automatically disambiguate these homonyms in an HTML-Math renderer.
If a CAS wants to try to do that when it reads the HTML-Math, that is up to it. (And when we add author-specified contexts, that will be in large part to make it possible for authors to make this job easier for the CAS. But we won't require them to, in contrast to MINSE or to Roy Pike's proposal.)

>Moreover, what if authors later want to define new meanings for
>juxtaposition or parentheses?  There seems to be no provision for
>this because the juxtaposition itself is used to figure out the
>meaning.

Again, the meaning, in general, is never figured out at all. It's true that we make a few exceptions to this, e.g. in deciding whether an implied infix operator should be a times or a function application, because that so commonly affects the rendering, but it is not hard for authors to always insert these operators explicitly if they want to override this automatic decision. (And when our proposal is made extensible, it will be possible for authors to "change the rules" for this; but that's beyond the scope of this letter.)

>I think it makes more sense to go the other way, i.e.
>from the meaning to the notation instead of guessing the meaning
>from things like juxtaposition and parentheses.

But it is also ambiguous to go from meaning to notation, as well as from notation to meaning -- there are many possible notations for the same meaning. Assuming that authors want to influence the notation chosen, this is another advantage of a notational system. (Of course, I admit that a semantic system can include some notational information, just as our notational system includes a bit of semantic information; and I also admit that authors want to influence not only the notation used, but the meaning inferred -- at least I hope they do :-).
In other words, I don't claim (or believe) that "our approach is good and the other is bad", but rather I think that which one to take, and to what extent, is a matter of judgement about the best tradeoff of various factors, given the uses that a representation is aimed towards.)

- Bruce

P.S. I think a couple of the group members (Neil and Dave) might recall that when I first joined this group, I was strongly in favor of an approach which, though notational, was much more like MINSE than our current one, in which there were named compounds for each "notational operation" (such as surrounding an expression with parentheses). In that system, a notation like [0,1) would need its own compound, just like in MINSE. Neil very eloquently converted me away from that approach by patiently explaining some of its disadvantages, like the difficulty of representing nonstandard notations or syntactically-incorrect expressions, and the need to invent an endless stream of new notations (names of compounds) for things which already have universally recognized standard notations (albeit sometimes-ambiguous ones). Also in his favor was the very elegant (and to me, very surprising) way in which he had been able to get Mathematica to parse integrals, where the integral sign and the differential-d are operators with specific (carefully-chosen) precedences like any other operator (which has been preserved in our proposal for HTML-Math).

Received on Tuesday, 16 July 1996 17:21:55 UTC