I'm posting this note to restate, and make a little more explicit (perhaps only in my own mind), what some of our issues are. This isn't meant to sidetrack the current conversation, and I'm trying to keep the text below to a minimum. This is also not meant to reinterpret previous position statements or override them. I'm only providing another framework which is understandable to me, and which attempts to place the four "positions" with which we have some relation --- the Wolfram proposal, Ka-Ping's MINSE system, Roy Pike's Math DTD work, and the OpenMath Consortium's efforts --- in a common context. This is only meant to be a set of statements off which others may or may not bounce, not a position paper for the HTML-Math ERB. I welcome corrections, adjustments, contrary opinions, ...

-Ron

-----------------------------------------------------------------------

BROADLY
-------

We (the HTML-Math ERB) are attempting to devise a notation for mathematics (I recognize Ping's interest in broadening our notational target, but here will not move beyond mathematics) which can be automatically rendered to visual and audio formats (at least one of each) and used by various computational engines as well. The visual format is the sine qua non here; audio seems achievable as long as we avoid some pitfalls of systems designed purely for visual rendering; more in doubt is the degree to which we can support rendering to computer algebra systems, theorem provers, and other more specialized software for dealing with symbolic, numeric, and knowledge-retrieval problems.

My understanding is that we are attempting to solve a set of problems somewhat smaller in scope than those addressed in the OpenMath Consortium's Objectives document. (This OpenMath document does give a good overview of problems and objectives. My emphasis here is slightly different, and I don't claim that this little description summarizes the views of the OpenMath Consortium.) There may well be questions regarding the degree to which we can break off a part of the mathematics interchange problem, and we should be open to discussion on this. We certainly don't want the efforts here to be in conflict with those of OpenMath. Consistency and symbiosis, if not identity, should be important in our considerations.

It is apparent that, although visually-oriented notational systems (such as printed form, or code, such as TeX, that produces printed form) are efficient carriers of information, their effectiveness is highly dependent upon human facility at interpretation, and such notational systems remain quite ambiguous to software written now (and probably for many years to come). Where software is not capable of disambiguation, humans must disambiguate in the notation itself, to whatever degree is deemed necessary. Thus, printed mathematical notation must be disambiguated and regularized to some degree in order to serve the purposes of machine readability and interchange.

On the other hand, ambiguity has its place (and very seriously so) in the guise of abstraction from details which do not require further specification for purposes at hand or foreseen. This is simply in the name of prioritization and efficiency. The free market of printed mathematics decides on its own when efficiencies of notational commonality are merited (this occurs within individual papers and within fields of common discourse). As mathematical ideas are generated and sifted, notation is introduced and settles to more stable form over time.
As new ideas enter, the mode of expression for old problems and theorems also changes. And there is great cost in developing software which deals with some set of mathematical issues, using a particular concrete syntax and semantics in those particular areas of mathematics. Although the computerized realms of mathematics are ever-widening, there will always be advantage in speaking in an informal, abstract, and machine-ambiguous way about realms not yet machine-treated. Thus, I do think there are avenues of mathematical discourse where there is a strong need to "speak" with the simple printed (implying also "spoken") word, and not to require the high level of disambiguation necessary for machine treatment. The path of disambiguation has a cost which must be considered.

And even aside from the poles of traditional, printed (machine-ambiguous) mathematical notation and machine-processable notation, there are problems of inconsistency among those systems which treat the same material on the same level and might be intertranslatable. These inconsistencies in syntax and semantics block, or increase the cost of, moving between the systems automatically. OpenMath sets out to solve the intercommunication issues on all levels. HTML-Math is trying to present a notation sufficient for visual and audio display, to which it is possible to translate from other visual display languages (such as TeX and the existing ISO 12083), and from which it is possible to derive other notations sufficient for computation.

In speaking of solutions to interchange problems, we should keep in mind (at least in the background) the scale on which a solution works effectively. All parties recognize the patchwork, non-global nature of notational systems. OpenMath speaks of lexicons, Pike of domains. Notations and their conventions or "contexts" range from a formula (or subformula) to paragraphs, articles, books, and beyond (lexicons). Insofar as the OpenMath and Pike approaches aim for large-scale intertranslatability, it seems not unreasonable to me that HTML-Math differentiate itself by aiming at a smaller scale.

-----------------
MORE SPECIFICALLY
-----------------

The Wolfram Proposal
--------------------

The Wolfram proposal for HTML-Math suggests that we use traditional mathematical notation (i.e. something akin to code for printed notation) as our basis, and parse this notation via operator precedence tables and bracketing into an "expression tree", which is to be the fundamental structure from which other forms are derived. The notation provides a means of altering precedences and of adding new identifiers and operators. Visual and audio rendering should be directly achievable from an expression tree. Conversion of expression-tree data to forms appropriate for computational engines will be done on the basis of template- or pattern-matching maps. (So semantics may be "attached" to the notation by adjoining or pointing to a collection of pattern matches.)

Bruce and Neil (and I believe Dave) have expressed confidence that template-matching will achieve the semantical mapping we want. Ping has expressed doubts (based, I believe, on the concern that ambiguities in expressions will get out of control as the range over which templates are matched grows, and that template-matching will become unreliable). Raman also spoke about wanting to determine the extensibility mechanism early on, and I believe he was concerned about the degree to which notation would be transformable. I find the issue very difficult to judge myself.
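To fix ideas, here is roughly the kind of thing I picture --- a minimal sketch of my own, not anything taken from the proposal itself. A small operator precedence table drives a parse of linear notation into an expression tree, and a single hypothetical template rule then maps that tree to a prefix form such as a computational engine might accept. The operator names, precedence values, and target forms below are all illustrative assumptions on my part.

    # Sketch only: precedence table, precedence-climbing parse, and one
    # hypothetical template map.  All names and values are illustrative.

    PRECEDENCE = {'+': 10, '-': 10, '*': 20, '/': 20, '^': 30}

    def tokenize(s):
        return s.replace('(', ' ( ').replace(')', ' ) ').split()

    def parse(tokens, min_prec=0):
        """Parse a token list into nested tuples (op, lhs, rhs)."""
        lhs = parse_atom(tokens)
        while (tokens and tokens[0] in PRECEDENCE
               and PRECEDENCE[tokens[0]] >= min_prec):
            op = tokens.pop(0)
            # '+1' gives left associativity; a right-associative operator
            # would recurse with PRECEDENCE[op] unchanged.
            rhs = parse(tokens, PRECEDENCE[op] + 1)
            lhs = (op, lhs, rhs)
        return lhs

    def parse_atom(tokens):
        tok = tokens.pop(0)
        if tok == '(':
            inner = parse(tokens, 0)
            tokens.pop(0)            # discard the closing ')'
            return inner
        return tok                   # identifier or number: a leaf

    # One hypothetical template rule: map each subtree (op, x, y) to a
    # made-up prefix form; real engines would each want their own map.
    def to_engine(tree):
        if isinstance(tree, tuple):
            op, lhs, rhs = tree
            names = {'+': 'Plus', '-': 'Minus', '*': 'Times',
                     '/': 'Divide', '^': 'Power'}
            return '%s[%s, %s]' % (names[op], to_engine(lhs), to_engine(rhs))
        return tree

    tree = parse(tokenize('a + b * ( c - d )'))
    print(tree)             # ('+', 'a', ('*', 'b', ('-', 'c', 'd')))
    print(to_engine(tree))  # Plus[a, Times[b, Minus[c, d]]]

The only point of the sketch is that the precedence table and the template map are separable pieces: the first fixes the tree, and the second attaches one (of possibly several) semantics to it after the fact.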
I have a vague idea that the template-matching is to be something akin to the various pattern transformation schemes I see in the Mathematica Manual, but I don't have a good idea of the details or of the scope of operation. I think more conversation in this area would help me, also recognizing Bruce's recent statements regarding priorities. I have neither the confidence to say this is a "go", nor the knowledge to provide counterexamples. Perhaps we would all benefit if someone would throw up a good set of concrete examples (say, with traditional notation appropriate for translation to computational engines). I think this is appropriate at least by the time of our October meeting. I do like the late semantical binding afforded by the Wolfram proposal, and its emphasis on notation rather than semantics.

Ping's MINSE notation
---------------------

MINSE has much in common with the Wolfram proposal. MINSE also starts with traditional notation, but allows augmentation with "compounds", which are locally defined operators. Again the fundamental structure is an expression tree parsed from the notation, but in this case ambiguities are removable through the attentions of a "good" author who incorporates appropriate compounds. As Ping has noted, MINSE notation can lead to cleaner translation because it may already have been disambiguated by the author. The approach allows a semantical element to enter insofar as authors may specify their own sense of semantics via compounds.

My own reservations about MINSE spring from what I think may be analogous situations in TeX documents, where "good" authors have sometimes produced notations which are very hard for third parties to handle. Despite a listing of locally defined compounds and what they "mean", one may still have more difficulty adding to or mapping from the new notation than one would with more commonly seen notational forms. It is also unclear to me to what degree authors will "be good". AMS authors have had this opportunity in TeX for quite some time, and they have been terrible at it. This is undoubtedly because there is no compensatory reward for being "good".

Pike approach
-------------

The Pike and OpenMath approaches are more semantically oriented than the current proposals for HTML-Math. Pike is attempting to define DTDs for areas of discourse (perhaps corresponding to the lexicons of OpenMath). This provides uniform notation within each area and may be achievable in certain areas, but it does not seem well suited to areas in which concepts and notation are in flux or locally tuned. So, e.g., I foresee little chance of AMS authors using such DTDs for their journal articles.

OpenMath approach
-----------------

This is clearly the project of largest scope among those mentioned here. The problems with which HTML-Math is concerned are sub-problems of the general one of full mathematical intercommunication. OpenMath proposes to solve the large problem by implementing layers of communication protocols, the top layer comprising semantical realms, and the next-to-top notational (or expression) realms. I believe that OpenMath views most expression interchange as being mediated by the semantical realms. I have little quarrel with the general list of objectives or with the OpenMath layered model, but the project does seem vast to me, and I wonder how achievable it is. I also need to hear more detail about the thinking of those involved.
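As a purely schematic illustration of my reading of that mediation (again a sketch of my own, with made-up lexicon entries, not anything published by the Consortium): each system carries a lexicon mapping its local operator names onto shared semantic symbols, and an expression travels from system A up into the semantic layer and back down into system B, without A and B knowing each other's syntax.

    # Schematic only: lexicon names and semantic symbols are invented.
    LEXICON_A = {'plus': 'arith.plus', 'times': 'arith.times'}
    LEXICON_B = {'Add':  'arith.plus', 'Mul':   'arith.times'}

    def encode(tree, lexicon):
        """Local expression tree -> tree over shared semantic symbols."""
        if isinstance(tree, tuple):
            head, args = tree[0], tree[1:]
            return (lexicon[head],) + tuple(encode(a, lexicon) for a in args)
        return tree

    def decode(tree, lexicon):
        """Shared semantic tree -> local names, by inverting the lexicon."""
        inverse = {v: k for k, v in lexicon.items()}
        if isinstance(tree, tuple):
            head, args = tree[0], tree[1:]
            return (inverse[head],) + tuple(decode(a, lexicon) for a in args)
        return tree

    expr_a = ('plus', 'x', ('times', 2, 'y'))   # as system A writes it
    common = encode(expr_a, LEXICON_A)          # shared semantic form
    print(decode(common, LEXICON_B))            # ('Add', 'x', ('Mul', 2, 'y'))

The vastness I worry about lies, of course, in the lexicons themselves: the sketch assumes the shared symbols already exist and mean the same thing to both sides.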
As mentioned earlier, I suspect we may define our problems and objectives in such a way as to be of smaller scale but of greater tractability than those of OpenMath, all the while listening to whatever progress and concerns the Consortium has to report.