- From: <juanrgonzaleza@canonicalscience.com>
- Date: Fri, 9 Jun 2006 02:57:56 -0700 (PDT)

About the markup model. A specific markup language has been proposed (by Whyte Lynx), and now there are variations from Michel Fortin and some others. I agree with most of Lynx's proposal, but I think we should not propose semantic markup. Even Content MathML and OpenMath are still not sufficiently thought out for broad adoption. In fact, both contained several serious errors in the past that have been corrected in recent versions. OpenMath is more solid than Content MathML, and the latter was probably never adopted in practice. Today the implementation of both is close to zero.

I think the markup should be as easy as possible, with the possibility of extension, so that better authors can do a better job while average users can still obtain results cheaply and quickly. I would discourage any semantic markup and focus only on structure (markup) and presentation (CSS, with XSL-FO as a second choice).

In that case something like

    <frac> <num>b</num> <den>2</den> </frac>

is structural. Much as an HTML document is composed of head and body, here I am encoding that a fraction (structure) is divided into a numerator (substructure) and a denominator (substructure). This structure can then be styled with CSS (or XSL-FO), applying different styles to <num>, <den>, and <frac>. There would be a default stylesheet (as in LaTeX), but a particular fraction could be fine-tuned with CSS rules directly, much as one can use style attributes to fine-tune some part of a text while in general relying on the default stylesheet for HTML.

Okay. MathML is only presentational, not structural: there are no explicit numerators or denominators, just an <mfrac> with two children. It is not CSS friendly, and to fine-tune a particular fraction you must use special MathML attributes, a new styling language, etc. This is not rational.

I would not encourage the use of a type attribute. A simple class would be sufficient, and then we can reuse the available CSS and HTML engines.
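
As a rough illustration of the default stylesheet plus per-fraction fine-tuning described above, a CSS sketch might look like the following. The property choices and the class name "display" are assumptions made only for this example, not part of any proposal on this list, and support for inline-block varies between rendering engines.

```css
/* Default rules: a fraction is an inline box with stacked children. */
frac {
  display: inline-block;
  vertical-align: middle;
  text-align: center;
}

/* The numerator sits on top and draws the fraction bar. */
frac > num {
  display: block;
  padding: 0 0.2em;
  border-bottom: 1px solid;
}

/* The denominator sits below the bar. */
frac > den {
  display: block;
  padding: 0 0.2em;
}

/* Fine-tuning: a hypothetical class for one special fraction,
   written as <frac class="display">...</frac>. */
frac.display {
  font-size: 120%;
}
```

Because <num> and <den> are named elements, each rule targets them directly; with two anonymous children one would have to rely on positional selectors instead.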
The implementation of full semantics in browsers would be very, very complex, and nobody has proved that the semantics can be unambiguously encoded using type="matrix" or type="vector". Forcing a special type on each token would be as tedious as if HTML text used a type attribute on span, p, and div for every possible semantic one can imagine: lemma, chapter, manifesto, group, affiliation, booksection, etc. For that level of detail, markups such as ISO 12083 or DocBook are better for text, and OpenMath or some other approach will be better for math.

I also find rendering difficulties. For example, in the physicochemical community

    <var type="matrix">X</var>

would be rendered in bold, *X*, but for

    <var type="matrix">X</var><sub>2,2</sub>

I would use normal rendering, because it is a matrix element. Moreover, <sub>2,2</sub> has no clear semantic meaning and mixes semantics with presentation; at worst you would use <sub type="index">2,2</sub>, but at best each index would be encoded independently and the comma typed as a separator of indices. But all of that is complex. <var class="matrix">X</var> and a CSS rule would be cheap, with <var>X</var><sub>2,2</sub> in non-bold face.

I also find tedious the double markup in Content MathML 2.0, with type="matrix" being used sometimes but <matrix> in others. In MathML the above example would be rewritten via the <selector> operator:

    <apply><selector/><ci type="matrix">X</ci><cn>2</cn><cn>2</cn></apply>

For any array structure (matrix, vector, determinant, or any other), why not simply reuse the available HTML elements, <table>, <td>..., instead of proposing new ones, <mr>, <md>, doing the same? CSS rules could use different selectors:

    body > table: CSS rules for text tables
    body > formula > table: CSS rules for math tables

Fences could be done as

    <fence left="round" right="square">expression</fence>

I would discourage the use of <bounds>, <integral>, <product>, and all those. They are semantic, when we should focus on structure plus presentation.
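
The class-based matrix rendering and the two table selectors mentioned above could be sketched in CSS roughly as follows. This is only an illustration: the <formula> element comes from the selector written above, and every property value is an assumption chosen for the example.

```css
/* A plain <var>, e.g. the X in the matrix element X<sub>2,2</sub>,
   keeps the normal weight. */
var {
  font-weight: normal;
}

/* A whole matrix, written <var class="matrix">X</var>, is set in bold. */
var.matrix {
  font-weight: bold;
}

/* Tables in running text (illustrative values). */
body > table {
  border-collapse: collapse;
  border: 1px solid;
}

/* Tables reused as arrays inside a formula: inline, no outer border. */
body > formula > table {
  display: inline-table;
  vertical-align: middle;
  border: none;
}

/* Cells of such an array: centered entries with a little spacing. */
body > formula > table td {
  padding: 0.1em 0.4em;
  text-align: center;
}
```

The same <table> and <td> elements serve both cases; only the selector context decides which rules apply.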
Some constructs proposed here are CSS unfriendly and, from a semantic point of view, not correct at all. Look at the following code submitted to this list:

    <integral>
      <bounds> <sub>0</sub> <sup>100</sup> </bounds>
      3<var>x</var> d<var>x</var>
    </integral>

This mixes presentation and semantics. External containers such as <integral>, <sum>, <product> were used in earlier versions of MathML but abandoned in recent proposals. Math on the web began with insane code such as <root>2<of>x</root> and ended up with the current MathML 2.0 proposal: the presentational part being CSS and XSL-FO unfriendly, and the content part also rejected. In fact, the encoding of something as simple as the integral of sin(x) has changed three or four times due to the weakness of the markups proposed, with the latest, Content MathML 2.0, being harshly criticized by some communities, for instance the OpenMath one.

As an illustration, compare the above code for integrals, and that in HTML 3 Math, MathML 1, and Content MathML 2, with the most recent OpenMath code for the integral of sin(x):

    <OMOBJ>
      <OMA>
        <OMS cd="calculus1" name="int"/>
        <OMBIND>
          <OMS cd="fns1" name="lambda"/>
          <OMBVAR> <OMV name="x"/> </OMBVAR>
          <OMA>
            <OMS name="sin" cd="transc1"/>
            <OMV name="x"/>
          </OMA>
        </OMBIND>
      </OMA>
    </OMOBJ>

Ian Hickson wrote:

> I would be very cautious about introducing an entirely new language to
> do this (even if it is "just" an extension of HTML4). For something as
> big as Mathematics, we want to simply re-use an existing language, not
> invent a new one. Inventing a new language for encoding content with as
> wide a problem-space as mathematics would require months, as well as
> the time of domain experts, etc. This work has already been done, e.g.
> in ISO12083, MathML, LaTeX, and other such languages.

Nobody wants to reinvent the wheel, but people reuse languages when these *work*.
By reusing MathML one ends up with an ugly language that is not compatible with the rest of the W3C technologies, is semantically incorrect (or at least incomplete), and that practically nobody wants to waste time with. You look like a fervent admirer of the reuse of MathML; however, some of your proposals, such as a special parsing mode or a mixture of pure and mixed content, were proposed by other people (e.g. Juan R.) and completely rejected by the W3C MathML IG people themselves even before a serious debate began.

Moreover, by reusing MathML 2.0 we are reusing exactly the same errors that the W3C MathML IG made in its last specification. For example, why did the MathML IG itself not reuse (all of) MathML 1.0, and instead propose a new major revision with important changes? Why did the MathML IG decide to invent new tags such as <apply> instead of reusing MathML 1.0 code such as <fn>, <reln>, and others? Why did the W3C itself not reuse <min> and <max> for integrals? Why were <of>, <left>, <right>, <root>, <over>, and several other tags not reused from the early W3C Math draft of 1994? Why was the initial <EXPR> math tag finally abandoned? Do you know?

When mathematicians stated that TeX should be reused in XML (most mathematicians are just users of TeX and know little of its internal design, and less still of web and XML requirements), Neil Soiffer (one of the W3C MathML authors) replied

<blockquote>
Which part of TeX? TeX is not amenable to the growing number of XML tools such as CSS, XSLT, DOM, parsers...
</blockquote>

Therefore our rejection of MathML is perfectly reasonable, because it "is not amenable to the growing number of XML tools such as CSS, XSL-FO, DOM, parsers..."

This debate should be about the Whyte Lynx proposal for mathematics in HTML5 rather than about the reuse of MathML. However, it may be informative to explain why MathML is not popular. I have offered many examples of incorrect code, incorrect design, criticism, and more.
Now some additional comments about MathML.

Reference

NSF / NSDL Workshop: Scientific Markup Languages
Workshop Report
Hosted by the National Science Foundation
June 14-15, 2004

Report prepared by:
Laura M. Bartolo, Kent State University
Timothy W. Cole, University of Illinois at Urbana-Champaign
Sarah Giersch, Association of Research Libraries
Michael Wright, UCAR - DLESE Program Center

**** EXTRACT, comments of mine between [] *********************

Initial work on MathML predates even the formal release of XML as a W3C Recommendation and draws on early experiences from SGML (e.g., the ISO 12083 Mathematics DTD fragment) and HTML (e.g., the abortive effort during the development of HTML version 3 to augment HTML with a number of math specific elements, attributes, and constructs).

[...]

There remain, however, a number of substantive issues with regard to MathML. As one of the very first domain-specific implementations of XML, there were (necessarily) growing pains, and MathML is still seen as somewhat experimental by many potential users in the math community.

[...]

MathML is therefore recognized as inherently incomplete. The authors of MathML have explicitly targeted it for the expression of mathematical content up through the early undergraduate level (first-order calculus). Its utility for research mathematics, even with its explicit built-in extension mechanisms (e.g., as exploited in the EU funded OpenMath project), is still uncertain. MathML is also intentionally bimodal, containing sets of elements to describe separately the presentation of mathematics and the semantics of mathematics. Generally, early implementers have focused on one or the other but not both parts of the ML, resulting in asymmetrical implementations that don't always interoperate as well as might be desired. Adoption has been somewhat slow, _in part_ [emphasis mine] because of the entrenchment of TeX within the research mathematics community.
Additionally, although mathematics is recognized as key to many scientific disciplines, and there have been some attempts to incorporate or accommodate MathML markup rules within other domain-specific markup languages, there are examples of domain-specific markup languages (outside of pure mathematics) that include their _own markup semantics_ [emphasis mine] for basic mathematics needed within the domain of interest, rather than borrowing from MathML as needed.

[...]

The mathematics breakout discussion included a diversity of MathML experts and current and would-be users and consumers of MathML. This diversity of backgrounds and perspectives made for an energetic and wide-ranging discussion. In discussing the potential benefits that MathML might bring to bear on educational services and models of learning, there were multiple points of consensus as well as several open issues and uncertainties identified.

[...]

That said there remain several open issues as well regarding the potential of MathML to help meet educational needs for a better way to express mathematics in online documents and learning resources. The utility of MathML to enhance searching and improve accessibility of online mathematical content has not yet been proven. Searching of mathematically laden content by the mathematics it contains is a complex issue. It's not altogether clear whether the level of description implicit in content (semantic) and/or presentational MathML is sufficient to support robust searching on the mathematics contained in a resource. It's also not yet certain that readers and other accessibility tools will be able to exploit MathML effectively to make the mathematics embedded in a resource more accessible, though that seems a safer bet.
While MathML is being adopted (at least experimentally [this was certainly the case at the Center for CANONICAL |SCIENCE)]) behind the scenes -- e.g., as an exchange format for interoperation between applications like Mathematica and Maple and in the editorial workflow of scholarly journals [I studied in detail the case of the giant Elsevier: they use an in-house modification of MathML instead of the W3C standard, because they also ran into problems, and moreover they complement it with their own mathematical CEP markup, e.g. <ce:sub> and <ce:sup>, for simple formulae], it has not been widely adopted by the authors of educational and scholarly mathematical content. Research mathematicians continue to rely heavily on TeX, which though exclusively presentation oriented (really a specialized language for the typesetting of mathematics) is firmly entrenched. Educators continue to rely on cruder technologies (e.g., embedding mathematics as static images within HTML or presentation only markup within PDF documents) or exploit proprietary solutions such as Mathematica workbooks. There remains a bit of a "chicken and egg" problem in that authors are hesitant to adopt a new technology until it has proven its value, and it remains difficult to prove the value of MathML without a sufficient body of MathML content.

Discussion of this issue led naturally into an extended discussion as to how MathML is now or might in the future engage the mathematics community. It is clear that MathML at this point in time is more appealing to organizations or institutions than it is to individual practitioners. As a non-proprietary, expressive, comparatively low-loss way to represent mathematics, MathML has clear attractions for long-term archiving and interchange of mathematics on a large scale. Hence its attractiveness to publishers and middleware tool developers.
Several participants in the breakout session suggested that MathML may continue to develop as a largely or even exclusively back-end technology, used behind the scenes as a way to store and exchange mathematical content, but not necessarily as a format with direct impact on the author's or the end-user consumer's experience interacting with mathematical content [this clearly indicates that WE should provide a cheap but powerful mathematical language for the web, with end users and authors in mind]. That would still make MathML useful, but the consensus was that MathML's greatest potential both economically and in terms of new functionality will not be realized until it is used more widely by content creators and ultimate consumers [the problem is that the specification is weak and the available tools cannot generate first-class MathML code]. This will require even more aggressive development of necessary authoring and presentation tools (including interactive presentation tools) and the inclusion of MathML within markup schemes developed by other science and technology communities that require the ability to express rich mathematics in documents and learning resources.

This, in the collective opinion of those participating in the Mathematics breakout discussion, suggested avenues of common interests with other markup language communities represented at the workshop and led to the identification of several key issues of importance to the further development and future evolution of MathML:

- the need for more ubiquitous, more transparent (to the user) support for MathML in the Web environment;
- the need for better support within XML and Web-based applications for "compound documents" (i.e., as defined by the W3C, documents that combine multiple formats, such as XHTML, SVG, SMIL and XForms);
- better assurance that MathML will be maintained as a standard going forward;
- more sophisticated tools, especially on the authoring side, that can facilitate inclusion/embedding of MathML within online resources (e.g., within Web pages);
- continued development of better, more robust transformation tools (e.g., between TeX and MathML); and
- viable business models to better support and encourage ongoing development of MathML.

***********************************************************

A simple and cheap structural markup based on ISO 12083 (which is an international standard; MathML is not), that can be styled with CSS (or XSL-FO) and that needs no special fonts, plugins, native support, special tools, etcetera, would be easily implemented and accepted by authors.

Juan R.

Center for CANONICAL |SCIENCE)

Received on Friday, 9 June 2006 02:57:56 UTC