- From: <juanrgonzaleza@canonicalscience.com>
- Date: Mon, 5 Jun 2006 06:36:04 -0700 (PDT)

Ian Hickson wrote:
>
> On Fri, 2 Jun 2006, White Lynx wrote:
>>
>> To summarize discussion on mathematics in HTML5, I would like to ask
>> several questions. 1) Which markup do you think fits better in the
>> scope of HTML5?
>> a)
>> <div>
>> (X)HTML document may contain math formulae, like
>> <formula>
>> ax<sup>2</sup> + bx + c = 0
>> </formula>
>> </div>
>
> This markup is completely inadequate to represent mathematics. For
> example, it doesn't say whether "ax" is one variable or two.

Markup of that kind is standard in academic publishing, and it has never been considered completely "inadequate" in the way you claim. In the Elsevier SGML DTD for mathematical articles, the above equation would be written in a way very close to George's proposal. Consider the following example, taken from the Elsevier technical documentation:

G(φ) = 2πr exp(i ψ)

<f><rm><ssf>G</ssf></rm>(φ)=2πr<hsp sp="0.2"> <rm>exp</rm>(<rm>i</rm>ψ)</f>

The default mode for formulae (<f>) is italic, and <rm> introduces roman tokens. <hsp> introduces extra space and <ssf> introduces sans-serif fonts. Subscripts and superscripts are both introduced in a way similar to HTML. For instance, a^i b^j is encoded as a<sup>i</sup>b<sup>j</sup>.

There are many possibilities: you can define a token mode (as in TeX, Elsevier Math, or XL-MAIDEN), introduce spaces (as in Mathematica), or reuse <var>, as was already noted here:

<var>a</var><var>x</var>

> In HTML5 we have other options, too. For example, we could define a
> special parsing mode

An interesting approach, but I believe it unnecessarily complicates the design of the spec and the implementation in browsers, because it would oblige them to parse data in three different, incompatible ways: HTML Math, HTML/XML, MathML.

>
> <math>
> <mrow>a ⁢ <msup>x 2</msup></mrow> +
> <mrow>b ⁢ x</mrow> + c = 0
> </math>

Why do you introduce <mrow> instead of reusing <span>? It cannot be confused with ordinary HTML, because it is a child of a <math> element, i.e. it is in math mode.

You probably do not know this, but in April 2006 Robert Miner (from the MathML IG) asked on the w3c mailing list what should be changed in a future MathML to make it CSS friendly. Many changes to the current 2.0 specification were proposed. I do not understand why we would now reuse the same MathML specification that is causing so many headaches to both developers and authors. We can learn from errors and try to do better; I am especially interested in browser compatibility.

Officially, both Mozilla Foundation and Opera Software are also interested in backward compatibility with CSS, DOM, and HTML, as explained in their position paper (linked at the bottom of http://www.whatwg.org/). Therefore, I do not understand why the manifesto emphasizes CSS, HTML, and DOM compatibility whereas you propose w3c code that violates all three.

For instance, you are arguing for the reuse of the <msup> element. Let me summarize the main difficulties with the MathML script model (extensively debated this year on the MathML mailing list):

1) The MathML model is not directly extensible because of basis interference.

2) The MathML model introduces a different content model for each different script structure.

3) The MathML model, while more complex (more content models and more tags) than the old script model of the ISO 12083 standard, encodes fewer structures because tags cannot be combined.

4) The MathML model is not CSS friendly (some people assure me it is not XSL-FO friendly either) and is not DOM friendly.

5) The MathML model is not backward compatible with the widespread encodings that people know very well and use, such as HTML, ISO 12083, Mathematica, Maple, TeX, LaTeX, and others.

Point 5) is also related to the difficulty of writing good TeX --> MathML translators. a^b can easily be translated to a<sup>b</sup>, but not so easily to <msup>a b</msup>. In fact, many available parsers still give wrong results on this point, 10 years after the birth of MathML!

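To make point 5) concrete, here is a minimal Python sketch of the two translations. It is only an illustration under stated assumptions, not any existing translator: the names tex_to_html_sup, tex_to_msup and wrap are hypothetical, and only a single level of ^ with one-letter, numeric, or unnested braced operands is handled. The postfix (HTML-like) target needs only a local substitution, whereas the <msup> target must first locate the base of each script.

```python
import re

def _sup(match):
    # Exponent is either the braced group (group 1) or a single character (group 2).
    exponent = match.group(1) if match.group(1) is not None else match.group(2)
    return f"<sup>{exponent}</sup>"

def tex_to_html_sup(tex):
    """Postfix model: rewrite ^x or ^{...} in place; the base is never touched."""
    return re.sub(r"\^(?:\{([^{}]*)\}|(\w))", _sup, tex)

def wrap(token):
    """Wrap one token in a MathML token element (simplified, assumed rules)."""
    if token.startswith("{"):
        token = token[1:-1]          # strip the TeX braces
    if token.isdigit():
        return f"<mn>{token}</mn>"
    if token.isalpha():
        return f"<mi>{token}</mi>"
    return f"<mo>{token}</mo>"

def tex_to_msup(tex):
    """Prefix model: <msup> must enclose the base, so the base has to be found."""
    tokens = re.findall(r"\{[^{}]*\}|\d+|[A-Za-z]|\S", tex)
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "^" and out and i + 1 < len(tokens):
            base = out.pop()         # look back: the previous token is the base
            out.append(f"<msup>{base}{wrap(tokens[i + 1])}</msup>")
            i += 2
        else:
            out.append(wrap(tokens[i]))
            i += 1
    return "".join(out)

print(tex_to_html_sup("ax^2 + bx + c = 0"))
print(tex_to_msup("ax^2"))
```

Note that the second function has to decide that the base in ax^2 is x and not ax before it can emit anything, a decision the first function never needs to make.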
> ...with the DOM being the full MathML representation (namespaces, DOM,
> and everything),

Also compatible with MathML's weaknesses, or is there room for improvement?

> compatibility with an existing language,

This would read "compatibility with an ugly language that is incompatible with CSS+XML+HTML+SGML+ISO12083 and is being largely rejected by both authors and developers even after 10 years of promises".

The first mathematical language developed by w3c was HTML-Math, in the draft of HTML 3. It was so full of errors and incongruities that it was completely rejected by the community. Should we copy w3c HTML-Math? Surely not. The next attempt was MathML 1.0, also with a lot of errors (fortunately corrected in the next version, 2.0). The current MathML 2.0 still contains several flaws (especially in the presentational code). What should we do: try to develop a new language that is more concise, powerful, and browser compatible, or copy an unfortunate design?

Luca Padovani, in his 2003 PhD thesis on mathematical formatting, studied the rendering of a simple (2 x 2) matrix equation and wrote

<blockquote>
By the MathML stretchying rules of operators, which were briefly summarized on page 23 [...] depending on the vertical extent of the sub-expressions x_ij , y_i, and z_i the parentheses may be stretched to different sizes, and the nice-looking outcome of rendering equation 1.1 is just a fortunate fact. A quick analysis of the MathML markup reveals that there is no way to preserve the structure of the formula and still have a "correct" rendering at the same time.
</blockquote>

> its renderers,

That is, we inherit all the difficulties of rendering math in both on-line and off-line systems, including the failures to implement MathML code in FO renderers. Using *current* CSS rendering we can display a lot of math in almost all current browsers, without special fonts or plugins (this could be improved with better CSS support or with future math-specific CSS enhancements).
If we choose MathML, we can render _some_ math in Firefox and friends, and in MSIE when using a third-party plugin (which is far from perfect). Interesting perspective!

> and
> its content, unambiguous interpretation,

Curiously, in recent months we have discussed many examples of ambiguous MathML code (extracted from real sites) on the MathML mailing list. For example, what do you mean by this

<mi>d</mi><mi>x</mi>

or this

<mo>d</mo><mi>x</mi>?

> Currently this thread seems mostly to be
> along the lines of "we should add maths, but we shouldn't make it
> hard".

I think the main idea is "we should add maths in a way compatible with the rest of the satisfactory technologies available (i.e. without unneeded breaks), and we should not make it as unnecessarily hard as MathML does".

White Lynx wrote:
>
> Thus price that browser developers have to pay for fractions is very
> close to zero, so why not to make some mathematicians happy and
> include fractions in HTML5? The same applies to nearly each and every
> mathematical expression, so it is funny to have opportunity and not to
> use it just because seven years ago someone at W3C decided to
> "reinvent wheel, make it square and put the horse behind the cart".

Good point! The w3c has been harshly criticized for several of the specifications it has developed. MathML is in the top five. Robert Miner (w3c MathML IG) was obliged to recognize that

<blockquote>
However, as I have observed again and again during the decade I've devoted myself to the issues of electronic mathematical communication, the principle challenges are not technical, but political. MathML is not the way it is exclusively because of language design considerations -- it is the way it is because it was the politically feasible compromise between the many conflicting interests of the consortium members that had a stake is standardizing a markup for math notation.
</blockquote>

Juan R.

Center for CANONICAL |SCIENCE)

Received on Monday, 5 June 2006 06:36:04 UTC