- From: <juanrgonzaleza@canonicalscience.com>
- Date: Sat, 27 May 2006 10:10:42 -0700 (PDT)
Henri Sivonen <hsivonen at iki.fi> wrote:

> I am pretty convinced that the granularity of markup needed for math
> and the verbosity of XML necessarily lead to an XML syntax for math
> that is not suitable for direct human authoring.

I doubt that.

> However, I think it
> does *not* necessarily follow that an XML syntax for math is an
> inherently bad idea.

I did not say the contrary. What I said is that the specific MathML
format is not good enough, owing to political issues. The weak design is
recognized even by MathML's own authors.

> Math even more than schemas or vector graphics needs to have an XML
> syntax, because math needs to integrate in prose on a more profound
> level than e.g. replaced elements would allow.

I do not agree; the current MathML is not really integrated with XHTML.
An HTML syntax is sufficient for the structural part. With standard HTML
<sub> and <sup> and a bit of assistance from standard CSS 2.1, one can
render complicated script structures in a standard HTML browser without
introducing the MathML soup of <msub>, <mstyle>, <msup>, <msubsup>,
<mmultiscripts>, <none>, <mi>, <mo>, and <mn>, the extra MathML
attributes, the extra DOM, the extra styles, a different WS parser, or a
special namespace in your document. The same reasoning applies when one
decides to reuse the old but effective HTML <table> element instead of
adding new redundant ones: <mtable>, <mtr>, and <mtd>.

The true debate is about the content part, which content MathML does not
solve.

>> 1) Insanely complicated and inefficient. In some cases, I have computed 15
>> times more bandwidth and server storage when using MathML than
>> alternatives.
>
> gzip

1) First, I would note that I was not talking about the verbosity of XML
end tags or the like, but about the inefficient markup model specific to
MathML. Have you tried to encode E=mc2 in full parallel MathML? And what
about fine-grained parallel markup? Fine-grained parallel markup is so
complex that even the Math WG itself provided an alternative encoding.
They did not recommend gzip to users ;-).

2) It makes no sense to offer people gzip archives of online documents
for downloading and reading off-line!

3) Even with compression, one must unzip the files before accessing the
data. I cannot manipulate a 15 MB file on my computer when the *same*
information could be stored in 1-3 MB.

The W3C has made a big effort to provide us with lightweight, rational
alternatives to the old insane approaches. A typical example is the use
of a single external CSS document for all your HTML documents instead of
repetitively encoding the font style in each paragraph of each document.
MathML simply breaks this tendency, providing one of the most
ultraverbose and redundant encodings I have seen in my life.

Another example. I can write

<p>This is an <i>important</i> text</p>

in presentational HTML. The rationale is simple but effective: text is
roman by default, and when text is italic you mark it up with <i>. The
same code in a MathML fashion would be typed like

<mp><mr>This</mr> <mr>is</mr> <mr>an</mr> <mi>important</mi> <mr>text</mr></mp>

That is, you redundantly tell the computer that each token is roman,
with "important" being rendered as italic. The same information, but
bloated.

However, since presentational markup is not desirable, the above <i> is
better encoded as <em> in HTML. Then <em> is already rendered in italics
by default, but I can change the rendering via

i) CSS in the head of the document,
ii) an external CSS file used by several documents at once,
iii) a special CSS rule added via the style attribute.

4) Even with gzip, other approaches are less expensive in both disk
space and bandwidth.

> How is MathML not compatible with the DOM?

Introducing a specific DOM model makes implementation in browsers nearly
impossible. MathML cannot be integrated with the rest of the browser
technologies such as the DOM, CSS, and the WS model. All this causes
problems and headaches for browser developers, and it is the reason for
the real failure to see browsers with native support of MathML.
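
To make the size argument in points 1) to 4) concrete, here is a minimal sketch of how one could measure it. It assumes Python and two hand-written encodings of E=mc2; the strings, the names HTML_FORM and MATHML_FORM, and the helper sizes() are illustrative only and are not the output of any MathML tool. It covers plain presentation MathML, not full parallel markup, so it says nothing about the factor of 15 quoted above.

```python
import zlib

# Hypothetical, hand-written encodings of E = mc^2 (illustrative only).
HTML_FORM = "<p><var>E</var> = <var>m</var><var>c</var><sup>2</sup></p>"

MATHML_FORM = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mrow><mi>E</mi><mo>=</mo>"
    "<mrow><mi>m</mi><mo>&#x2062;</mo><msup><mi>c</mi><mn>2</mn></msup></mrow>"
    "</mrow></math>"
)


def sizes(markup: str) -> tuple:
    """Return (raw size, zlib-compressed size) in bytes for a markup string."""
    raw = markup.encode("utf-8")
    return len(raw), len(zlib.compress(raw, 9))


if __name__ == "__main__":
    for name, markup in (("HTML", HTML_FORM), ("MathML", MATHML_FORM)):
        raw_size, packed_size = sizes(markup)
        print(f"{name}: {raw_size} bytes raw, {packed_size} bytes compressed")
```

On strings this short the compressor has very little to work with, so the compressed figures are only indicative; a fair comparison would run the same function over whole documents.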
There are many technical details on the Opera browser developers' site
about why they rejected *native* MathML support. The FO developers also
failed to unify MathML with XSL-FO. Is this the way?

The manifesto for HTML 5 emphasized

<blockquote>
Web application technologies should be based on technologies authors are
familiar with, including HTML, CSS, DOM, and JavaScript.
</blockquote>

and

<blockquote>
Basic Web application features should be implementable using behaviors,
scripting, and style sheets in IE6 today so that authors have a clear
migration path. Any solution that cannot be used with the current
high-market-share user agent without the need for binary plug-ins is
highly unlikely to be successful.
</blockquote>

MathML does not fit this philosophy and therefore may be abandoned.

>> Well MathML is not really based in those. But we can render math using
>> just HTML, and CSS, and we can use JS and DOM in the same way we use in
>> HTML or CSS for text. Look XML-MAIDEN
>> [http://www.geocities.com/csssite/index.xml] for ideas, samples, etc.
>
> Interesting. However, the results have the look and feel of a
> afterthought math editor for a word processor rather than the look and
> feel of pdfLaTeX output.

The look and feel are better than with MathML. The markup is lightweight
and can be easily accessed and modified via standard DOM and CSS rules.
Moreover, rendering is incremental, and thanks to recent advances by
George many browsers can display almost all of the mathematics, whereas
MathML support is a kind of binary logic: either you can see the math
(Firefox, MSIE + plug-in) or you cannot. Moreover, the MAIDEN markup can
be transformed to TeX for printing via TeX engines while better
CSS-based print engines are not ready.
Do not forget that those articles are generated with a couple of simple
CSS 2.1 rules *without* font metrics information (TeX cannot do that; it
relies on a specific collection of fonts that are not designed for the
web and on a very complex formatting engine). Fine-tuning on the web can
be achieved by complementing the generic XML-MAIDEN stylesheet with more
rules for special cases, or with fine-tuning CSS rules inserted directly
in the document. It is not difficult to achieve TeX output quality when
using font metrics.

Improved rendering engines and more experience with CSS in this field
would provide better rendering quality. However, given the difficulties
of implementing MathML in browsers, it is unlikely that you will ever be
able to fine-tune formulae there. And, of course, it is close to
impossible to provide a TeX engine for the web (one of the reasons TeX
has failed to conquer the web).

>> One of problems
>> with this approach is that once new STIX fonts available I can use them in
>> HTML, also in CSS rendering of math, but I cannot use them in firefox,
>> since MathML module would be rewritten, and the full engine recompiled,
>> obligating to users to download and install new versions of browser for
>> new fonts!!!
>
> The PUA mapping is indeed a problem. If you want to see a change here,
> I suggest creating an OpenType font that uses the Type 1
> outlines from the YandY version of Computer Modern and has proper
> Unicode mappings.

I prefer to follow the usual web design guidelines, under which
rendering engines and technologies are independent of the fonts
installed on the client side. But this does not prevent the
implementation of specific print rendering engines that deal with a
collection of predefined fonts in specific domains. For example, a
library could implement a printing engine optimized with font metrics
for the STIX fonts. CM, no thanks!

>> 4) The default printing of MathML is not good and people is returning to
>> TeX for that!
>
> In general, Knuth was over 20 years ahead of everyone else. CSS-based
> typesetters are still catching up with TeX on some things. (And the
> bar is pretty high.)

No. It is relatively trivial to provide TeX quality in different markups
when one knows the font metrics. In fact, one can see several authors
providing TeX quality with SVG and even with HTML approaches when the
font metrics are known. However, nobody has provided TeX quality using
just MathML. Am I the only one who notes the paradox that neither SVG
nor HTML was designed for mathematical rendering? This low quality has
obliged people to translate MathML to TeX and then print the formulae
with a traditional TeX engine.

The really difficult problem is to provide good typesetting quality
without relying on specific fonts; Knuth has not solved this yet ;-)

> But yes, if you want to print math, pdfLaTeX is the best thing
> around. Changing the syntax of MathML does not help in catching up.
> Improving the rendering engine does.

No, the problem of MathML lies in its syntax and content model. Both are
incompatible with the rendering engines of browsers and, as previously
said, publishers using XSL-FO for books and documents were also unable
to incorporate MathML into the rendering and print engine. HTML Math
could be incorporated in a few days because it is an incremental
implementation.

>> 5) Accessibility is very deficient
>
> A different syntax won't help. Implementations of accessibility tools
> will.

False. The alternative syntaxes for mathematics already proposed are
more accessible than MathML for several technical reasons. Moreover, it
has been shown that current implementations of MathML in browsers are
not detailed enough for accessibility tools to work. This is the reason
that MathPlayer audio rendering does not deal correctly with tabular
data and generates some ambiguous readings. Ambiguous renderings are
absent when one uses the old GIF + ALT model.
Ambiguous renderings could be eliminated by providing a new approach in
the future HTML. By encouraging the use of p-MathML in HTML 5, one is
generating inaccessible code.

>> Moreover, the situation is still poor than that! Many sites claiming
>> theoretical accessibilities (e.g. Distler blog) are serving (ds)^2 as
>> <mi>d</mi><msup><mi>s</mi><mn>2</mn></msup>, i.e. 2s ds!!!
>
> I'm pretty sure Distler doesn't claim his math to be accessible, and
> I'm pretty sure he is quite aware of the paradox that AsTeR does not
> support MathML even though its author was on the WG.
>
> http://golem.ph.utexas.edu/~distler/blog/archives/000199.html

Distler makes a lot of different claims. The accessibility statement
[http://golem.ph.utexas.edu/~distler/blog/accessibility.html] says
"Equations are written in MathML 2.0." It does not explain that the
accessibility of his self-proclaimed ultra-advanced blog is poorer than
if he had used the old HTML + GIF + ALT model (or PDF, or even LaTeX).
For instance, Distler does not explain to his readers that a GIF image
with an ALT attribute is more accessible than the verbose MathML code he
is generating and serving on the Internet.

Perfect p-MathML 2.0 code is not really accessible, but worse still, the
MathML code being served by Distler is structurally invalid and based on
tricks. For example, he uses <mrow> and tricky collections of <msubsup>
to simulate tensors. He does not use the invisible operators introduced
in MathML. He encodes prescripts as in TeX, via empty groups, instead of
using the <mprescripts/> tag. This is odd!

The question is why one would encourage the use of MathML in HTML 5 when
it does worse than the old approaches! We can offer accessible math in
the next HTML 5.

There is no such paradox! MathML does not work. Accessibility of MathML
is just a myth.

>> 6) There are problems with default rendering of entities
>
> XML entities on the Web are b0rked.
> Since MathML is not human-writable anyway, let's get rid of the
> entities.

Curiously, the problems with MathML entities are solved in other
approaches. I already solved that some time ago...

>> Accessible code render ugly in screen whereas
>> visually correct code being inaccessible.
>
> "Accessible" code is just theory, right?

No.

>> 7) The possibility for automated searches of math continues being largely
>> a myth.
>
> Many, many things related to searchability, internationalization and
> accessibility are myths in the realm of semantic markup.

Then this is an argument against using MathML in HTML 5, period.

However, you are systematically failing to understand the main points
here. Take the case of searches. If anyone encodes E=mc2 in HTML, I can
search for it with Google and it works very well. But I cannot search
for the formula when it is encoded in MathML (presentation, content, or
parallel). There are "infinite" ways to encode the same formula because
the specification is weak and full of technical holes.

I ran an experiment with a simple \dot{q} in the most popular MathML
tools. Each MathML tool encoded the same TeX command in a different way.
Moreover, the code generated by two of the MathML tools failed to render
in Mathematica 5.2 when directly copied and pasted. I can copy \dot{q}
from any TeX doc and paste it into any other TeX doc, and the TeX engine
will work. I can copy E=mc2 from any HTML source and paste it into an
HTML editor, and it will work.

With approaches inspired by XML-MAIDEN or ISO 12083, the encoding would
be far more uniform and, therefore, ready for automated search by some
engine.

>> 8) Visual rendering is not incremental as in CSS. This can offer us
>> problems with large documents or even with server failures. I find just
>> curious the w3c emphasis on abandoning non incremental rendering of old
>> HTML presentational table layout models in favour of CSS layouts, whereas
>> forcing usage of a non incremental MathML presentational markup.
>> Some mathematical documents take order of 10 minutes before rendering in
>> Firefox.
>
> This is not a design problem with MathML. This is Mozilla bug #18333.

Is _that_ bug related to the 10 minutes, to the fact that MathML
rendering is not incremental compared with a CSS solution, or what?

Moreover, you should know that the problem of bad MathML support in
Firefox (and other Gecko-based browsers) is not due to low-quality
programmers but to the explicit incompatibility of MathML with the
existing HTML and CSS layout models. However, I can promise you that the
same mathematics, in an alternative math approach, could be implemented
in current browsers in less than a couple of days. Again, MathML does
not fit with the Position Paper I cited previously.

>> 10) Advantages of being using a "standard" vanish when one observes the
>> infinite malleability of mathml code. For example people is simulating
>> tensors with nested msup, msub, msubsup, and tricky mrows, instead using
>> <multiscript> and <none/>. Then hypothetical standardization advantages
>> are lost.
>
> Yeah, MathML is presentational in practice.

No, because both the visual rendering *and* the structure of those
formulae are incorrect. (Presentational MathML was also designed for
structuring mathematics, as you know.) The problem is in the MathML
design again. For instance, ISO 12083, using fewer tags, was able to
encode more script structures than MathML 2.0 can. The interesting thing
is that you would only need to add one or two new tags to HTML for a
full rendering of tensors. Since everyone would use the same tags, the
encodings would be standardized. Moreover, you could use <sup> for both
text and math.

>> 12) The use of presentational markup is contrary to common sense.
>
> Is LaTeX contrary to common sense? Which looks better a LaTeX
> printout or a Mathematica printout?

You are missing the point. My full message was

>> 12) The use of presentational markup is contrary to common sense.
>> I write <H1> in HTML and next I said one -and only once- in a CSS how the
>> heading may be rendered in my doc. That CSS, when stored externally, can
>> be called by billions of others HTML docs. In MathML you are forced to
>> repeat presentation in each formula in each document, to use mstyle...
>>
>> The use of a presentational language for mathematics remember me the old
>> days of the <font>, <b>, <i>, <center> tags. Little impact of MathML in
>> the web remind me the failure of XSL-FO to conquer the web. Instead
>> specific presentation MathML markup complemented with lot of <mstyle>
>> tags I would prefer semantic or structural markup.

Precisely, LaTeX replaced TeX's presentational markup with default
stylesheets and macros that put the emphasis on content. You apparently
missed the point that HTML began as semantic markup, was later turned
into presentational markup by the big developers (the nightmare of
<font>), and has recently been turned back into semantic markup, with
presentation best done by CSS and with <font> and family eliminated. A
similar approach was taken in ISO 12083, with structural markup in SGML
and styling via DSSSL. MathML, however, adds presentational tags and
styling markup directly *to* the document, which is contrary to common
sense.

Precisely, the future LaTeX 3 is also being improved mainly in its
stylistic part, with an emphasis on copying the SGML and HTML models.
For example, the LaTeX 3 interface will support DSSSL specifications and
style-sheet concepts such as those used with HTML and XML.

I see no reason to repeat here the errors made in the past. MathML is
not the way.

>> A) Eliminate next text from specification
>>
>> "Authors are encouraged to use MathML for marking up mathematics"
>>
>> because authors would use more concise powerful and solid markup for
>> mathematics.
>
> -1 at least until an alternative is implemented and deployed in UAs.

MathML is not an alternative at all, as 10 years have verified.
Any other alternative approach could be implemented in browsers in a few
days, because one can reuse working HTML, CSS, and DOM. Moreover, the
advantages of the kind of approach I am suggesting apply beyond
mathematics. For example, better support for the standard CSS
inline-block display is crucial for mathematical rendering. But better
support for that rule also benefits the rendering of other kinds of
documents. Better support for <mroot> in MathML benefits only
mathematical documents prepared using p-MathML.

>> C) A more complete approach is providing a set of structural and/or
>> semantic tags for usage with HTML5.
>
> Scope creep.

?????

>> One needs little tags, because <sub>, <sup>, <var> and <table> can be
>> reused.
>
> I don't believe that considering that vast feature set LaTeX needs to
> provide.

Unfortunately, the LaTeX design is erratic, which obliges it to
introduce billions of redundant commands, each one with a different
syntax, content model, etc. A typical example is the LaTeX redundancy
between \frac and \over for fractions. None of that is needed in
XML-MAIDEN or in ISO 12083, the international standard for mathematics
in SGML. It is clear that a combination of tags, plus powerful CSS
rules, plus Unicode is all one needs.

For instance, the amstex package provides a special command with two
attributes for placing indices in certain roots. This non-modular
approach is unnecessary in SGML/XML/HTML. A stylesheet (CSS, DSSSL,
XSL-FO) generates the default rendering, but you can fine-tune the
position of the index via standard CSS rules applied to a special
element (via CSS you can modify colors, stretching, kerning, heights,
baselines, margins, paddings, positions, etc., and many more things that
LaTeX cannot). In amstex you need a special command for the vertical
alignment of the index in the root; in CSS you could use the standard
vertical-align property that you use for rendering any other text.

But again, apples and oranges.
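
For reference, the \frac / \over redundancy mentioned above looks like this in a LaTeX source. This is a minimal illustration only; the fraction a/b is an arbitrary example, not taken from any document discussed here.

```latex
% LaTeX macro form: numerator and denominator are two arguments
\[ \frac{a}{b} \]

% plain TeX primitive, still accepted by LaTeX: infix, scoped by the braces
\[ {a \over b} \]
```

Both displays typeset the same fraction, with two different syntaxes for one construct.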
I am explaining that one can provide better support for online
mathematics than presentation MathML does, just by adding a few tags to
the future HTML. Your appeal to LaTeX seems a bit off topic, and in any
case it is not a reason to dismiss my proposal of avoiding MathML as the
mathematical markup.

> --
> Henri Sivonen
> hsivonen at iki.fi
> http://hsivonen.iki.fi/
>

Juan R.

Center for CANONICAL |SCIENCE)
Received on Saturday, 27 May 2006 10:10:42 UTC