From: <juanrgonzaleza@canonicalscience.com>

Date: Sat, 15 Apr 2006 08:21:06 -0700 (PDT)

Message-ID: <3310.217.124.69.238.1145114466.squirrel@webmail.canonicalscience.com>

To: <www-math@w3.org>


Romeo Anghelache wrote:
>
> juanrgonzaleza@canonicalscience.com wrote:
>> Romeo Anghelache,
>>
>> It is rather surprising that one can claim that HERMES is generating
>> semantic content, when articles generated from HERMES looks like
>>
>> --------------------- REAL CODE
>> <…>
>> <p>
>> </p>
>> <h3>2001-07-09</h3>
>> <p>
>> </p>
>> <p class="abstract">
>> <p>
>> <span class="fn"> </span><span class="fb">Abstract </span><span
>> class="fn">We review the present status of black hole thermodynamics. Our
> ....
>
>> some unresolved open issues. </span>
>> </p>
>> </p>
>> <p>
>> <…>
>>
>> -----------------------------------------------------------
>>
>> Is the use of empty paragraphs for simulating layouts, headings of level 3
>> for encoding dates, and others points you mean by "semantic"?
>>
>
> you didn't read the user manual, a single page, at http://hermes.roua.org/

Let me introduce the heading of the single page you are citing

<quote>
Hermes - a semantic XML+MathML+Unicode e-publishing/self-archiving tool for LaTeX authored scientific articles
</quote>

with "Last update on Friday, March 31, 2006 by Romeo Anghelache".

I ask again: is the above code what you mean by "semantic"?

> The document generated by Hermes is a raw XML file (the reference
> document, or the library document). What you are talking about here is
> the result of a stylesheet transformation, a stylesheet that I wrote
> just to put the things on screen.
> The only semantics on the screen I'm concerned about is the looks of it,
> and the fact that you can copy/paste the math in your math application.
> The h3 you're complaining about may have been class="date", but it has
> its unintentional usefulness: it catches the know-it-all guys.

And do you consider it correct to encode authors or dates as headings of level 3? It is rather incompatible with the W3C focus of the last decade. The W3C has made a big effort to recommend the separation of content from presentation.
The use of presentational tags such as <i> instead of <em> has been discouraged for years. The encoding of authors or dates as headings of level 3 is poorer still!

You are claiming *now* that the incorrect code is not HERMES code, just a stylesheet that you (personally?) wrote. OK, let us suppose that I was completely wrong and failed to understand.

1) The incorrect encoding is still there, and if it is not an error of HERMES (just of the "personal stylesheet"), the final code being served to end users (including people with disabilities) is still wrong.

2) One of the reasons that I notify authors when I am criticizing/reviewing their work is that, if I am wrong, they can correct me. You did not reply then, and now you are replying harshly here.

3) Was I wrong? Let me cite another page, titled *Hermes at work* [http://hermes.aei.mpg.de/], and quote a bit of it:

<quote>
This page lists results (or links to results) of Hermes assisted conversion of scientific articles/books from the (La)TeX world to the XML world, it is an online storage facility kindly offered by Max Planck Institute for Gravitational Physics.
</quote>

Next I click on the first living review (the Rovelli paper) [http://hermes.aei.mpg.de/1998/1/article.xhtml] and, since I am curious, I view the source code once my browser opens the document. From the <head> section I extract:

  <meta name="generator" content="Hermes, version 0.9.4 2005-11-19, license GNU GPL, description http://hermes.roua.org/"/>

Therefore, I can perfectly well say on Canonical Science Today [http://canonicalscience.blogspot.com/2006/02/choosing-notationsyntax-for-canonmath.html] that the document, where layout is done with <p></p>, authors or dates are encoded as <h3>, and the structure of the abstract is debatable, ***was generated by HERMES***.

In fact, I even wrote which version of the HERMES software generated the documents I was reviewing:

<quote>
[See The Thermodynamics of Black Holes by Robert M.
Wald in Living Reviews in Relativity generated with Hermes, version 0.9.4 2005-11-19]
</quote>

Therefore, if there was some error, it was not on my part. Either that obtuse code was generated by Hermes or the metadata of the document I cited is wrong.

> just to put the things on screen.

>> Do you name “semantic” the next encoding generated by HERMES
>>
>> <h3><a href="http://surubi.fis.uncor.edu/reula">Oscar A. Reula</a></h3>?
>>
>> Uff! Author encoded as heading of the document!
>>
>
> no I don't. my guess is you've just heard a voice.

I do not think so. I simply read in *Hermes at work* [http://hermes.aei.mpg.de/]

  This page lists results (or links to results) of <link>Hermes</link> assisted conversion [...]

Then I followed the link to [http://hermes.roua.org/] and I can read at the top of the page

<quote>
Hermes - a semantic XML+MathML+Unicode e-publishing/self-archiving tool for LaTeX authored scientific articles
</quote>

I sincerely think that anyone would reach the same conclusion I reached: that Hermes, a self-proclaimed semantic e-publishing/self-archiving tool, is encoding authors or dates as headings of level 3 and using <p></p> for layout. Both Hermes pages I cited are updated by one Romeo Anghelache (i.e. you).

>> Moreover, the mathematical code presents in the articles generated by
>> Hermes are not verifying accessibility, structure is far from good, and
>> several equations are rendered via “tricks”.
>>
>> For example, in “Hyperbolic methods for Einstein’s Equations”
>>
>> [http://hermes.aei.mpg.de/1998/3/article.xhtml]
>>
>> one reads (before equation 2):
>>
>> \epsilon _{abcd} is the Levi-Civita tensor corresponding to the physical
>> metric
>>
>> The underlying math is not encoded via tensors but
>>
>> <math xmlns="http://www.w3.org/1998/Math/MathML">
>> <msub>
>> <mrow>
>> <mi>ε</mi>
>> </mrow>
>> <mrow>
>> <mi>a</mi>
>> <mi>b</mi>
>> <mi>c</mi>
>> <mi>d</mi>
>> </mrow>
>> </msub>
>> </math>
>>
>
> again, you didn't read the user manual.
> at least have the courtesy to read and understand the minimal info
> before bugging this list with off-topic comments.
>
> I'll spell it to you again,
> quote from http://hermes.roua.org/ :
>
> Of MathML, only MathML-presentation is generated if Hermes is used to
> translate legacy LaTeX files (here, by legacy LaTeX files I mean sources
> which were not edited with semantic vocabularies in mind) without manual
> intervention on the source.
>
> unquote

Well, you demand courtesy, but you are assuming (twice) that I didn't read the manual, claiming my comments are off-topic, and adding further personal attacks. I am ignoring any personal attack from you. This thread is about pages containing MathML, and I am showing with real examples that the MathML code being served in those pages is wrong (therefore it is on-topic).

1) I never said that HERMES was generating content MathML.

2) The MathML code being generated and served to the Internet is still wrong, is not accessible, and the structure of the math is incorrect.

3) Presentation MathML 2.0 has a specific tag for encoding tensors. You are just visually simulating tensors via an msub tag. That is no better than using old HTML for visually simulating tensors, or than using <center><b> for simulating headings...

4) Any possible advantage of using MathML (accessibility, structural markup, etc.) is broken in practice by real-world examples such as those I am citing here.

5) The code generated simulates tensors instead of encoding the tensor. Any practical advantage of using standards vanishes when everyone encodes math however he wants|prefers|can.

For example, the advantage of using the standard <mfrac> for fractions is *lost* if one person uses <mfrac> for a/b, another uses <mfrac> with redundant <mrow>s, another simulates fractions using mtable, and another uses a mixture of XHTML and MathML.
One example of the last could be

  <span class="num">
    <math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"> <mi>a</mi></math></span><span class="den">
    <math xmlns="http://www.w3.org/1998/Math/MathML" display="inline"> <mi>b</mi></math>
  </span>

The above points may be your vision of mathematical markup. But, and maybe you are missing this important point, nobody here is saying that you are a bad programmer. We are simply saying that the MathML code being served on the Internet is ugly and that, in practice, the theoretical advantages of using standard mathematical XML markup (structure, copy and paste, accessibility, standardisation...) are lost.

Please do not worry if we are rejecting the HERMES project and MathML. They do not fit our needs!

>
>> <span class="fi"> </span><span class="fn">is the Levi-Civita tensor
>> corresponding to the physical metric, </span>
>>
>> Sorry, but I cannot call that "good code", because the Tensor is being
>> rendered via a ***visual*** forcing of subscripts instead via multiscript
>> tag of MathML 2.0
>
> very well, implement a tool which does it (google for Levi-Civita, find
> out it's a tensor, and the first or n-th symbol, or group of symbols,
> should be interpreted as a tensor).
> but you already proved MathML sucks all-together, why bother?
>
> I didn't ask you to call it "good code", really.

Aha! Then the objective is approximate rendering of formulae plus Google... Any markup (including "tricky", incorrect, etc.) is permitted, is that it? I wonder whether that was the goal of the MathML WG, but I think it was not, because then they would not have introduced so many tags.

>>
>> And what about the redundancy of MathML ˝ in equation 2? and what about
>> the "terrorific" code of equation 3?
>>
>> Do you name “semantic” content to encoding of “integral on s” like
>>
>> <mo>∫</mo><mi>d</mi><mi>s</mi>?
>>
>> (equation 10 of [http://hermes.aei.mpg.de/1998/1/article.xhtml])
>>
>
> Ok. The only wrong thing here is <mi>d</mi>. Got it?
> No?
> Uff. It's mathml presentation, and d is an operator so it should be
> surrounded by <mo>d</mo>.
> This can be fixed, thanks for the unintentional pointing to a Hermes bug
> that I knew already.

Sorry to say this, but you are very wrong on those topics. The integral would be rendered in presentational MathML as something very close to

  <mrow>
    <mrow><mo>∫</mo><mrow><mo>ⅆ</mo><mi>s</mi></mrow></mrow><mrow>(Integrand here)</mrow>
  </mrow>

Both the MathML code generated by HERMES

  <mo>∫</mo><mi>d</mi><mi>s</mi>

and your recent proposal

  <mo>∫</mo><mo>d</mo><mi>s</mi>

are simply verbose copies of old HTML (or similar)

  <span>∫d<i>s</i></span>

The accessibility, audio rendering, and structural encoding of the kind of output you are encouraging are wrong. The MathML WG explicitly said why one would use <mo> tags for that:

<quote>
automatic semantic interpretation of MathML presentation elements is made easier by the explicit specification of such operators.
</quote>

In practice, that XHTML article is misusing MathML rather than using it.

>
>> Do you consider correct the l_Planck of equation (24)? Do you know for
>> what was <mtext> designed?
>
> the l_Plank? I don't see any l_Plank there. check your spelling.
> and yes, I know "for what was <mtext> designed" I even use it, but did
> you? Where? (please don't answer)

I do not need to check my spelling, thanks. I wrote Planck, so it is not surprising that you did not find any "Plank". I did not do so in my previous message, but now I will write out the MathML fragment generating the last part of equation (24) in the HERMES output [http://hermes.aei.mpg.de/1998/1/article.xhtml]

  <msubsup>
    <mrow> <mi>l</mi> </mrow>
    <mrow> <mi>P</mi> <mi>l</mi> <mi>a</mi> <mi>n</mi> <mi>c</mi> <mi>k</mi> </mrow>
    <mrow> <mn>2</mn> </mrow>
  </msubsup>

which can be compared with

  <msubsup>
    <mi>l</mi>
    <mtext> Planck </mtext>
    <mn>2</mn>
  </msubsup>

The encoding, the structure, and both the visual and the aural (accessibility) rendering of the HERMES generated fragment are wrong.
Moreover, there are redundant <mrow>s in the HERMES generated code that [i] add more verbosity to already verbose code, [ii] complicate the DOM and enlarge memory requirements, and [iii] could generate unexpected errors and bugs in the interchange of data between applications (I will write about that in the future).

>>
>> And what about the equation (25) of
>>
>> [http://hermes.aei.mpg.de/2005/2/article.xhtml]?
>>
>> The Gamma *there* is a tensor, but is encoded as subscript ab and
>> superscript j with several redundant mrows.
>>
>
> Again: Mathml-presentation doesn't know about tensors. So you're again
> confusing things badly.
> The redundant <mrows> proved themselves necessary when the automatic
> conversion of legacy LaTeX files is an issue. You'll discover that when
> you'll convert articles yourself with the canonical science.

Of course redundant mrows are not needed! But I am aware that they are there because the software you designed is doing bad things. It is a trivial task to introduce an optimization layer after the conversion step that searches for mrows with a single child and restructures the code. It could even be done in a few lines of XSLT. Maybe in a next version of HERMES ;-)

About your statement "Mathml-presentation doesn't know about tensors", I will just quote the MathML 2.0 specification itself. Section 3.4.7 is titled

  "Prescripts and Tensor Indices (mmultiscripts)"

The HERMES generated code

  <msubsup>
    <mrow> <mo>Γ</mo> </mrow>
    <mrow> <mi>a</mi> <mi>b</mi> </mrow>
    <mrow> <mi>j</mi> </mrow>
  </msubsup>

is a tricky (visual) rendering, instead of using the MathML tensor encoding

  <mmultiscripts>
    <mo>Γ</mo>
    <mi>a</mi> <mi>j</mi>
    <mi>b</mi> <none/>
  </mmultiscripts>

The structure, semantics, and accessibility are wrong in the HERMES generated fragment. This is the reason the MathML folks introduced a special markup for tensors, since it is trivial to notice that tensors and prescripts could otherwise only be _simulated_ via combinations of <msup>, <msub>, <msubsup>, and <mrow>.
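As an illustration only of the post-conversion optimization layer mentioned above, here is a minimal sketch in Python (not XSLT, and not part of HERMES or of any tool cited in this thread). The function names `local_name` and `collapse_single_child_mrows` are hypothetical, and the sketch assumes that an attribute-free <mrow> wrapping exactly one element can be replaced by that element without changing the rendering.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def collapse_single_child_mrows(elem):
    """Replace each attribute-free <mrow> that wraps exactly one element
    (and no text) by that element. The root element itself is left alone."""
    for index, child in enumerate(list(elem)):
        # Clean the subtree first, so nested single-child mrows unwind too.
        collapse_single_child_mrows(child)
        if (
            local_name(child.tag) == "mrow"
            and len(child) == 1
            and not child.attrib
            and not (child.text or "").strip()
            and not (child[0].tail or "").strip()
        ):
            only = child[0]
            only.tail = child.tail  # keep the whitespace that followed the mrow
            elem[index] = only
    return elem


# The HERMES fragment quoted above, written without whitespace.
HERMES_FRAGMENT = (
    "<msubsup><mrow><mo>Γ</mo></mrow>"
    "<mrow><mi>a</mi><mi>b</mi></mrow>"
    "<mrow><mi>j</mi></mrow></msubsup>"
)

collapsed = ET.tostring(
    collapse_single_child_mrows(ET.fromstring(HERMES_FRAGMENT)),
    encoding="unicode",
)
assert collapsed == (
    "<msubsup><mo>Γ</mo><mrow><mi>a</mi><mi>b</mi></mrow><mi>j</mi></msubsup>"
)
```

Only the two single-child wrappers are removed; the <mrow> grouping the subscripts a and b has two children and is kept. The result is still a <msubsup> simulation, not the <mmultiscripts> encoding, so this addresses the verbosity point only.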
In any case, this thread is not about how excellent or bad a programmer you are, or what the limitations of LaTeX as an input syntax are, or how people are typing source code. This thread is not about any of that! As its name suggests, this thread is about "Pages with MathML", and its objective is to illustrate (via real code such as that generated by HERMES) how MathML is being used in practice and how all its theoretical advantages over previous markups are being lost.

>> Is that you call good semantic content?
>>
>
> Yes.

Uff!

>
> MathML-presentation is a layer of semantics, albeit minimal, but solves
> a lot of issues with publishing math on the web, some of them being:
> - converting the whole Living Reviews into XML+GIFs for mathematics,
> takes about 24 hours; converting it into XML+MathML-Presentation takes
> 10 minutes;

and converting the whole of Living Reviews into sculpted stone may take more than 24 hours, but converting it to other formats may take fewer minutes still. And using another approach (such as XML-MAIDEN) the conversion takes 0 minutes.

> - the resulting size is in favor to XML + MathML, especially when you
> have a lot of math;

Compared with GIFs or stones? Sure! But the size of the HERMES MathML

  <msubsup>
    <mrow> <mo>Γ</mo> </mrow>
    <mrow> <mi>a</mi> <mi>b</mi> </mrow>
    <mrow> <mi>j</mi> </mrow>
  </msubsup>

is larger than that of the correct MathML specification encoding

  <mmultiscripts>
    <mo>Γ</mo>
    <mi>a</mi> <mi>j</mi>
    <mi>b</mi> <none/>
  </mmultiscripts>

and the latter is in turn larger than what other available mathematical approaches need: ISO 12083, XML-MAIDEN, or others.

If you simply want a visual simulation of tensors using lightweight input, and you are not worried about incorrect structure, aural rendering, or semantics, then you can use ASCIIMath.
Enter the ASCII input syntax

  Gamma_(ab)^j

or the TeX/LaTeX one

  \Gamma_{ab}^j

and the output will be

  <msubsup>
    <mo>Γ</mo>
    <mrow> <mi>a</mi> <mi>b</mi> </mrow>
    <mi>j</mi>
  </msubsup>

without the redundant <mrow>s of the HERMES output ;-)

One of the goals of CanonML is encoding top-research math, and that cannot be done in MathML (hence I abandoned the previous CanonMath input syntax) due to its unusual verbosity (size).

> - copy/pasting in your math application is a huge step forward from the
> GIF based math, or current math rendered in PDF.

Also available in other approaches! Moreover, the correct MathML output (using <mmultiscripts>) is more copy/paste friendly than the HERMES generated one. How can I select Gamma and j, or just the indices a and j? Yes, the grouping is not correct in the HERMES generated output, because that is not a real tensor encoded via the MathML tensor tags. The HERMES output is just a simulation with a superscript j and a grouped subscript ab.

> These advantages make it worth the trouble of converting them even if
> there are temorary bugs left, or temporary incoveniences (which should
> be pointed out, thanks for all it's worth).
>

Temporary?

>> And what about the metric equation just after the section 2.1? This is one
>> of my favourites: accesibility, structure, "semantics", encoding, and
>> rendering are all wrong.
>>
>
> your comments are all wrong, more or less.
>
>> One find a line element ds^2. If my math is correct ds^2 = (ds)^2 but the
>> code appear in the journal article generated via HERMES is
>>
>> <mi>d</mi>
>> <msup>
>> <mrow>
>> <mi>s</mi>
>> </mrow>
>> <mrow>
>> <mn>2</mn>
>> </mrow>
>> </msup>
>>
>> That is, d{s}^2 (or 2s ds), which is VERY different from (ds)^2 is
>> supposed to be encoded via your "semantic" approach.
>>
>
> this is the TeX source of that expression: ds^{2}.

And the code is still complete nonsense.
And so you confirm to me that *in theory* MathML was oversold as semantic, content oriented, structural, first-quality in both printing and rendering, searchable, and accessible to people with disabilities, but that in *practice* the HERMES output

  <mi>d</mi>
  <msup>
    <mrow> <mi>s</mi> </mrow>
    <mrow> <mn>2</mn> </mrow>
  </msup>

being served on the Internet is just an ultra-verbose version of the more correct and older HTML

  <span>ds</span><sup>2</sup>

or of the ISO 12083 mathematical markup

  <subform>ds</subform><sup>2</sup>

> Let www-math know when you'll implement a tool which will be writing a
> different MathML-presentation from that source. But wait, that will be
> wrong, I can tell you that already.
>
>
>> and all that even ignoring that one would type the differential using the
>> MathML entity instead of identifier "d".
>>
>
> missing the point again. I'm tired of repeating myself:
> there's no reliable way to infer what "d" means (identifier or operator)
> unless the author marks it up accordingly.

And since I never said there was such a way, you are inventing that. I simply reported the MathML code that is being generated and served in the real world under an aura of being cool... Any promise of structural correctness, accessibility, and so on is lost in the long run.

In many ways, MathML is doing worse than those old alternatives. Take the case of the New York Journal of Mathematics (March, 2006)

[http://math.albany.edu/math/demos/nyj/]

They explain the reasons why XHTML+MathML is (they believe) preferable for online use. The third point reads:

“XHTML+MathML is a recommendation of the World Wide Web Consortium that complies with the standard Guidelines for Accessibility.”

I wonder what "accessibility" is provided by code such as that from HERMES

  <mi>d</mi>
  <msup>
    <mrow> <mi>s</mi> </mrow>
    <mrow> <mn>2</mn> </mrow>
  </msup>

for encoding (so to speak!) "the square of the differential of s".

Even the accessibility of the old HTML+GIF+ALT model is better than the above real MathML code published in 2005!
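To make the grouping problem concrete, here is a small sketch in Python that reports what the exponent is actually attached to in a presentation MathML fragment. The helper `superscript_bases` is hypothetical, the fragments are assumed to be written without a namespace prefix, and the grouped variant is only my illustration of one possible (ds)^2 encoding, not output from any tool mentioned here.

```python
import xml.etree.ElementTree as ET


def superscript_bases(fragment):
    """Return, for every <msup> in the fragment, the text of its base
    (its first child) with all whitespace removed."""
    # Wrap in a dummy root so fragments with several top-level
    # elements still parse as one document.
    root = ET.fromstring(f"<math>{fragment}</math>")
    bases = []
    for msup in root.iter("msup"):
        base_text = "".join(msup[0].itertext())
        bases.append("".join(base_text.split()))
    return bases


# The fragment quoted above: the "d" sits outside the <msup>.
HERMES_OUTPUT = (
    "<mi>d</mi>"
    "<msup><mrow><mi>s</mi></mrow><mrow><mn>2</mn></mrow></msup>"
)

# Illustrative alternative (an assumption): the differential and s are
# grouped, so the exponent applies to the whole of "ds".
GROUPED = "<msup><mrow><mo>ⅆ</mo><mi>s</mi></mrow><mn>2</mn></msup>"

assert superscript_bases(HERMES_OUTPUT) == ["s"]
assert superscript_bases(GROUPED) == ["ⅆs"]
```

In the first fragment the only thing being squared is s, so any tool walking the tree (a screen reader, a copy/paste target) sees d followed by s squared rather than the square of ds.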
Without doubt, George's XML-MAIDEN is far better than what I am seeing in the real world. I hope that CanonML can be better still, since I am using his knowledge as a base for further development!

[snip]

Juan R.

Center for CANONICAL |SCIENCE)

Received on Saturday, 15 April 2006 15:21:21 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50: Saturday, 20 February 2010 06:12:58 GMT