Date: Thu, 18 Jul 1996 15:50:31 -0700
From: Thomas Breuel <tmb@best.com>
Message-Id: <199607182250.PAA08088@shellx.best.com>
To: www-html@w3.org
Subject: semantic markup for math

|I think what you mean is that the little context
|definition i made up on short notice doesn't cover the semantics you
|need for that formula.

That's the point: using semantic markup for math requires the
semantics of formulas to be defined for each and every field of
mathematics that wants to publish on the web. You seem to think
that that is easy. I think it's exceptionally difficult.

The argument that MINSE is extensible doesn't help: if "semantic
markup" for math has any utility at all, it requires that people
agree on the semantic markup, not make it up on the fly.

|That's akin to complaining that we force poets to write their poems
|themselves. Authors will author; they will not have to "typeset"
|unless they want to have precise typesetting control. Author says
|"integral", you get an integral; author does not have to say "draw
|a stretchy integral symbol from this symbol font, centered on this
|line, vertical height matching this box, if in graphics mode; or
|say "integral with respect to" etc., if in speech mode; or..."

Authors and poets usually don't do semantic markup at all. They
often use pen and paper, which is all about layout and has no
structural information whatsoever.

Furthermore, authors of mathematics are often very particular about
the rendering of their formulas: they don't want just some
representation of a formula, they want the particular representation,
bold-facing, spacing, and subscripting that is common in their
subfield, at their university, or used by their teachers.

By adding semantic markup, you are adding a completely new set of
requirements and burdens to authors.

|> there is a strong risk that browser vendors
|> won't implement it and that users won't or even can't use it.
|
|Browser vendors haven't taken a stance yet. And i honestly
|can't imagine an HTML author who is incapable of writing
|
|    <se> 'sin(2*A) = 2*'sin(A)*'cos(A) </se>
|
|when she wants to express a trig identity.

By "can't" I'm referring to the fact that the semantic primitives
for the user's mathematical specialty are missing, not that the
user is too stupid to figure out how to.

|> Unlike MINSE, I actually can typeset even
|> the formula on the cover of my freshman year math textbook with it.
|
|Like i said, i don't think this is true. What's the formula?

    \int_{\partial D^+} \boldomega = \int_{D^+} d\boldomega

|Subject: Look: real math!
|Go visit http://www.lfw.org/math/ams-example.html

That's an interesting example, for several reasons.

First of all, it isn't really semantic markup, but a strange mix
between semantic markup and layout. For example, you couldn't
automatically tell from the notation which variables are scalars
and which ones are vectors, you couldn't automatically translate
the integrals into some other common notations, and you couldn't
translate the formulas into FORTRAN notation either (something
that Macsyma's and Mathematica's structural notations give you).

With your other web pages, I was assuming that you simply hadn't
finished fleshing out the details. If you present this as an
example of "semantic markup for real math", I can only say that
your semantic markup isn't very semantic after all but a kind of
variant of LaTeX notation that uses different quote characters and
function call notation instead of infix notation in some places.

Second, like all the examples in MINSE that I was able to find,
it comes from a particular (though relatively common) branch of
applied and engineering mathematics, not what I would call
"real math".

|> -- large amounts of existing, on-line math is not in a
|>    structural representation and cannot be converted
|>    automatically
|
|This is only the case if it doesn't contain sufficient information
|for deployment on the Web in the first place. If the original
|source contains the information necessary to present it on the Web,
|then it follows that it is unambiguous enough to convert.

I just don't see how you can say that. LaTeX formulas certainly
contain enough information for "deployment on the web". They
can be rendered on different output devices, scaled, and linearized
and read out. Given that eqn could be rendered approximately as
ASCII, I suspect that LaTeX style formulas can be as well. If you
don't like the simple linear rendering the markup itself gives you,
you can add a textual alternative. They are also general purpose
enough to typeset most of mathematics, with a fixed set of primitives.

The HTML 3.0 math specs seem like a reasonable, pragmatic approach
to typesetting math on the web. I can pull most of my Yellow books
off the shelf and render and publish the formulas in them using it.
They are intuitive for anybody who has used LaTeX (or eqn) before.

I don't see significant additional utility that comes from more
semantic markup for web publishing, and I don't think you have
given a compelling argument yet for it (or a convincing specification).
On the other hand, defining and deploying semantic markup would be
a huge undertaking, and I fear going down that path would put
standardization of any kind of mathematical markup for web documents
on indefinite hold.

Thomas.

PS: As a mathematics typesetting system, MINSE is quite a nice
piece of work. But please let's keep that separate from the issue
of what kind of markup for math is best for mathematical publishing
on the web.