- From: Richard Kaye <R.W.Kaye@bham.ac.uk>
- Date: Thu, 30 Mar 2006 21:13:52 +0100
- To: www-math@w3.org

On Thursday 30 March 2006 15:14, Bruce Miller wrote:
> Richard Kaye wrote:
> > On Thursday 30 March 2006 12:35, Paul Libbrecht wrote:
> >>W Naylor wrote:
> >>>I thought to try out the ORCCA tex -> MathML translator on your
> >>>example: I input the document:
> >>>\documentclass[11pt]{article}
> >>>\begin{document} $$3*a+b$$ \end{document}
> >>>and get out the MathML:
> >>><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"
> >>>overflow="scroll">
> >>><mn>3</mn><mo>*</mo><mi>a</mi><mo>+</mo><mi>b</mi></math>
> >>>now this is machine generated, (though I suspect that many authors would
> >>>be lazy and wouldn't put an mrow around the 3*a, if they were creating
> >>>this by hand)
> >>
> >>Well, that's an example where solution-1
> >>(presentation-tree-based-selection) is doing the same as text
> >>selection... it clearly is wrong but if the author is aware of it, he
> >
> > Actually, it is only clearly wrong with standard semantics where
> > + = addition and * = multiplication on some standard field such
> > as the real numbers, and using standard conventions on precedence
> > (and perhaps in a context where you are using standard classical
> > logic to discuss real numbers).
>
> I'd argue it's wrong in any case, or at least of dubious meaning;
> What does a construct like "a op1 b op2 c" mean? It's just that
> the "right" form is not a priori clear, without knowing the author's
> intended notations.

Well, if you want to discuss the meanings of these rows, we need
input with semantic content, as in OpenMath, and not p-MathML.
p-MathML says only that it "means" a row of symbols to be rendered in
whatever way the renderer sees fit. I like p-MathML precisely because
it *doesn't* make any attempt to attach any other "meaning".

In other words, when my research leads me to new mathematics no one
else has thought of, I know I can always render it in p-MathML. I
could also express it semantically in OpenMath if I am willing to
write CDs. I couldn't express it in c-MathML, because that is only
concerned with the meanings in some kinds of maths that people worked
out years and years ago.

> And if you're opening the can'o'worms
> of non-standard notations, why assume that * and + are infix
> operators at all? Maybe "3*" is a prefix operator acting on "a" ?

There are no such assumptions. This is a string of symbols with a
default suggestion to the renderer that * and + are "infix" (but it
is not really clear what "infix" means other than by examples of
standard practice, and it is up to the renderer how this should
affect its rendering). There are also some default suggestions as to
the spacing on either side of each symbol.

> By default TeX assumes they are operators, but, like MathML,
> there's no precedence associated with them.

Actually, TeX doesn't assume anything like this either. It has a
classification of symbols (e.g. \mathbin, \mathop, \mathrel and
\mathord, I think) that tells it, via a slightly complicated
algorithm, how much space to put between two symbols.

I looked at this very hard a long time ago when I realised that this
algorithm didn't work for the sort of maths I do. Fairly reasonably,
I wanted some operators like \wedge (meaning "and") to have more
space round them than relations like <, which in turn should have
more space than arithmetic operators like +. I also wanted < to have
less space when used in contexts like "forall x<y". There are simply
not enough levels in TeX. In the end I gave up, and now I adjust
things by hand when it doesn't look right. (BTW, it would be
interesting to hear other people's ideas on how to achieve this in
p-MathML... I have my own ideas but they're not very nice.)

> Unless the author's markup has explicit structure,
> whatever agent is translating to MathML will need to parse.
> Ideally that agent would allow for non-standard notations,
> but the standard makes a good default.

Agreed. And if you are using non-standard notations you'll probably
be writing a lot of lspace="..." rspace="..." and form="..."
attributes too. (And doing a lot of hoping and praying that the
renderer does the right thing...)

> To get back to Paul's original question;

Oh yes. Sorry for the rant :)

> Have you thought of
> taking a hybrid approach? Ie. expand the selection based
> on presentation-tree considerations, and then _if_ there are
> parallel markup linkages, adjust the selection as needed.
> That would seem to do as much fixup as you can, given whatever
> markup you're given.
>
> Depending on what the selection is _for_, however, a pure
> single content subtree might not be what's desired.
> It might be reasonable to select multiple subtrees provided
> they are adjacent siblings. Assuming the above example were
> properly nested (using standard precedence :> ), "*a"
> (two subtrees) might be a useful selection that would fit
> the criterion. OTOH, you wouldn't be able to select "*a+",
> which is a good thing.

With standard notations and meanings and * = multiply, "*a" might
mean the postfix operator of multiplication by a. "*a+" might mean
the infix operator that multiplies the lefthand argument by a and
then adds the result to the righthand argument. Who knows? It's just
possible someone might actually want this.

I would simply allow the selection of anything that could possibly go
in an <mrow>...</mrow> as defined in the DTD or schema (and return
the code *with* the implied <mrow>...</mrow> for safety). Then you'll
need two options for pasting it: either paste the whole mrow as a
single item into an object, or paste the content of the mrow as a
list of several objects into an object. Both are needed. I can't
think of anything simpler.

Best wishes to all.

Richard
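
The select-and-paste proposal above could be sketched roughly as follows. This is only an illustration using Python's standard xml.etree.ElementTree; the function names (select_run, paste_as_item, paste_as_list, texts) are hypothetical and belong to no MathML tool, and the sketch assumes a selection is a run of adjacent siblings in the flat markup quoted earlier in the thread.

```python
import copy
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
MROW = f"{{{MATHML_NS}}}mrow"

# The machine-generated markup quoted in the thread.
SOURCE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" '
    'overflow="scroll">'
    "<mn>3</mn><mo>*</mo><mi>a</mi><mo>+</mo><mi>b</mi></math>"
)


def select_run(parent, start, stop):
    """Copy the adjacent children parent[start:stop] into a fresh mrow."""
    mrow = ET.Element(MROW)
    for child in list(parent)[start:stop]:
        mrow.append(copy.deepcopy(child))
    return mrow


def paste_as_item(target, index, selection):
    """Option 1: paste the whole mrow as a single child of target."""
    target.insert(index, copy.deepcopy(selection))


def paste_as_list(target, index, selection):
    """Option 2: paste the mrow's children as separate children of target."""
    for offset, child in enumerate(selection):
        target.insert(index + offset, copy.deepcopy(child))


def texts(element):
    """Text of each direct child, with a nested list for each mrow."""
    return [texts(c) if c.tag == MROW else c.text for c in element]


math = ET.fromstring(SOURCE)
selection = select_run(math, 1, 4)  # the run "*a+"

as_item = ET.Element(MROW)  # stand-in for the object being pasted into
paste_as_item(as_item, 0, selection)

as_list = ET.Element(MROW)
paste_as_list(as_list, 0, selection)

print(texts(selection))  # ['*', 'a', '+']
print(texts(as_item))    # [['*', 'a', '+']]
print(texts(as_list))    # ['*', 'a', '+']
```

Returning the selection already wrapped in an mrow means the clipboard always holds one well-formed element, and the choice between the two paste options is deferred to the point of pasting.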

Received on Thursday, 30 March 2006 20:16:48 UTC