- From: Peter Murray-Rust <pm286@cam.ac.uk>
- Date: Thu, 15 Mar 2012 18:37:38 +0000
- To: Roger Martin <mathmldashx@yahoo.com>
- Cc: "www-math@w3.org" <www-math@w3.org>
- Message-ID: <CAD2k14NhDLHTueUNuP2kbm7YwEvQfyew6KR7ZADV4OJw6=qTug@mail.gmail.com>
On Thu, Mar 15, 2012 at 5:59 PM, Roger Martin <mathmldashx@yahoo.com> wrote:

> Hi Peter,
>
> What Daniel suggests is very much what I do while applying xslt transforms
> as just-in-time coding of math from mathml documents.
> The xslt would recognize any
> <m:apply>
> <m:eq />
>
> <m:ci>k</m:ci>
> <m:cn>1.0</m:cn>
> </m:apply>
> as a field with getters and setters in the generated class. Arrays
> treated the same way. Also the final results could be gotten back the same
> way where a mathml document needs to return more than one argument to the
> engine's scope.
>
> I've explored many different avenues and purposes of just-in-time
> coding. Went so far as exploring specialized xslt engine combined with
> direct byte-code manipulation (http://asm.ow2.org/) or producing output
> in OpenCL (http://www.khronos.org/opencl/) format running them via JavaCl(
> http://code.google.com/p/javacl/) or the nvidia tool kit.
>
> This approach you are looking at can actually simplify use of gpu's etc.
> because it is solving the immediate runtime needs rather than try to be
> generalized, solving everything for everybody like conventional precompiled
> code attempts to do.
>

I've probably gone round the same roundabouts. I've used XSLT 1.0 (I don't use 2.0 because of portability) and written one-xsl-per-element, including some fairly hairy things for drawing molecules using SVG. I've come to the conclusion that XSLT doesn't scale well for large systems, especially where other libraries are involved.

So I use a DOM, specifically XOM from xom.nu, which is much simpler and better than W3C DOM. It allows subclassing, and in CML every element has its own subclass (about 110). I'm doing the same with MathML - so far it's going very well. Each element has a class such as CIElement or APPLYElement which is populated by recursive descent when the MathML is parsed. (Since I don't know MathML well I can't validate on the fly.)
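The one-class-per-element design can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the repository: every class and method name here (MathNode, CnNode, CiNode, OpNode, ApplyNode, MathRoot, MathSketch) is hypothetical, the JDK's built-in W3C DOM stands in for XOM so that the example has no dependencies, only unprefixed plus, times and eq are handled, and each node carries a small eval() of the kind discussed in the rest of this message.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical base class: one subclass per MathML element.
abstract class MathNode {
    final List<MathNode> children = new ArrayList<>();
    // Evaluate against a scope of variable bindings; null means "cannot evaluate".
    abstract Double eval(Map<String, Double> scope);
}

class CnNode extends MathNode {              // <cn>: a number
    final double value;
    CnNode(String text) { value = Double.parseDouble(text.trim()); }
    Double eval(Map<String, Double> scope) { return value; }
}

class CiNode extends MathNode {              // <ci>: a variable, looked up in the scope
    final String name;
    CiNode(String text) { name = text.trim(); }
    Double eval(Map<String, Double> scope) { return scope.get(name); }
}

class OpNode extends MathNode {              // operator elements such as <eq/>, <plus/>
    final String op;
    OpNode(String op) { this.op = op; }
    Double eval(Map<String, Double> scope) { return null; }
}

class ApplyNode extends MathNode {           // <apply>: operator followed by operands
    Double eval(Map<String, Double> scope) {
        if (children.isEmpty() || !(children.get(0) instanceof OpNode)) return null;
        String op = ((OpNode) children.get(0)).op;
        if (op.equals("eq")) {
            // Assumed convention: <apply><eq/><ci>x</ci><cn>1.2</cn></apply> binds x.
            if (children.size() != 3 || !(children.get(1) instanceof CiNode)) return null;
            Double v = children.get(2).eval(scope);
            if (v != null) scope.put(((CiNode) children.get(1)).name, v);
            return v;
        }
        boolean plus = op.equals("plus");
        if (!plus && !op.equals("times")) return null;
        double acc = plus ? 0.0 : 1.0;
        for (MathNode c : children.subList(1, children.size())) {
            Double v = c.eval(scope);
            if (v == null) return null;      // an unbound variable blocks evaluation
            acc = plus ? acc + v : acc * v;
        }
        return acc;
    }
}

class MathRoot extends MathNode {            // <math>: children share one scope, in order
    Double eval(Map<String, Double> scope) {
        Double last = null;
        for (MathNode c : children) last = c.eval(scope);
        return last;
    }
}

final class MathSketch {
    // Recursive descent: choose a class from the tag name, then build the children.
    static MathNode build(Element e) {
        String tag = e.getTagName();
        if (tag.equals("cn")) return new CnNode(e.getTextContent());
        if (tag.equals("ci")) return new CiNode(e.getTextContent());
        MathNode node = tag.equals("math") ? new MathRoot()
                      : tag.equals("apply") ? new ApplyNode()
                      : new OpNode(tag);
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) node.children.add(build((Element) c));
        }
        return node;
    }

    static MathNode parse(String xml) {
        try {
            return build(DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement());
        } catch (Exception ex) {
            throw new IllegalArgumentException("not well-formed XML", ex);
        }
    }
}
```

With this sketch, parsing a <math> element that holds <apply><eq/><ci>x</ci><cn>1.2</cn></apply> followed by <apply><plus/><ci>x</ci><cn>2</cn></apply> and calling eval(new HashMap<>()) on the result binds x and returns approximately 3.2, while an expression with an unbound <ci> returns null. A real XOM version would more likely hook the subclasses in through a node factory at parse time rather than walking a finished DOM.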
I also don't need (at present) to build the MathML programmatically, as we are using given functional forms. (In CML much of the work is programmatic building of chemistry.) Each class has a function eval() which evaluates the MathML where possible. For example, numbers can be added and multiplied. Variables (ci) can be set programmatically, and this means that expressions can often be evaluated to a single double (I support integers and doubles). The classes can also have other generic functionality, and I am experimenting with differentiate().

The main challenge, as we have discussed, is the scope and exactly how we assign variables. I am probably skipping over important semantics, but I use:

<apply><eq/><ci>x</ci><cn>1.2</cn></apply>

to populate a variable in the scope of the containing <math> element. An x in a subsequent expression is then replaced by 1.2. This may be naive mathematically but it works for me :-)

> I'd enjoy more discussion of applying content mathml is this area.
>

Excellent - as long as the list members don't mind I am happy to continue. My code is at
http://www.bitbucket.org/petermr/mathml
and is deployed in
http://www.bitbucket.org/petermr/semantic-forcefield

It's only 3 days old. It runs under Java/Maven and I'd be delighted if anyone wants to play. Ultimately I can see that undergraduate chemistry textbooks and many research papers could be written in this way. It's a very good discipline for understanding the semantics of the domain.

> Roger (We've talked in the past about ANTLR parsers for quixote-qcdb)
>

Indeed! We use ANTLR for postprocessing Nat Lang Processing output (ANTLR cannot scale to the complete problem, which requires heuristics).

-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
Received on Thursday, 15 March 2012 18:38:07 UTC