Re: Basic requirements for mathematical-scientific language

I've been following this thread - after all, how could I not when it has filled my inbox for the past few weeks! Now that we have a modified and streamlined topic, I'll put my thoughts in...

In my opinion, every programming/scripting/descriptive language has its advantages and its limitations. XML is great because it provides a platform-independent easy way to write and process documents, and those documents are human-readable, requiring no other software to write them. The disadvantage is always its verbosity - could you imagine trying to construct a 3D model in XML format? Even with extraneous whitespace removed and shortened element names you'd have constructs like:

<v><x>100</x><y>10</y><z>80</z></v>

where 'v' is a vertex, and 'x','y','z' are the coordinates. But a quick calculation tells us that this construct, encoded with 8-bit characters, takes 35 bytes. The equivalent proprietary format will probably use 1 byte as a vertex header and 2 bytes for each coordinate, giving only 7 bytes overall - exactly 1/5 of the size. For large models, of 5MB or so, clearly this is a great saving. Of course, a number like 10 can be represented with only 4 raw bits if you have the need to do so - but the XML characters "10" occupy 2 bytes...
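
The arithmetic above can be checked with a short sketch. Python is used purely for illustration, and the binary layout here (one header byte followed by three unsigned 16-bit coordinates, with a made-up header value of 0x01) is my assumption, not any real model format:

```python
import struct

# The XML vertex from the example above, encoded with 8-bit characters.
xml_vertex = "<v><x>100</x><y>10</y><z>80</z></v>".encode("ascii")

# Hypothetical binary layout: 1 header byte + three unsigned 16-bit coordinates.
# "<" selects little-endian with no padding; the header value 0x01 is made up.
VERTEX_HEADER = 0x01
binary_vertex = struct.pack("<BHHH", VERTEX_HEADER, 100, 10, 80)

xml_size = len(xml_vertex)
binary_size = len(binary_vertex)

# Prints both sizes and the ratio between them.
print(xml_size, binary_size, xml_size / binary_size)
```

A real format would also have to decide on byte order and on what happens to coordinates that don't fit in 16 bits, which is exactly the sort of application-specific decision that makes such formats compact but not portable.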

This seems a bit weird, but my point is that XML is not suited to every application - neither is MathML... MathML is designed to be a human-readable, semantically rich language which is extensible [okay, you can disagree with me here, but don't argue the point further]. If you have applications that require specific constructs, then you can optimise those constructs, but then that language is unlikely to be portable. It is extremely difficult (if not impossible) to develop a language which is extensible, semantically rich and compact.

So, in your example, you quote a 7GB piece of data; fine, why not condense that down into binary format and save file space? If you know exactly how your formula/construct is always going to be structured, and especially if it is only numerical, then why not optimise it completely rather than trying to use an all-purpose language like MathML?

Back to MathML, and I had the same verbosity problem writing MathML documents. Most of my work is now written in MathML, but I don't do the XML by hand - that would take forever and give rise to many mistakes. Instead, I designed my own software converter which takes a more condensed syntax and converts it to MathML Content, to which an XSLT sheet is applied to convert it to Presentation. For example:

Int[phi*Dot[grad[psi],Vector[n]],V]

which gives you the integral over the volume V of the scalar phi multiplied by the dot (scalar) product of the gradient of scalar function psi with the vector n. The equivalent MathML syntax is incredibly long, much longer than what I just typed. But there are limitations to my language as well: it's convenient for me to use, and it produces MathML which is great, BUT every function has to be programmed to convert it into correct MathML - so Dot gets turned into an <apply> with a <scalarproduct/> child etc. The limitation is that I have to program new functions when my converter doesn't already have them[1], but even so this syntax is a real time saver when writing documents, and in my opinion it beats TeX just because MathML has a semantic content side as well.
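
To give a flavour of what "every function has to be programmed" means, here is a minimal sketch of such a converter in Python. It is not my actual software: the grammar, the OPERATORS table, the function names, and the choices to render Vector[n] as <ci type="vector"> and the volume V as a <domainofapplication> are all assumptions made for this illustration.

```python
import re

# Hypothetical function table: condensed name -> MathML Content operator element.
OPERATORS = {"Dot": "scalarproduct", "grad": "grad"}

# A token is an identifier or one of the punctuation marks [ ] , *
TOKEN = re.compile(r"\s*([A-Za-z]\w*|[\[\],*])")

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_product(tokens):
    """product := factor ('*' factor)*"""
    factors = [parse_factor(tokens)]
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        factors.append(parse_factor(tokens))
    if len(factors) == 1:
        return factors[0]
    return "<apply><times/>" + "".join(factors) + "</apply>"

def parse_factor(tokens):
    """factor := NAME | NAME '[' product (',' product)* ']'"""
    if not tokens or not tokens[0][0].isalpha():
        raise ValueError("expected a name")
    name = tokens.pop(0)
    if not tokens or tokens[0] != "[":
        return f"<ci>{name}</ci>"  # a plain identifier
    tokens.pop(0)  # consume '['
    args = [parse_product(tokens)]
    while tokens and tokens[0] == ",":
        tokens.pop(0)
        args.append(parse_product(tokens))
    if not tokens or tokens.pop(0) != "]":
        raise ValueError(f"missing ']' after arguments of {name}")
    return build(name, args)

def build(name, args):
    """Turn one condensed function call into MathML Content markup."""
    if name == "Vector":
        # Assumed meaning: Vector[n] marks the identifier n as a vector.
        if len(args) != 1 or not args[0].startswith("<ci>"):
            raise ValueError("Vector expects a single identifier")
        return args[0].replace("<ci>", '<ci type="vector">', 1)
    if name == "Int":
        # Assumed meaning: Int[integrand, domain]; the qualifier goes first.
        integrand, domain = args
        return ("<apply><int/><domainofapplication>" + domain
                + "</domainofapplication>" + integrand + "</apply>")
    if name in OPERATORS:
        return f"<apply><{OPERATORS[name]}/>" + "".join(args) + "</apply>"
    raise ValueError(f"no MathML mapping programmed for {name!r}")

def to_mathml(text):
    tokens = tokenize(text)
    result = parse_product(tokens)
    if tokens:
        raise ValueError(f"unexpected token {tokens[0]!r}")
    return result

print(to_mathml("Int[phi*Dot[grad[psi],Vector[n]],V]"))
```

Running it on the one-line input shows how much longer the generated Content markup is, and the final ValueError in build() is the limitation I mentioned: any function missing from the table has to be added before it can be used.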

Anyway, as I've heard said on this list time-and-time again, if you don't find MathML does/can do what you need, create something else to satisfy you - and if you can make it interoperable with MathML then great, but if your aim is to produce a more condensed but less flexible language, then so be it.

Having an interest in all things computery as I do, I would also like to mention that you say "MathML is unnaturally verbose and redundant" - actually, have you thought about your operating system? If you're running Windows and you're trying to do a large calculation, it will run slower than if you used the microprocessor directly, because Windows takes many clock-cycles for its own background processes. If you really want to run a fast calculation, you'd want to program directly in assembly or even raw machine code and use the processor that way... But the code for that is far less elegant and far more "unnaturally verbose and redundant" than any higher-level language (C/C++/Java/.NET) and yet gives far superior results. The moral is: it isn't always the most compact syntax that is the best. Note also that any software using XML has to run an XML parser and probably hold a DOM in memory as well, using more resources in the process. Again, XML is not the best choice when performing large/long calculations.
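
As a rough illustration of the parser/DOM point, here is the vertex from earlier read back both ways in Python. The binary layout (header byte plus three unsigned 16-bit values) and the function names are assumptions made up for this sketch:

```python
import struct
from xml.dom import minidom

xml_vertex = "<v><x>100</x><y>10</y><z>80</z></v>"
# Hypothetical fixed layout: 1 header byte + three unsigned 16-bit coordinates.
binary_vertex = struct.pack("<BHHH", 0x01, 100, 10, 80)

def read_xml_vertex(text):
    # Builds a full DOM tree in memory, then looks up each coordinate element.
    dom = minidom.parseString(text)
    return tuple(
        int(dom.getElementsByTagName(axis)[0].firstChild.data)
        for axis in ("x", "y", "z")
    )

def read_binary_vertex(data):
    # A single fixed-layout unpack; the header byte is discarded here.
    _header, x, y, z = struct.unpack("<BHHH", data)
    return (x, y, z)

print(read_xml_vertex(xml_vertex), read_binary_vertex(binary_vertex))
```

Both calls recover the same three numbers, but the XML route has to tokenise the text, allocate a node for every element and convert the digit strings, where the binary route is one fixed-size read.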

Please don't be discouraged by what I've said - I'm strongly interested in progress, but I would like to see detailed specifications/drafts of your language, its syntax and its merits against MathML, and not just proposals. Once you can show me (and I would suspect the rest of the list) that your language is actually more useful/structured/compact/extensible than MathML, or whatever benefits it has, I'll be more inclined to look at it favourably and perhaps even contribute or adopt it myself!

Regards,
Charles.

[1] actually this turns out to be a minor issue because the function library is XML-based and can be edited and reloaded while writing a document.


--- juanrgonzaleza@canonicalscience.com wrote:

From: <juanrgonzaleza@canonicalscience.com>
To: <www-math@w3.org>
Subject: Basic requirements for mathematical-scientific language
Date: Tue, 18 Apr 2006 03:03:34 -0700 (PDT)

For avoiding confusion, i am resending previous message with a different
topic name. If moderator agree previous message could be erased from
archives.

[ omitted ]

Juan R.

Center for CANONICAL |SCIENCE)




Received on Wednesday, 19 April 2006 19:24:24 UTC