Basic requirements for mathematical-scientific language from juanrgonzaleza@canonicalscience.com on 2006-04-18 (www-math@w3.org from April 2006)

From: <juanrgonzaleza@canonicalscience.com>
Date: Tue, 18 Apr 2006 03:03:34 -0700 (PDT)
To: <www-math@w3.org>
Message-ID: <3379.217.124.88.179.1145354614.squirrel@webmail.canonicalscience.com>
To avoid confusion, I am resending my previous message under a different
subject. If the moderator agrees, the previous message could be removed
from the archives.

I have updated some basic requirements for a generic mathematical markup
language for scientific use at the following link.

[http://canonicalscience.blogspot.com/2006/04/scientific-language-canonml-is.html]

Some requirements fit into the XML model and could be considered for
debate for future MathML specifications. Other requirements do not
fit and will be developed at the Center for CANONICAL |SCIENCE) in
mathematical approaches alternative to those of the W3C.

Some requirements were presented in the past

[http://canonicalscience.blogspot.com/2006/02/choosing-notationsyntax-for-canonmath.html]

but that document will be updated.


=== Basic requirements =============


Data optimisation
-----------------

MathML is unnaturally verbose and redundant. Whereas in practice this is
not a serious problem for encoding simple formulae such as E=mc^2, it is
a problem for scientific databases and for computation or interchange of
information.

In shorthand notation, the Redfield equation reads

(partial rho) / (partial t) = (L + R) rho

where R is the Redfield tensor. But the equation stored for a small
physicochemical system of current interest needs of the order of 7 GB of
memory.

Taking a x10 verbosity factor, we would need 70 GB in MathML for the
same equation.

The Redfield equation is an ultrasimplified version of more general
equations.
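
For reference, the shorthand Redfield equation above can be written in
TeX as follows. This is only a transcription of the shorthand; reading
rho as the density matrix and L as the Liouvillian is an assumption, since
the text above defines only R.

```latex
\frac{\partial \rho}{\partial t} = (L + R)\,\rho
```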

For example, following the MathML 2.0 specification, the matrix

0 1 0
0 0 1
1 0 0

is encoded as

<matrix>
  <matrixrow>
    <cn>0</cn><cn>1</cn><cn>0</cn>
  </matrixrow>
  <matrixrow>
    <cn>0</cn><cn>0</cn><cn>1</cn>
  </matrixrow>
  <matrixrow>
    <cn>1</cn><cn>0</cn><cn>0</cn>
  </matrixrow>
</matrix>

but can I use this ultraverbose encoding for Detour matrices of
scientific interest? Detour matrices are N x N matrices. In mathematical
chemistry, N is of the order of the size of a chemical compound.

I do not consider encoding big (N = 1000) Detour matrices in MathML
elegant or coherent. Is it?
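
The size argument can be checked with a rough sketch like the one below.
It is only an illustration under stated assumptions: the function names
are hypothetical, the compact form (space-separated rows) is just one
possible plain-text baseline and not a proposed format, and a cyclic
permutation matrix stands in for a real Detour matrix, whose entries
would generally have more digits.

```python
def matrix_to_mathml(rows):
    """Serialize a matrix as MathML 2.0 <matrix> markup with no whitespace."""
    parts = ["<matrix>"]
    for row in rows:
        parts.append("<matrixrow>")
        parts.extend(f"<cn>{value}</cn>" for value in row)
        parts.append("</matrixrow>")
    parts.append("</matrix>")
    return "".join(parts)


def matrix_to_plain(rows):
    """Serialize the same matrix as space-separated rows, one row per line."""
    return "\n".join(" ".join(str(value) for value in row) for row in rows)


def permutation_matrix(n):
    """Cyclic-shift permutation matrix (assumed stand-in for a Detour matrix).

    For n = 3 this is the 3 x 3 matrix used in the example above.
    """
    return [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]


def verbosity_ratio(n):
    """Ratio of MathML size to plain-text size, in characters."""
    rows = permutation_matrix(n)
    return len(matrix_to_mathml(rows)) / len(matrix_to_plain(rows))


if __name__ == "__main__":
    for n in (3, 100, 1000):
        print(n, round(verbosity_ratio(n), 2))
```

The ratio depends on N and on the number of digits per entry, so the
figures printed here should be read as indicative only.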


Encoding of non-hierarchical structures
---------------------------------------

This may be useful in quantum mechanical models.


Extensibility
-------------

Currently, MathML presentational markup is not extensible, and not
everyone agrees on the extensibility of Content MathML.


Backward compatibility
----------------------

The language should be as close as possible to popular existing systems.
I mean: TeX, LaTeX, Mathematica, Maple, Fortran, Lisp, C, ISO 12083, AAP
Math, some scientific DTDs (the Elsevier one), etc.

This also includes compatibility with CSS, HTML and others.


Formal language
---------------

For example, SXML is directly based on S-expressions (SEXPR) and permits
us to exploit formal structure for abstraction layers.
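
As a minimal sketch of the idea, the correspondence between an
S-expression and XML can be shown with nested Python tuples. This is not
SXML itself: the tuple convention (tag first, then children), the
function name sexpr_to_xml, and the omission of attributes are all
assumptions made for illustration.

```python
from xml.sax.saxutils import escape


def sexpr_to_xml(node):
    """Convert a nested (tag, child, ...) tuple into an XML string.

    Strings are treated as text nodes and escaped; attributes are not
    handled in this sketch.
    """
    if isinstance(node, str):
        return escape(node)
    tag, *children = node
    if not children:
        return f"<{tag}/>"
    inner = "".join(sexpr_to_xml(child) for child in children)
    return f"<{tag}>{inner}</{tag}>"


if __name__ == "__main__":
    # Content MathML for 1/x, written as a nested tuple.
    expr = ("apply", ("divide",), ("cn", "1"), ("ci", "x"))
    print(sexpr_to_xml(expr))
    # prints <apply><divide/><cn>1</cn><ci>x</ci></apply>
```

Because the expression is ordinary nested data, it can be built,
inspected, and transformed with the host language before any XML is
emitted.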

I agree with mathematician Chaitin on the possibilities of computerized
versions of set theory.


Simplicity
----------

What is good, if concise, is twice as good!

The language should be directly manipulable and encodable by humans.

Another “advanced” site where (ds)^2 is being incorrectly served as 2s
ds is Distler’s blog Musings.

If you rely on tools and you are trained to never see the underlying
code (MathML is popularly presented as a kind of hidden mathematical
postscript), you do not know what you are encoding.

MathML ultraverbose code

<mrow>
<semantics>
  <mrow>
    <msubsup>
      <mo>&int;</mo>
      <mn>1</mn>
      <mi>t</mi>
    </msubsup>
    <mfrac>
      <mrow>
        <mo>&dd;</mo>
        <mi>x</mi>
      </mrow>
      <mi>x</mi>
    </mfrac>
  </mrow>
  <annotation-xml encoding="MathML-Content">
    <apply>
      <int/>
      <bvar><ci>x</ci></bvar>
      <lowlimit><cn>1</cn></lowlimit>
      <uplimit><ci>t</ci></uplimit>
      <apply>
        <divide/>
        <cn>1</cn>
        <ci>x</ci>
      </apply>
    </apply>
  </annotation-xml>
</semantics>
</mrow>


for \int_1^t \frac{dx}{x} could be avoided. _The difficulty of the
encoding should be of the same order as in TeX_.



Juan R.

Center for CANONICAL |SCIENCE)
Received on Tuesday, 18 April 2006 10:03:38 UTC