Using MathML to describe the relationships between units-of-measure

Hi Folks,

Many thanks to Stan, Bill, and Neil for answering my questions.

I am using MathML to express the relationships between 
units-of-measure.  Below is a description of the approach.
In example 2 (below) I describe a set of equivalent
cartesian-coordinate/polar-coordinate pairs.  I am not
sure how to create the MathML for it.  Any help that you
could provide would be greatly appreciated. /Roger
----------------------------------------------------------------------
EXAMPLE 1

To explain the approach I will use this example:

   <River id="Yangtze">
        <length>
            <kilometer>6340</kilometer>
        </length>
   </River>

Recall that <kilometer> may be considered as a function:

    kilometer(length(Yangtze)) --> 6340

"kilometer maps the length of the Yangtze to 6340."

More generally,

    kilometer(Distance) --> number

"kilometer maps an (abstract) Distance class to a number."

Likewise, for this version:

   <River id="Yangtze">
        <length>
            <mile>3914</mile>
        </length>
   </River>

mile is a function that maps a Distance to a number:

    mile(length(Yangtze)) --> 3914

or more generally:

    mile(Distance) --> number

Both kilometer and mile are functions that apply to Distance. 
The same Distance may therefore be represented differently
(using different units-of-measure).

Now we arrive at the approach:

    - Identify a class which is independent of units-of-measure (uom).
           In this example it is the abstract Distance class.
    - Define in this UOM-INDEPENDENT CLASS the *set* of
       UOM-DEPENDENT INSTANCES that are equivalent.
           In our example it is the set of kilometer/mile pairs which
           represent equivalent Distances.

This is how to define the Distance class:

<Class id="Distance">
    <equivalentInstances>
        {k, m | k = m * 1.62}
    </equivalentInstances>
</Class>

"Those instances that represent equivalent Distances are 
the set of k (kilometer), m (mile) pairs such that k = m * 1.62"

For example, (6340 k, 3914 m) represent equivalent Distances.

Thus, determining if these two descriptions:

   <River id="Yangtze">
        <length>
            <kilometer>6340</kilometer>
        </length>
   </River>

   <River id="Yangtze">
        <length>
            <mile>3914</mile>
        </length>
   </River>

represent the same Distance is a matter of checking
for set containment:

    (6340 k, 3914 m) in {k, m | k = m * 1.62}

API FOR CLASSES DOCUMENT

I envision a document containing class definitions (e.g.,
the Distance class), together with an API that gives programs
access to the class information.

This API may have a method such as this:

     boolean isContainedIn((uom1_value, uom2_value), set);

The method, isContainedIn, has two arguments. The first argument 
contains a pair of unit-of-measure values, and the second argument 
is a set definition.  The result is a boolean.
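
A minimal sketch of how such a method might look, assuming the set
definition is handed to the API as a predicate over the pair.  The names
is_contained_in and kilometer_mile_set are hypothetical, and the
comparison is exact, so only pairs that satisfy the equation exactly
are reported as contained:

```python
from typing import Callable, Tuple

# Hypothetical types: a pair of unit-of-measure values, and a set
# definition modelled as a predicate over such a pair.
Pair = Tuple[float, float]
SetDefinition = Callable[[float, float], bool]


def kilometer_mile_set(k: float, m: float) -> bool:
    """The set {k, m | k = m * 1.62}, written as a predicate."""
    return k == m * 1.62


def is_contained_in(pair: Pair, set_definition: SetDefinition) -> bool:
    """Return True if the pair satisfies the set definition exactly."""
    return set_definition(*pair)


# 1 mile corresponds to 1.62 kilometers under this definition.
print(is_contained_in((1.62, 1.0), kilometer_mile_set))
# A pair that clearly violates the equation.
print(is_contained_in((10.0, 1.0), kilometer_mile_set))
```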

MathML FOR REPRESENTING SETS

MathML is an XML vocabulary that can very nicely express sets.
At the bottom of this message I show how to represent the above set 
using MathML.

I recommend that MathML be used as the (default) way to represent sets:

<Class id="...">
    <equivalentInstances>
        -- set definition using MathML --
    </equivalentInstances>
</Class>

Note: MathML can do more than just express sets.  However, for 
this work I believe that we can restrict ourselves to just using 
the set expression vocabulary of MathML.  This will simplify 
tools greatly.

OTHER WAYS TO REPRESENT SETS

Interestingly, MathML also allows sets to be expressed using other
representations (MathML calls this "semantic mapping").  Thus, 
there may be other ways to represent sets as well:

<Class id="Distance">
    <equivalentInstances>
        {k, m | k = m * 1.62} expressed as MathML
        {k, m | k = m * 1.62} expressed as Mathematica
        {k, m | k = m * 1.62} expressed as XPath(???)
    </equivalentInstances>
</Class>

I am especially intrigued by the idea of expressing sets using 
XPath 2.0.  Could someone familiar with XPath 2.0 say whether 
it can express sets such as {k, m | k = m * 1.62}?

OTHER EQUIVALENT DISTANCES

Of course there are other sets of equivalent Distances, and 
they would also be expressed in the Distance class.  For example, 
below I have added a set which defines equivalent
inches(i)/centimeters(c) Distances:

<Class id="Distance">
    <equivalentInstances>
        {k, m | k = m * 1.62}
    </equivalentInstances>
    <equivalentInstances>
        {c, i | c = i * 2.54}
    </equivalentInstances>
</Class>

There will be an <equivalentInstances> element for every
set of equivalent Distances.
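
On the tool side, one set per <equivalentInstances> element could be
mirrored by one table entry per pair of units.  This is only a sketch:
the table distance_sets, the function find_set, and the choice of
unit-name pairs as keys are all assumptions made for illustration.

```python
# Hypothetical lookup table: one entry per <equivalentInstances> element
# of the Distance class, keyed by the pair of units the set relates.
distance_sets = {
    ("kilometer", "mile"): lambda k, m: k == m * 1.62,
    ("centimeter", "inch"): lambda c, i: c == i * 2.54,
}


def find_set(uom1, uom2):
    """Return the set definition relating two units, or None if absent."""
    return distance_sets.get((uom1, uom2))


centimeter_inch = find_set("centimeter", "inch")
print(centimeter_inch(2.54, 1.0))            # 1 inch against 2.54 centimeters
print(find_set("kilometer", "inch") is None)  # no such set has been defined
```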

That's the approach.  I believe that it has a lot of merit, but 
there are issues that remain to be solved.  I would greatly 
appreciate your input on these.  

ISSUES

Issue #1: Above I said that:

    (6340 k, 3914 m) in {k, m | k = m * 1.62}

should evaluate to true, i.e., (6340 k, 3914 m) is in the set.
That is not quite true.  If you multiply 3914 by 1.62 you
get 6340.68, not 6340, so technically (6340 k, 3914 m) is not
in the set.  Do you have
suggestions on how to handle this?  I am thinking that in the API
that I mentioned above:

     boolean isContainedIn((uom1_value, uom2_value), set);

there must be an indication of the "tolerance" allowed.  So this API
has a third argument:

     boolean isContainedIn((uom1_value, uom2_value), set, tolerance);

where tolerance indicates how much "slack" is allowed in determining
set containment.  This use of tolerance can be extremely powerful:  
the next example shows how tolerance may be used to specify 
the size of a Location area.

Do you have suggestions on how to represent tolerance?
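
One possible answer, offered as a sketch under assumptions: treat the
tolerance as a relative tolerance, and let the set definition return
the two sides of its equation so that the API can compare them.  The
names is_contained_in and kilometer_mile_sides are hypothetical;
math.isclose is the Python standard-library comparison.

```python
import math


def kilometer_mile_sides(k, m):
    """Return the (left, right) sides of the equation k = m * 1.62."""
    return k, m * 1.62


def is_contained_in(pair, set_sides, tolerance):
    """True if both sides of the equation agree to within a relative tolerance."""
    left, right = set_sides(*pair)
    return math.isclose(left, right, rel_tol=tolerance)


yangtze = (6340, 3914)  # (kilometers, miles) from the two River documents

# 6340 and 6340.68 differ by roughly one part in ten thousand.
print(is_contained_in(yangtze, kilometer_mile_sides, 1e-3))  # loose tolerance
print(is_contained_in(yangtze, kilometer_mile_sides, 1e-5))  # tight tolerance
```

With the loose tolerance the pair counts as contained; with the tight
one it does not.  An absolute tolerance (e.g., "within 1 kilometer")
would be an equally plausible design.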

I am sure that there are other issues that I haven't thought of yet.  
If you see other issues please let me know.

EXAMPLE 2

The second example is concerned with being able to recognize
that two Maps, using different coordinate systems, reference the 
same location.  Here are two Map documents:

<Map id="M1">
    <location>
        <cartesian-coordinate>
            <x>
                <kilometer>100</kilometer>
            </x>
            <y>
                <kilometer>100</kilometer>
            </y>
        </cartesian-coordinate>
    </location>
</Map>

<Map id="M2">
    <location>
        <polar-coordinate>
            <r>
                <kilometer>141.421</kilometer>
            </r>
            <theta>
                <radian>0.7854</radian>
            </theta>
        </polar-coordinate>
    </location>
</Map>

The first thing to do is to determine a class which is independent 
of units-of-measure.  For this example it is a Location class.   
A physical place has a Location independent of units-of-measure:

<Class id="Location">
    ...
</Class>

The next step is to determine the set of equivalent instances.  
For this example I am just concerned with stating equivalent 
(cartesian-coordinate, polar-coordinate) pairs.  Here is the 
set definition:

{c, p | c.x.kilometer = p.r.kilometer * cos(p.theta.radian) ^ 
        c.y.kilometer = p.r.kilometer * sin(p.theta.radian)}

Here is the completed Location class:

<Class id="Location">
    <equivalentInstances>
        {c, p | c.x.kilometer = p.r.kilometer * cos(p.theta.radian) ^ 
                c.y.kilometer = p.r.kilometer * sin(p.theta.radian)}
    </equivalentInstances>
</Class>

Note: I am not convinced that this is the best way to express the set.
Do you have any suggestions?

As noted earlier, the set definition is expressed using MathML.
(I haven't created the MathML for this set.  If someone more skilled
with MathML than I would care to do so, I would greatly appreciate it.)
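
As a possible starting point, and not a vetted answer, the script
below builds candidate content MathML for this set with Python's
xml.etree.ElementTree.  It follows the same set/bvar/condition/list
layout as the example 1 markup.  Several things here are assumptions:
the bound-variable names x, y, r, and theta (standing in for
c.x.kilometer, c.y.kilometer, p.r.kilometer, and p.theta.radian), the
usual convention x = r * cos(theta), y = r * sin(theta), and the
helper names ci, apply_op, and build_location_set.  No definitionURL
attributes or namespace declarations are emitted.

```python
import xml.etree.ElementTree as ET


def ci(name):
    """Build a content identifier: <ci>name</ci>."""
    node = ET.Element("ci")
    node.text = name
    return node


def apply_op(operator, *arguments):
    """Build <apply><operator/>arg1 arg2 ...</apply> (prefix notation)."""
    node = ET.Element("apply")
    ET.SubElement(node, operator)
    node.extend(list(arguments))
    return node


def build_location_set():
    """Build {x, y, r, theta | x = r*cos(theta) and y = r*sin(theta)}."""
    names = ["x", "y", "r", "theta"]  # assumed bound-variable names
    set_node = ET.Element("set")
    for name in names:
        bvar = ET.SubElement(set_node, "bvar")
        bvar.append(ci(name))
    condition = ET.SubElement(set_node, "condition")
    condition.append(
        apply_op(
            "and",
            apply_op("eq", ci("x"),
                     apply_op("times", ci("r"), apply_op("cos", ci("theta")))),
            apply_op("eq", ci("y"),
                     apply_op("times", ci("r"), apply_op("sin", ci("theta")))),
        )
    )
    # As in example 1, a trailing <list> names the set variables.
    variables = ET.SubElement(set_node, "list")
    variables.extend([ci(name) for name in names])
    math_node = ET.Element("math")
    math_node.append(set_node)
    return math_node


mathml = ET.tostring(build_location_set(), encoding="unicode")
print(mathml)
```

The printed markup would go inside the <equivalentInstances> element
of the Location class.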

API REVISITED

Recall the API that was mentioned above:

    boolean isContainedIn((uom1_value, uom2_value), set, tolerance);

Recall that "tolerance" is used to indicate how much slack you 
are willing to accept in deciding whether a pair of values is contained 
in the set.  In this Map example the tolerance is effectively a 
way of indicating "I want all values within this region".  When 
you tighten the tolerance, you narrow the region.  When you loosen 
the tolerance, you widen the region. I think this will be very useful.
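
The widening and narrowing can be sketched numerically.  The code below
is an illustration under assumptions: the tolerance is taken to be an
absolute tolerance in kilometers applied to each coordinate, the
convention is x = r * cos(theta), y = r * sin(theta), theta is taken
as pi/4 radians (close to the angle in the polar Map), and the names
location_sides and is_contained_in are hypothetical.

```python
import math


def location_sides(cartesian, polar):
    """Return the (left, right) sides of the two Location equations."""
    x, y = cartesian
    r, theta = polar
    return [(x, r * math.cos(theta)), (y, r * math.sin(theta))]


def is_contained_in(pair, set_sides, tolerance):
    """True if every equation holds to within `tolerance` kilometers."""
    cartesian, polar = pair
    return all(
        math.isclose(left, right, rel_tol=0.0, abs_tol=tolerance)
        for left, right in set_sides(cartesian, polar)
    )


m1 = (100.0, 100.0)          # cartesian (x, y) in kilometers
m2 = (141.421, math.pi / 4)  # polar (r in kilometers, theta in radians)
nearby = (101.0, 100.0)      # a point 1 kilometer east of m1

print(is_contained_in((m1, m2), location_sides, 0.01))      # same location
print(is_contained_in((nearby, m2), location_sides, 0.01))  # outside a tight region
print(is_contained_in((nearby, m2), location_sides, 2.0))   # inside a wide region
```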

----------------------------------------------------------------------
Below is the MathML representation for example 1.  For those 
not familiar with MathML here are some notes:

- operations are expressed in prefix notation, i.e., op arg1 arg2
- everything is wrapped with a math element
- a set can be defined and then assigned to an identifier, e.g., S
- ci = content identifier, i.e., think variable
- cn = content number, i.e., think value
- bvar = bound variable
- condition is read as: "such that"
- eq = the equality operator
- apply = this takes an operator and arguments; it applies the 
          operator to the arguments
- the <list> at the end is used to identify the set variables (k, m)

<Class id="Distance">
    <equivalentInstances>
        <!--   S = {k, m | k = m * 1.62}   -->
        <math>
            <declare type="set">
                <ci>S</ci>
                <set>
                    <bvar definitionURL="http://www.openmath.org/cd/units_metric#kilometer"><ci>k</ci></bvar>
                    <bvar definitionURL="http://www.openmath.org/cd/units_imperial#mile"><ci>m</ci></bvar>
                    <condition>
                        <apply>
                            <eq/>
                            <ci>k</ci>
                            <apply>
                                <times/>
                                <ci>m</ci>
                                <cn>1.62</cn>
                            </apply>
                        </apply>
                    </condition>
                    <list><ci>k</ci><ci>m</ci></list>
                </set>
            </declare>
        </math>
    </equivalentInstances>
</Class>

Now, set containment is expressed as:

   (6340 k, 3914 m) in S

Comments?  Suggestions?

Received on Monday, 21 July 2003 07:46:18 UTC