
Default token for Unicode character

From: Urs Holzer <urs@andonyar.com>
Date: Fri, 24 Jul 2009 11:41:24 +0200
To: www-math@w3.org
Message-Id: <200907241141.24410.urs@andonyar.com>
Hi all

I have to implement the following in my Presentation MathML editor: when
the user inserts a Unicode character, the editor has to guess which
token element to use, mi or mo. So the question is: how can I find a
sane default token element for a given Unicode character?

I hoped that the Unicode Character Database would help. However, I am
not able to get it right. At the moment I do it like this: for
characters in the general categories Punctuation and Symbol, I use mo.
For Punctuation this might be correct, but for Symbol it certainly is
not. For example, ∞ (U+221E INFINITY) is in category Sm (Symbol, math)
but it is an identifier. On the other hand, ⇔ (U+21D4 LEFT RIGHT DOUBLE
ARROW) is also in category Sm, but it is an operator.
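
To make the problem concrete, here is a minimal Python sketch of the
heuristic described above. It is not the editor's actual code: the
function name default_token is made up for illustration, and it only
chooses between mi and mo, as in the question.

```python
import unicodedata


def default_token(ch: str) -> str:
    """Guess 'mo' or 'mi' from the Unicode general category alone.

    Hypothetical helper: it ignores mn and every other token element.
    """
    category = unicodedata.category(ch)  # two-letter code, e.g. 'Sm', 'Po', 'Ll'
    # The first letter is the major class: 'P' = Punctuation, 'S' = Symbol.
    if category[0] in ("P", "S"):
        return "mo"
    return "mi"


# The two Sm characters get the same answer, which is exactly the problem:
# U+221E should be an identifier, U+21D4 an operator.
for ch in ("x", ",", "\u221e", "\u21d4"):
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} -> {default_token(ch)}")
```

Since both ∞ and ⇔ report the same category, the general category alone
cannot tell them apart; any fix needs some additional source of
information.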

I also tried to get the information from
which comes along with the technical report 25
Unfortunately, I do not understand what they mean by the class N. It
contains, for example, ∞ (U+221E INFINITY) and !, that is, it contains
some operators and some identifiers. So now I know as much as before.

Any ideas?

Received on Friday, 24 July 2009 09:42:04 UTC
