
Re: Errors in XML Entity Definitions for Characters

From: David Carlisle <davidc@nag.co.uk>
Date: Mon, 30 Dec 2013 02:05:07 +0000
Message-ID: <52C0D4D3.1070709@nag.co.uk>
To: Frédéric WANG <fred.wang@free.fr>, "www-math@w3.org" <www-math@w3.org>
On 29/12/2013 12:05, Frédéric WANG wrote:
> Hi all,
> Just a quick feedback on the unicode.xml file provided in the "XML
> Entity Definitions for Characters" spec:
> - Some characters have mathclass="R?" (with a question mark)... I
> guess that's because the mathclass is not clear, but it is
> unexpected for someone who wants to process the file
> automatically...
> - Some Arabic Letters (U+0627-U+063A and Arabic mathematical
> alphabetic symbol) should probably have mathclass="A" since they are
> used as mathematical variables.

Frédéric, thanks for your comments...

The <unicodedata> element is extracted automatically from files
available on the Unicode site. In particular, the mathclass attribute
(including the ?) comes from


However, I note this has been updated (twice) since then, and


does not have the ? (but hasn't got the Arabic updates that you suggest).
We should probably pass those comments on to the UTC. But I will rebuild
based on the revision-13 data.
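
For anyone processing the file automatically in the meantime, the
uncertain classes mentioned above can be listed with something like the
following. This is only a minimal sketch: it assumes each <character>
element carries a <unicodedata> child with an optional mathclass
attribute, and the inline sample (including the character ids and the
classes attached to them) is made up for illustration rather than taken
from unicode.xml.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for unicode.xml. The element and attribute names are
# assumptions about the file layout, and the ids/classes are illustrative.
SAMPLE = """
<unicode>
  <charlist>
    <character id="U0002B"><unicodedata mathclass="V"/></character>
    <character id="U0221D"><unicodedata mathclass="R?"/></character>
    <character id="U00391"><unicodedata/></character>
  </charlist>
</unicode>
"""


def uncertain_mathclasses(xml_text):
    """Return (character id, mathclass) pairs whose class ends in '?'."""
    root = ET.fromstring(xml_text)
    found = []
    for char in root.iter("character"):
        data = char.find("unicodedata")
        # A missing element or attribute is treated as "no class".
        mathclass = data.get("mathclass", "") if data is not None else ""
        if mathclass.endswith("?"):
            found.append((char.get("id"), mathclass))
    return found


if __name__ == "__main__":
    for char_id, mathclass in uncertain_mathclasses(SAMPLE):
        print(char_id, mathclass)
```

A consumer that only wants the class letter could strip the trailing
"?" instead of reporting it.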

> - Some LaTeX commands map to different Unicode code points. For
> example \mathsfbf{\Alpha} maps to both the capital and small (bold
> sans-serif) alpha, which is clearly wrong (one should be
> \mathsfbf{\alpha}). See the attached diff files for details. They
> were generated using the attached XSLT stylesheet and the Unix
> command  "xsltproc extract.xsl unicode.xml | sort --key=2,2 >
> commands1.txt; cat commands1.txt | uniq --skip-chars=7 >
> commands2.txt ; diff -U8 commands1.txt commands2.txt >
> commands.diff".

Ah yes, the LaTeX mapping part of the file is actually the original part
(and was largely experimental, since there wasn't really any LaTeX
Unicode font support back in 1999). One thing on my stack of things to do
one day is to bring that into the 21st century and align it with
currently used LaTeX packages. One excuse for putting this off so far is
that the official LaTeX support for the STIX fonts has still not been
released, and it would be good to use the same command names as that.
However, I'll fix the ones you have reported.
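
The duplicate check described in the quoted message (extract the LaTeX
commands, sort, and compare against the de-duplicated list) can also be
sketched without XSLT. This is a rough illustration only, not the
attached stylesheet: it assumes each <character> element has an id
attribute and a <latex> child, and it runs on a small inline sample in
which both the capital and small bold sans-serif alpha carry the same
command, as reported.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Small stand-in for unicode.xml. The element names and id format are
# assumptions; the duplicated command mirrors the reported error.
SAMPLE = r"""
<unicode>
  <charlist>
    <character id="U1D756"><latex>\mathsfbf{\Alpha}</latex></character>
    <character id="U1D770"><latex>\mathsfbf{\Alpha}</latex></character>
    <character id="U1D757"><latex>\mathsfbf{\Beta}</latex></character>
  </charlist>
</unicode>
"""


def duplicate_latex(xml_text):
    """Map each LaTeX command used by more than one character to its ids."""
    root = ET.fromstring(xml_text)
    seen = defaultdict(list)
    for char in root.iter("character"):
        latex = char.findtext("latex")
        if latex:
            seen[latex.strip()].append(char.get("id"))
    # Keep only commands that appear on two or more characters.
    return {cmd: ids for cmd, ids in seen.items() if len(ids) > 1}


if __name__ == "__main__":
    for cmd, ids in sorted(duplicate_latex(SAMPLE).items()):
        print(cmd, " ".join(ids))
```

Grouping by command rather than diffing two sorted lists reports every
code point that shares a command, not just the second occurrence.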

Received on Monday, 30 December 2013 02:05:30 UTC
