[whatwg] Mathematics in HTML5 from Øistein E. Andersen on 2006-06-11 (public-whatwg-archive@w3.org from June 2006)

From: Øistein E. Andersen <html5@xn--istein-9xa.com>
Date: Sun, 11 Jun 2006 20:47:00 +0200
Message-ID: <E1FpUxg-0003U4-00@ws7.ou-data.net>
On 10 Jun 2006, at 10:1AM, White Lynx wrote:

>Oistein E. Andersen wrote:
>>traditional French typographical conventions for mathematics require lowercase
>>variables in italic, but uppercase ones in roman.

>Do we need extra values like text-transform:french-italic; and french-bold-italic;
>that would transform lowercase Latin and Greek characters to appropriate slanted
>mathematical alphanumerical characters and uppercase ones to normal
>mathematical alphanumerical characters?

See [1] (in French) for some examples of French-style mathematics.

The problem is more complex, however. French typography traditionally uses upright Greek letters ([2] acknowledges this); TeX uses italic lowercase Greek, but upright uppercase Greek by default; today, mathematicians' preferences on this issue diverge. Moreover, the mathematical characters in Unicode cover not only Latin and Greek letters, but also digits.

To handle this properly, we would need separate transformation rules for each of the following five sets: uppercase Latin, lowercase Latin, uppercase Greek, lowercase Greek, digits. I currently tend to think that we should rather let the transformation rules apply to all these characters, and simply not use them when they are not needed.
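As a rough illustration of per-set rules, here is a minimal Python sketch. It assumes the target characters are the italic forms in the Mathematical Alphanumeric Symbols block; the names `to_math_italic`, `FRENCH` and `TEX_DEFAULT` are hypothetical and do not come from any specification. Digits are passed through unchanged because that block has no italic digits.

```python
# (first, last, italic_base) for each character set that has italic forms in
# the Mathematical Alphanumeric Symbols block. Digits have none, so they are
# not listed and always pass through unchanged.
ITALIC_RANGES = {
    "upper_latin": (0x0041, 0x005A, 0x1D434),
    "lower_latin": (0x0061, 0x007A, 0x1D44E),
    "upper_greek": (0x0391, 0x03A9, 0x1D6E2),
    "lower_greek": (0x03B1, 0x03C9, 0x1D6FC),
}

# Italic small h was encoded earlier as U+210E PLANCK CONSTANT, so its slot
# in plane 1 is reserved.
EXCEPTIONS = {"h": "\u210E"}

# Hypothetical style tables: the sets to render in italic.
FRENCH = {"lower_latin"}                                  # upright capitals and Greek
TEX_DEFAULT = {"upper_latin", "lower_latin", "lower_greek"}


def to_math_italic(text, italic_sets):
    """Map each character to its mathematical italic form if its set is selected."""
    out = []
    for ch in text:
        cp = ord(ch)
        mapped = ch  # default: leave the character as it is
        for name, (first, last, base) in ITALIC_RANGES.items():
            if first <= cp <= last and name in italic_sets:
                mapped = EXCEPTIONS.get(ch, chr(base + cp - first))
                break
        out.append(mapped)
    return "".join(out)


print(to_math_italic("Ax2", FRENCH))       # only the x becomes italic
print(to_math_italic("Ax2", TEX_DEFAULT))  # A and x become italic
```

A single rule applied to all five sets would correspond to passing every set name at once; the per-set tables only matter when a convention such as the French one treats the sets differently.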

By the way, nothing like `mathematical alphanumerical characters' seems to be defined in Unicode, so no transformation should be needed for those.

[1] http://omega.enstb.org/yannis/pdf/article-gut99.pdf 
[2] ftp://tug.ctan.org/pub/tex-archive/fonts/fourier-GUT/doc/latex/fourier/fourier-doc-en.pdf

>roman as a default is Ok as a first approximation.

Roman as default is OK given that italic variables can be marked up easily. (Many mathematicians will obviously reject roman variables.) I think we agree on this point.

>Script, fraktur and double-struck letters are available as separate Unicode
>characters, so font-family may not be the best solution for these scripts.

A tiny subset is available in the Letterlike Symbols block, but most of them are not. The full set is located in the Mathematical Alphanumeric Symbols block in plane 1, so the problem is exactly the same for these as for italic or bold mathematical characters.
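The split between the two blocks can be inspected with Python's standard `unicodedata` module. The sketch below is only an illustration (the function name `reserved_capitals` is made up): it lists the capital letters whose slot in the plane-1 block is left unassigned because the letter had already been encoded in Letterlike Symbols.

```python
import unicodedata

# Code point of capital A for three styles in the plane-1
# Mathematical Alphanumeric Symbols block.
STYLE_BASE = {
    "script": 0x1D49C,
    "fraktur": 0x1D504,
    "double-struck": 0x1D538,
}


def reserved_capitals(style):
    """Return the capitals whose plane-1 slot has no assigned character."""
    base = STYLE_BASE[style]
    return "".join(
        chr(ord("A") + i)
        for i in range(26)
        if unicodedata.name(chr(base + i), None) is None
    )


for style in STYLE_BASE:
    print(style, reserved_capitals(style))
```

For script capitals this reports B, E, F, H, I, L, M and R, which live in Letterlike Symbols (for example U+212C SCRIPT CAPITAL B); every other script capital exists only in plane 1.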

Each approach has its problems. Anyway, the specification should probably not try to avoid the issue of font selection.

>This issue [font selection] belongs to presentational layer and has to be addressed
>on CSS side (there are no problems on XSL and/or DSSSL side as one can make
>appropriate transformations).

I am not so sure about that. The difference between bold, italic and script can be just as important as the distinction between base, superscript and subscript. Explicit mark-up is provided in ISO 12083, MathML and TeX, among others, and mathematical characters (which we would like to use, but cannot -- yet?) have been added to Unicode because of their semantic importance. If the removal of class attributes is supposed to preserve meaning, then it does not seem right to use this very attribute to encode different math alphabets.

-- 
Andersen
Received on Sunday, 11 June 2006 11:47:00 UTC