RE: [EXTERNAL] Rendering multi-symbol variable names and OpenType math fonts

Interesting question. In my experience as a theoretical physicist, variables are single letters that may be adorned by primes, accents, subscripts and superscripts. Multicharacter function names like “sin” are not italicized. Can you give an example of a multicharacter variable?

Thanks,
Murray

From: Alexis King <lexi.lambda@gmail.com>
Sent: Thursday, January 6, 2022 10:56 AM
To: www-math@w3.org
Subject: [EXTERNAL] Rendering multi-symbol variable names and OpenType math fonts

Hi all,

I apologize in advance if this question is not entirely on-topic, as I don’t know how much the nitty-gritty of actually rendering mathematics is within the purview of the Math WG. But math typesetting is generally a pretty niche topic, so I’m not sure if there’s a better place to ask such questions—please do let me know if there is!

As for my actual question: I am currently working on a math renderer, and it would be great to be able to render MathML input. It seems wise to take advantage of OpenType math fonts to assist with the typographical subtleties of math typesetting, so I have been trying to take that route so far. Unfortunately, the OpenType math specification bakes certain assumptions into the font metrics that seem to make this tricky.

For those unfamiliar, OpenType math fonts use glyphs in the Unicode Mathematical Alphanumeric Symbols block to represent variables. For example, instead of using U+0066 LATIN SMALL LETTER F from an italic font, the typesetting system must use 𝑓, which is U+1D453 MATHEMATICAL ITALIC SMALL F. The OpenType MATH table associates additional metrics with these glyphs to assist in placement of e.g. subscripts and superscripts.
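
To make the code point mapping concrete, here is a minimal sketch in Python (chosen purely for illustration). The helper name to_math_italic is hypothetical, and the sketch covers only the ASCII letters. The one irregularity it handles is italic h, which Unicode encodes as U+210E PLANCK CONSTANT rather than at the reserved position U+1D455.

```python
# Minimal sketch: map Basic Latin letters to their Mathematical Italic
# counterparts in the Mathematical Alphanumeric Symbols block.
# `to_math_italic` is a hypothetical helper name, not part of any library.

MATH_ITALIC_CAPITAL_A = 0x1D434  # U+1D434 MATHEMATICAL ITALIC CAPITAL A
MATH_ITALIC_SMALL_A = 0x1D44E    # U+1D44E MATHEMATICAL ITALIC SMALL A

# U+1D455 is reserved: italic h already existed as U+210E PLANCK CONSTANT.
EXCEPTIONS = {"h": "\u210E"}


def to_math_italic(ch: str) -> str:
    """Map a single ASCII letter to math italic; return anything else unchanged."""
    if ch in EXCEPTIONS:
        return EXCEPTIONS[ch]
    if "A" <= ch <= "Z":
        return chr(MATH_ITALIC_CAPITAL_A + ord(ch) - ord("A"))
    if "a" <= ch <= "z":
        return chr(MATH_ITALIC_SMALL_A + ord(ch) - ord("a"))
    return ch


if __name__ == "__main__":
    # Print the code point chosen for a few sample characters.
    for ch in "fhA1":
        print(ch, "->", f"U+{ord(to_math_italic(ch)):04X}")
```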

While this mechanism works just fine for single-letter variable names, it does not work correctly for multi-letter variable names, as the glyphs in the Mathematical Alphanumeric Symbols block have wider spacing than their Basic Latin counterparts in the italic text font of the same family. This choice—inherited from TeX—helps to distinguish two variables juxtaposed in implicit multiplication from a single, multi-letter variable.

This is a problem, since one can easily write <mi mathvariant="italic">abc</mi>, which must result in a single, italic, multi-letter identifier. In TeX, the usual workaround is to use the \mathit command, which despite its name actually sets its argument in the current italic text font. However, MathML makes no such explicit distinction, so this results in two immediate questions:

  1.  When rendering, does it make sense to automatically use the special glyphs in a math font where possible and fall back to a text font depending on the contents of an <mi> element?
  2.  How should subscript/superscript attachment positions be determined for a multi-letter variable? Since it is not possible to use the math font for such identifiers, there is no information in the MATH table to use.
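
As a rough sketch of the fallback that question 1 describes, and not a claim about what MathML requires, the decision could look something like the following Python. Everything here is hypothetical: the GlyphRun type, the "math" and "text-italic" font labels, and the rule that only a single ASCII letter counts as a math-font variable. The has_math_metrics flag marks where question 2 arises, since a run set in the text font has no MATH table data to draw on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphRun:
    text: str               # code points handed to the shaper
    font: str               # "math" or "text-italic" (hypothetical labels)
    has_math_metrics: bool  # whether MATH table data can be expected


def math_italic(ch: str) -> str:
    """Map one ASCII letter to its math italic code point."""
    if ch == "h":
        return "\u210E"  # italic h is U+210E PLANCK CONSTANT
    if "a" <= ch <= "z":
        return chr(0x1D44E + ord(ch) - ord("a"))
    if "A" <= ch <= "Z":
        return chr(0x1D434 + ord(ch) - ord("A"))
    return ch


def plan_italic_mi(content: str) -> GlyphRun:
    """Pick a font for the text of an italic <mi> element.

    Assumption: a single ASCII letter is a variable and uses the math
    font; anything else is treated as one multi-letter identifier and
    falls back to the italic text font.
    """
    content = content.strip()
    if len(content) == 1 and content.isascii() and content.isalpha():
        return GlyphRun(math_italic(content), "math", True)
    return GlyphRun(content, "text-italic", False)


if __name__ == "__main__":
    print(plan_italic_mi("f"))
    print(plan_italic_mi("abc"))
```

A renderer following this sketch would still need some policy for runs where has_math_metrics is false, for example attaching scripts without any italic correction or synthesizing one from the glyph outline.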

It’s worth explicitly noting that this issue is also present in TeX. As an example, try typesetting \[ f^1_2  \quad  \mathit{of}^1_2 \] with pdfLaTeX and observe that the first subscript “cuts into” the empty space below the f glyph (which is desirable), whereas the second does not.

(This is arguably an implementation infelicity in TeX, since TeX actually provides italic corrections for all glyphs, which could be used to place the subscript even in the second case. But OpenType fonts only include italic correction information in the MATH table, so ordinary characters have no such metrics. LuaTeX synthesizes italic corrections for other characters based on some heuristics in luaotfload, but that is getting pretty deep into the weeds.)

One could argue that perhaps this is really an OpenType question, not a MathML question, but I don’t think any “OpenType Math WG” exists, as the spec is long since finalized. Besides, even if it did, this interaction is still relevant to MathML, since it’s a clear point of friction between the MathML spec and what appears to be the de facto standard for math fonts (to the extent such a thing exists at all).

Regardless of whether this sort of thing is “in scope” for MathML, I’d appreciate any advice or pointers about the right way to proceed here. I’d really like to do the “right thing” to the extent that’s possible, but getting this stuff right is really subtle, and few people seem to have thought much about it.

Many thanks,
Alexis

Received on Thursday, 6 January 2022 21:46:02 UTC