Rendering multi-symbol variable names and OpenType math fonts

Hi all,

I apologize in advance if this question is not entirely on-topic, as I
don’t know how much the nitty-gritty of actually *rendering* mathematics is
within the purview of the Math WG. But math typesetting is generally a
pretty niche topic, so I’m not sure if there’s a better place to ask such
questions—please do let me know if there is!

As for my actual question: I am currently working on a math renderer, and
it would be great to be able to render MathML input. It seems wise to take
advantage of OpenType math fonts to assist with the typographical
subtleties of math typesetting, so I have been trying to take that route so
far. Unfortunately, the OpenType math specification bakes certain
assumptions into the font metrics that seem to make this tricky.

For those unfamiliar, OpenType math fonts use glyphs in the Unicode
Mathematical Alphanumeric Symbols block to represent variables. For
example, instead of using U+0066 LATIN SMALL LETTER F from an italic font,
the typesetting system must use 𝑓, which is U+1D453 MATHEMATICAL ITALIC
SMALL F. The OpenType MATH table associates additional metrics with these
glyphs to assist in placement of e.g. subscripts and superscripts.
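
To make the substitution concrete, here is a minimal Python sketch of the
mapping I have in mind. The helper name to_math_italic is purely
illustrative; the only facts it relies on are the code points of the
Mathematical Alphanumeric Symbols block, including the one hole in the italic
range: U+1D455 is reserved, and italic h lives at U+210E PLANCK CONSTANT
instead. Greek letters and the other math alphabets are not handled.

```python
# Illustrative helper (hypothetical name): map a Basic Latin letter to its
# Mathematical Italic counterpart in the Mathematical Alphanumeric Symbols block.
MATH_ITALIC_CAPITAL_A = 0x1D434  # MATHEMATICAL ITALIC CAPITAL A
MATH_ITALIC_SMALL_A = 0x1D44E    # MATHEMATICAL ITALIC SMALL A
PLANCK_CONSTANT = 0x210E         # serves as italic h; U+1D455 is reserved


def to_math_italic(ch: str) -> str:
    """Return the math italic form of one ASCII letter, or ch unchanged."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch == "h":
        return chr(PLANCK_CONSTANT)
    if "a" <= ch <= "z":
        return chr(MATH_ITALIC_SMALL_A + ord(ch) - ord("a"))
    if "A" <= ch <= "Z":
        return chr(MATH_ITALIC_CAPITAL_A + ord(ch) - ord("A"))
    return ch  # digits, punctuation, non-Latin letters: left as-is here
```

For example, to_math_italic("f") yields U+1D453, the glyph mentioned above.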

While this mechanism works just fine for single-letter variable names, it
does not work correctly for multi-letter variable names, as the glyphs in
the Mathematical Alphanumeric Symbols block have wider spacing than their
Basic Latin counterparts in the italic text font of the same family. This
choice—inherited from TeX—helps to distinguish two variables juxtaposed in
implicit multiplication from a single, multi-letter variable.

This is a problem, since one can easily write <mi
mathvariant="italic">abc</mi>, which must render as a single, italic,
multi-letter identifier. In TeX, the usual workaround is to use the \mathit
command, which despite its name actually sets its argument in the current
italic *text* font. However, MathML makes no such explicit distinction,
which raises two immediate questions:

   1. When rendering, does it make sense to automatically use the special
   glyphs in a math font where possible and fall back to a text font depending
   on the contents of an <mi> element?

   2. How should subscript/superscript attachment positions be determined
   for a multi-letter variable? Since it is not possible to use the math font
   for such identifiers, there is no information in the MATH table to use.

   It’s worth explicitly noting that this issue is also present in TeX. As
   an example, try typesetting \[ f^1_2  \quad  \mathit{of}^1_2 \] with
   pdfLaTeX and observe that the first subscript “cuts into” the empty space
   below the f glyph (which is desirable), whereas the second does not.

   (This is arguably an implementation infelicity in TeX, since TeX
   actually provides italic corrections for all glyphs, which could be used to
   place the subscript even in the second case. But OpenType fonts only
   include italic correction information in the MATH table, so ordinary
   characters have no such metrics. LuaTeX synthesizes italic corrections for
   other characters based on some heuristics in luaotfload, but that is
   getting pretty deep into the weeds.)
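
To illustrate what I mean in question 1, here is a rough Python sketch of
the fallback rule I am considering. It is only one possible answer, not
something the MathML spec mandates. The names GlyphRun and plan_identifier,
and the "math" / "text" font labels, are hypothetical. The one spec behavior
it assumes is that an <mi> with a single character defaults to
mathvariant="italic", while a multi-character <mi> defaults to "normal".

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class GlyphRun:
    font: str               # hypothetical labels: "math" or "text"
    variant: str            # effective mathvariant
    text: str
    has_math_metrics: bool  # True if MATH-table metrics are available


def plan_identifier(mi_xml: str) -> GlyphRun:
    """Decide which font an <mi> element's content would be set in."""
    mi = ET.fromstring(mi_xml)
    # Accept both plain and namespace-qualified tags ("{ns}mi").
    if mi.tag.rsplit("}", 1)[-1] != "mi":
        raise ValueError("expected an <mi> element")
    text = "".join(mi.itertext()).strip()
    # Assumed MathML default: italic for one character, normal otherwise.
    default_variant = "italic" if len(text) == 1 else "normal"
    variant = mi.get("mathvariant", default_variant)
    if len(text) == 1:
        # Single symbol: use the math font, so MATH-table data applies.
        return GlyphRun("math", variant, text, True)
    # Multi-letter identifier: fall back to the text font, which leaves
    # no MATH-table data for script attachment (question 2).
    return GlyphRun("text", variant, text, False)
```

With this rule, <mi>f</mi> is planned as a math-font run with metrics, while
<mi mathvariant="italic">abc</mi> becomes an italic text-font run without
them, which is exactly where question 2 bites.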

One could argue that perhaps this is really an OpenType question, not a
MathML question, but I don’t think any “OpenType Math WG” exists, as the
spec has long since been finalized. Besides, even if one did, this
interaction would still be relevant to MathML, since it’s a clear point of
friction between the MathML spec and what appears to be the de facto
standard for math fonts (to the extent such a thing exists at all).

Regardless of whether this sort of thing is “in scope” for MathML, I’d
appreciate any advice or pointers about the right way to proceed here. I’d
really like to do the “right thing” to the extent that’s possible, but
getting this stuff right is really subtle, and few people seem to have
thought much about it.

Many thanks,
Alexis

Received on Thursday, 6 January 2022 21:36:18 UTC