Re: [MathML3-last-call] mathvariant from Jacques Distler on 2009-10-01 (www-math@w3.org from October 2009)

From: Jacques Distler <distler@golem.ph.utexas.edu>
Date: Thu, 1 Oct 2009 08:45:44 -0500
To: www-math@w3.org
Cc: Karl Tomlinson <w3@karlt.net>, Sam Dooley <sam@integretechpub.com>
Message-Id: <271E1B53-7927-42C9-A8EB-F98B036B9CB3@golem.ph.utexas.edu>

On Sep 28, 2009, at 12:04 PM, Sam Dooley wrote:

> So I agree that the wording of the spec should be clarified
> in a few places:
>
> (1) The spec should be more clear about when the mathvariant value
> identifies a distinct mathematical character, whether or not it has
> its own Unicode code point.  Perhaps just to say that mathvariant
> values can be used with Basic Latin (U+20-U+7E), Greek capital/
> small/symbol, dotless i/j and digamma BMP characters to encode a
> mathematical character.

I think the relevant notion here is of a "base character". What you  
seem to want the Spec to say is "@mathvariant should be ignored,  
unless it is applied to the 'base character'. The renderer should  
honour @mathvariant if the requested variant corresponds to an  
existing Unicode code-point. It may honour @mathvariant if the  
requested variant is otherwise available (say, as an alternate glyph  
in an available font)."
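
To make the proposed wording concrete, here is a minimal sketch of that three-way rule in Python. The names `Decision`, `UNICODE_VARIANTS` and `decide` are hypothetical, the table holds only three illustrative entries, and `font_has_glyph` stands in for whatever font query a real renderer would perform; none of this comes from the Spec.

```python
from enum import Enum


class Decision(Enum):
    SHOULD_HONOUR = "should honour"  # variant has its own Unicode code point
    MAY_HONOUR = "may honour"        # variant only available as a font glyph
    IGNORE = "ignore"                # neither applies


# Hypothetical, deliberately tiny table: (base character, variant) -> code point.
# A real renderer would need the complete mapping.
UNICODE_VARIANTS = {
    ("h", "italic"): 0x210E,         # PLANCK CONSTANT
    ("h", "bold"): 0x1D421,          # MATHEMATICAL BOLD SMALL H
    ("\u0131", "italic"): 0x1D6A4,   # MATHEMATICAL ITALIC SMALL DOTLESS I
}


def decide(base, variant, font_has_glyph=False):
    """Classify a (base character, mathvariant) request under the proposed rule."""
    if (base, variant) in UNICODE_VARIANTS:
        return Decision.SHOULD_HONOUR
    if font_has_glyph:
        return Decision.MAY_HONOUR
    return Decision.IGNORE


if __name__ == "__main__":
    print(decide("h", "bold"))
    print(decide("\u0131", "bold", font_has_glyph=True))
    print(decide("\u0131", "bold"))
```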

In your example,

"h" (U+0068) is the base character.

U+210E   (italic)
U+1D421  (bold)
U+1D489  (bold-italic)
U+1D4BD  (script)
U+1D4F1  (bold-script)
U+1D525  (fraktur)
U+1D58D  (bold-fraktur)
U+1D559  (double-struck)
U+1D5C1  (sans-serif)
U+1D629  (sans-serif-italic)
U+1D5F5  (sans-serif-bold)
U+1D65D  (sans-serif-bold-italic)
U+1D691  (monospace)
are separate code-points for all the applicable variants.
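
A list like this is easy to cross-check against the Unicode character database. The sketch below uses Python's standard `unicodedata` module; the dictionary name `H_VARIANTS` and the helper `variant_names` are my own, and the monospace entry is taken to be U+1D691 (MATHEMATICAL MONOSPACE SMALL H).

```python
import unicodedata

# Variants of the base character "h" (U+0068), keyed by the labels used above.
H_VARIANTS = {
    "italic": 0x210E,
    "bold": 0x1D421,
    "bold-italic": 0x1D489,
    "script": 0x1D4BD,
    "bold-script": 0x1D4F1,
    "fraktur": 0x1D525,
    "bold-fraktur": 0x1D58D,
    "double-struck": 0x1D559,
    "sans-serif": 0x1D5C1,
    "sans-serif-italic": 0x1D629,
    "sans-serif-bold": 0x1D5F5,
    "sans-serif-bold-italic": 0x1D65D,
    "monospace": 0x1D691,
}


def variant_names(table):
    """Return {variant label: official Unicode character name}."""
    return {label: unicodedata.name(chr(cp)) for label, cp in table.items()}


if __name__ == "__main__":
    for label, name in variant_names(H_VARIANTS).items():
        print(f"U+{H_VARIANTS[label]:04X}  {label:24} {name}")
```

Every name except the italic one ends in "SMALL H". The italic slot in the Mathematical Alphanumeric Symbols block is left unassigned because the character already existed as U+210E, PLANCK CONSTANT.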

In the case of "dotless i" (U+0131), only the italic variant (U+1D6A4)  
is available as a separate code-point. But nearly every font that  
provides U+0131 also provides a bold variant glyph. And many (21  
different fonts, on my laptop) provide a bold-italic variant. These  
variants are valid mathematical characters, and I think it would be  
crazy to enjoin renderers to ignore the @mathvariant attribute in  
those cases.
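
The asymmetry is easy to confirm by asking the Unicode character database for each name. This is only a sketch: `has_named_char` is a hypothetical helper, and the bold name is a guess, by analogy with the italic one, at what such a character would be called if it existed.

```python
import unicodedata


def has_named_char(name):
    """True if the Unicode character database knows a character by this name."""
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    for name in ("MATHEMATICAL ITALIC SMALL DOTLESS I",
                 "MATHEMATICAL BOLD SMALL DOTLESS I"):
        print(name, "->", has_named_char(name))
```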

For Arabic letters, I assume the "base character" would be the  
isolated form:

U+FEA1 for hah

with U+FEA2-U+FEA4 being, respectively, the final, initial and medial
variants.
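
For reference, the Unicode character database records how these four presentation forms relate to the underlying letter (U+062D). The following sketch, with the hypothetical helper `presentation_forms`, prints each form's name and compatibility decomposition using Python's standard `unicodedata` module. It says nothing about which of them the Spec intends as the "base character".

```python
import unicodedata


def presentation_forms(first, last):
    """Map each code point in [first, last] to (name, compatibility decomposition)."""
    return {
        cp: (unicodedata.name(chr(cp)), unicodedata.decomposition(chr(cp)))
        for cp in range(first, last + 1)
    }


if __name__ == "__main__":
    for cp, (name, decomp) in presentation_forms(0xFEA1, 0xFEA4).items():
        print(f"U+{cp:04X}  {name:36} {decomp}")
```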

N.b. the "tailed", "looped" and "stretched"  variants, listed in the  
Spec, do not exist as Unicode code points, nor as variants provided by  
any of the fonts on my system (though, admittedly, there may be  
specialized Arabic math fonts that I do not possess).

But if, as you suggest, the notion of "base character" is to be  
significant in the specification of @mathvariant, then --- at a  
minimum --- the Spec should spell out what the definition of "base  
character" is. For instance, is my assumption above, about Arabic  
letters, correct?

Second, the @mathvariant is applied to elements (e.g. <mi>), *not* to  
characters. So it would be worth spelling out how, e.g.,

     <mover><mi mathvariant="bold">pro&jmath;</mi><mo>&OverBar;</mo></mover>

is to be handled.

> These opinions represent my current understanding of the spec, and
> are almost certainly not shared by the working group.

I hope the Spec text can be clarified, along these or other lines.

Jacques

Received on Thursday, 1 October 2009 13:46:31 UTC