Re: [MathML3-last-call] mathvariant from Karl Tomlinson on 2009-09-28 (www-math@w3.org from September 2009)

From: Karl Tomlinson <w3@karlt.net>
Date: Mon, 28 Sep 2009 18:21:47 +1300
To: www-math@w3.org
CC: Jacques Distler <distler@golem.ph.utexas.edu>
Message-ID: <87eipriqz8.fsf@karlt.net>

On Thu, 24 Sep 2009 17:27:21 -0500, Jacques Distler wrote:

> I really don't like the philosophy behind the paragraph on
> @mathvariant, specifically, the statement [1]:
>
>    By design, the only cases that have an unambiguous
>    interpretation are exactly the ones that correspond to SMP
>    Math Alphanumeric Symbol characters, which are enumerated in
>    Section 7.5 Mathematical Alphanumeric Symbols. The
>    mathvariant values "initial", "tailed", "looped" and
>    "stretched" are expected to apply only to Arabic
>    characters. In all other cases, it is suggested that
>    renderers ignore the value of the mathvariant attribute if it
>    is present.
>
> There are three ways a renderer can satisfy a request for a
> particular  mathvariant.
>
> [...]
>
> I don't think the Spec should be micromanaging how a renderer is
> supposed to deal with @mathvariant.

I didn't read this as micromanaging *how* a renderer should
satisfy the request for a particular mathvariant, but rather
*when* the mathvariant value will affect the character.

I think it is important that the spec is clear and specific about
*when* the mathvariant value has an effect, because this attribute
can change the character (and thus the meaning), not just the style.
Also, the default value for mathvariant is very often "italic",
but not all single-character identifiers should be italicized.
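
As a rough illustration of that implicit default (a Python sketch, not
text from either spec; the function name is hypothetical), the rule for
<mi> is that single-character content defaults to "italic" and anything
else to "normal", unless the attribute is given explicitly:

```python
def effective_mathvariant(mi_text, explicit=None):
    """Return the mathvariant that applies to an <mi> element.

    mi_text  -- the character data of the element
    explicit -- the mathvariant attribute value, or None if absent
    """
    if explicit is not None:
        return explicit
    # Python 3 strings are sequences of code points, so a single
    # Plane 1 character also counts as one character here.
    return "italic" if len(mi_text.strip()) == 1 else "normal"


print(effective_mathvariant("x"))            # single character, no attribute
print(effective_mathvariant("sin"))          # multi-character identifier
print(effective_mathvariant("x", "normal"))  # explicit opt-out
```

So any single-character identifier picks up "italic" without the author
asking for it, which is why the set of characters affected matters.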

However, I don't find the MathML3 spec as clear as MathML2.

In MathML2, the characters affected by mathvariant were listed in
http://www.w3.org/TR/2003/REC-MathML2-20031021/chapter6.html#chars.letter-like-tables

In MathML3,
http://www.w3.org/TR/MathML3/chapter3.html#presm.commatt refers to
"SMP Math Alphanumeric Symbol characters, which are enumerated in
Section 7.5 Mathematical Alphanumeric Symbols", but that section
does not really enumerate the characters.

Section 7.5 does talk about some sets of characters, but I'm left
with doubts about which characters mathvariant will affect.

Section 7.5 refers to Unicode 3.1.  Does this mean that the
mathvariant attribute should only be applied if there exists a
corresponding character in that version of Unicode?

In Unicode 5.1, there are some additional characters in the
Mathematical Alphanumeric Symbols block:

U+1D6A4 MATHEMATICAL ITALIC SMALL DOTLESS I
U+1D6A5 MATHEMATICAL ITALIC SMALL DOTLESS J
U+1D7CA MATHEMATICAL BOLD CAPITAL DIGAMMA
U+1D7CB MATHEMATICAL BOLD SMALL DIGAMMA

Should a mathvariant attribute now transform the following
characters?:

U+0131 LATIN SMALL LETTER DOTLESS I
U+0237 LATIN SMALL LETTER DOTLESS J
U+03DC GREEK LETTER DIGAMMA
U+03DD GREEK SMALL LETTER DIGAMMA

If so, the meaning of these characters in MathML3 would sometimes
be different from their meaning in MathML2.

Perhaps the transformation makes sense and won't cause problems
for these characters, but what if more characters are added to the
Unicode block in the future?
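
To make the question concrete, here is a minimal Python sketch. The
table pairs the four BMP characters above with the four Unicode 5.1
additions; the table name, the function name and the `honor_5_1` flag
are hypothetical and only model the two possible readings of the spec.

```python
import unicodedata

# (unstyled BMP character, mathvariant) -> Unicode 5.1 addition
UNICODE_5_1_ADDITIONS = {
    ("\u0131", "italic"): "\U0001D6A4",  # dotless i
    ("\u0237", "italic"): "\U0001D6A5",  # dotless j
    ("\u03DC", "bold"): "\U0001D7CA",    # capital digamma
    ("\u03DD", "bold"): "\U0001D7CB",    # small digamma
}


def apply_mathvariant(ch, variant, honor_5_1):
    """Map ch under variant, optionally honoring the 5.1 additions."""
    if honor_5_1:
        return UNICODE_5_1_ADDITIONS.get((ch, variant), ch)
    # Reading limited to the older repertoire: these four stay as they are.
    return ch


for (ch, variant), styled in UNICODE_5_1_ADDITIONS.items():
    old = apply_mathvariant(ch, variant, honor_5_1=False)
    new = apply_mathvariant(ch, variant, honor_5_1=True)
    print(unicodedata.name(old), "|", unicodedata.name(new))
```

Under the first reading the character is left alone; under the second
the same markup now denotes a different character.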


The other doubt that I have is with the "certain characters that
were already present in Unicode in the BMP" but "not in the
'expected' sequence in Plane 1."

http://www.w3.org/TR/MathML3/chapter3.html#presm.commatt says
"exactly the ones that correspond to SMP Math Alphanumeric Symbol
characters".

http://www.w3.org/TR/MathML3/chapter7.html#chars.BMP-SMP says
"In this section we explain the correspondence that a MathML
processor should apply between certain characters in Plane 0 (BMP)
of Unicode, modified by the mathvariant attribute, and the Plane 1
Mathematical Alphanumeric Symbol characters"

http://www.w3.org/TR/MathML3/chapter3.html#presm.symbolchars says
"MathML defines a correspondence between token elements with
certain combinations of BMP character data and the mathvariant
attribute and tokens containing SMP Math Alphanumeric Symbol
characters."

These texts seem to suggest that the transformation is only
applied when the Mathematical Alphanumeric Symbol character is in
Plane 1.  Also, reading the surrounding text I get the impression
that the purpose of the mathvariant attribute is specifying SMP
characters on systems that only support BMP characters, and so
mathvariant is not necessary for styled characters in Plane 0.

When I get to "The exact correspondence between a mathematical
alphabetic character and an unstyled character is complicated by
the fact that certain characters that were already present in
Unicode in the BMP", I realize that this is complicated, but there
is no direct instruction about what to do with these characters.

I wonder whether it is complicated because the renderer must take
care not to transform to the holes in the Mathematical Alphanumeric
Symbols block, or because the renderer should transform the
character to the styled BMP character.

Because these BMP characters were enumerated in MathML2, I assume
that the same is intended in MathML3, but because this is not
clearly stated, the other quoted texts raise doubts.

Could the text be changed from just saying that the correspondence
"is complicated" to saying something like:

"Also, for a logical class and unstyled base character where the
corresponding Mathematical Alphanumeric Symbol character is not in
the 'expected' sequence in Plane 1 because it was already present
in Unicode in the BMP, MathML processors should in the same way
treat the base character as the styled character."

(if that is the behavior expected.)
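
A minimal Python sketch of the behavior proposed above, limited to
italic small Latin letters. The function and table names are
hypothetical, and the fallback table has a single entry: italic small
h, whose 'expected' position U+1D455 is a hole because U+210E PLANCK
CONSTANT already existed in the BMP.

```python
import unicodedata

ITALIC_SMALL_A = 0x1D44E  # MATHEMATICAL ITALIC SMALL A

# Styled characters that live in the BMP instead of the Plane 1 sequence.
BMP_STYLED = {("h", "italic"): "\u210E"}  # PLANCK CONSTANT


def italic_small(ch):
    """Return the italic counterpart of a small Latin letter a-z."""
    if not ("a" <= ch <= "z"):
        return ch
    candidate = chr(ITALIC_SMALL_A + ord(ch) - ord("a"))
    if unicodedata.name(candidate, None) is None:
        # The 'expected' code point is unassigned: use the styled BMP
        # character if one is known, otherwise leave ch untouched.
        return BMP_STYLED.get((ch, "italic"), ch)
    return candidate


for ch in "ghi":
    print(ch, unicodedata.name(italic_small(ch)))
```

The other reading (only avoid the holes) would amount to dropping the
BMP_STYLED lookup and returning ch unchanged for "h".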

> If the renderer has the desired glyph available to it, it should
> make  use of it (whether through mechanism (1) or (2)). If it
> doesn't have  the desired glyph available, "MathMLforCSS" is not
> magically going to  solve the problem.

I think you are right there.

> As a specific example, the current Spec says that
>
>    <mi mathvariant="bold-italic">&imath;</mi>
>
> should be ignored, despite the fact that
>
> a) Many fonts (STIXGeneral, Verdana, Trebuchet, Times, Palatino,
> ...),  have a bold-italic variant-glyph for U+0131.
> b) This is a perfectly reasonable variant to request.

Is this a perfectly reasonable variant to request because a bold
italic dotless i is commonly used with a different meaning from
other dotless i characters and a different meaning from U+1D48A
MATHEMATICAL BOLD ITALIC SMALL I?

If so, then it should probably have a Unicode assignment in this
block.

There is still the question though as to whether the addition of
new Unicode characters should change the treatment of existing
characters.  This is not really such a problem with bold-italic
because bold-italic would rarely be specified without expecting a
transformation, but could be a problem with the italic mathvariant
that is an implicit default.
Received on Monday, 28 September 2009 05:22:34 UTC