- From: Sam Dooley <sam@integretechpub.com>
- Date: Mon, 28 Sep 2009 11:04:41 -0600
- To: Karl Tomlinson <w3@karlt.net>, www-math@w3.org
- Cc: Jacques Distler <distler@golem.ph.utexas.edu>
The intent of the mathvariant attribute is to provide a markup solution to represent mathematical characters in a way that protects them from accidental style changes. There are three cases:

(1) Mathematical characters that have assigned code points in the SMP. The mathvariant attribute provides a way to encode these characters using only 16-bit data values. So <mi mathvariant="bold">A</mi> means the same as <mi>𝐀</mi>.

(2) Mathematical characters that have assigned code points in the BMP. These are the "holes" in the alphabets in the SMP, left unassigned because the characters were deemed equivalent to characters that were already in the BMP. The mathvariant attribute provides an alternate way to encode these characters, even though they really don't need it. The exact list of these characters is given in chapter 7. So <mi mathvariant="script">R</mi> means the same as <mi>ℛ</mi>.

(3) Mathematical characters that have no assigned code point. The mathvariant attribute provides a way to encode characters that could not be encoded otherwise. So "bold italic dotless i" does not have an assigned code point, but makes perfect sense (to me at least) as a mathematical character. A renderer should feel free to do the reasonable thing if it sees <mi mathvariant="bold-italic">ı</mi> and it has a font that contains it. A "sans-serif alpha" as a mathematical character is another example.

Each implementor should choose which combinations to support, and I could make different choices than someone else. We should be able to agree on characters related to the alphabets in U+1D400-U+1D7FF, as described above, but beyond that I would expect support to vary, perhaps widely.

So I agree that the wording of the spec should be clarified in a few places:

(1) The spec should be clearer about when the mathvariant value identifies a distinct mathematical character, whether or not it has its own Unicode code point. Perhaps it is enough to say that mathvariant values can be used with Basic Latin (U+20-U+7E), Greek capital/small/symbol, dotless i/j and digamma BMP characters to encode a mathematical character.

(2) The spec should not say to ignore mathematical characters that do not have an assigned code point, such as bold italic dotless i or sans-serif alpha, but should warn that such characters may not be widely supported by existing fonts.

(3) The spec should provide a definition of mathvariant that is not dependent on a specific version of Unicode, other than to say that if new Unicode characters are introduced in the future, they may be considered to be equivalent to an existing mathvariant encoding for the character.

(4) The spec should clarify that the intent is not to transform character code points from the BMP into the SMP, but to provide a markup solution to represent mathematical characters that may or may not have a simpler representation as a Unicode code point.

(5) The spec should clarify that when the mathvariant attribute is applied to a Unicode code point that already identifies a mathematical character, the mathvariant implied by the code point overrides any external mathvariant value. So <mi mathvariant="bold">ℎ</mi> should be equivalent to <mi mathvariant="italic">h</mi>, as implied by U+210E, and if you want the bold italic h, use <mi mathvariant="bold-italic">h</mi> or (equivalently) <mi>𝒉</mi>.

In a sense, the intended mapping could be described as transforming from (BMP + SMP) into (mathvariant x BMP), where the reverse mapping is not always defined, even for the alphabets of interest. But I don't mean to suggest that any specific implementation should turn out to be better or worse than any other.

These opinions represent my current understanding of the spec, and are almost certainly not shared by the working group.

Sam
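For illustration only, the three cases and point (5) can be sketched in Python roughly as follows. This is a minimal sketch, not anything taken from the spec or from an existing renderer: the names encode, effective, VARIANT_NAMES and HOLES are hypothetical, the table of "holes" is deliberately partial (chapter 7 has the real list), and only Basic Latin letters are handled.

```python
import unicodedata

# Unicode name prefixes for a few mathvariant values (illustrative subset).
VARIANT_NAMES = {
    "bold": "MATHEMATICAL BOLD",
    "italic": "MATHEMATICAL ITALIC",
    "bold-italic": "MATHEMATICAL BOLD ITALIC",
    "script": "MATHEMATICAL SCRIPT",
}

# Case (2): a PARTIAL, hypothetical table of the "holes", i.e. characters
# whose code point lives in the BMP instead of in U+1D400-U+1D7FF.
HOLES = {
    ("italic", "h"): "\u210E",  # PLANCK CONSTANT
    ("script", "H"): "\u210B",  # SCRIPT CAPITAL H
    ("script", "R"): "\u211B",  # SCRIPT CAPITAL R
}


def encode(variant, ch):
    """Return the single character equivalent to <mi mathvariant=variant>ch</mi>,
    or None when no code point is known to this sketch (case 3)."""
    if (variant, ch) in HOLES:                      # case (2)
        return HOLES[(variant, ch)]
    prefix = VARIANT_NAMES.get(variant)
    if prefix is None or not (ch.isascii() and ch.isalpha()):
        return None                                 # outside this sketch's scope
    case = "CAPITAL" if ch.isupper() else "SMALL"
    try:                                            # case (1)
        return unicodedata.lookup(f"{prefix} {case} {ch.upper()}")
    except KeyError:
        return None


def effective(variant, ch):
    """Point (5): a code point that already identifies a mathematical
    character keeps its own implied variant; the external one is ignored."""
    already_math = 0x1D400 <= ord(ch) <= 0x1D7FF or ch in HOLES.values()
    return ch if already_math else encode(variant, ch)
```

Under these assumptions, encode("bold", "A") gives U+1D400, encode("script", "R") gives U+211B, encode("bold-italic", "ı") gives None, and effective("bold", "ℎ") leaves U+210E unchanged, matching encode("italic", "h"). A real implementation would still have to decide what to render in the None case, which is exactly where I would expect support to vary.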
Received on Monday, 28 September 2009 17:18:02 UTC