- From: Murray Sargent <murrays@exchange.microsoft.com>
- Date: Fri, 23 Mar 2018 23:06:09 +0000
- To: David Carlisle <davidc@nag.co.uk>, "www-math@w3.org" <www-math@w3.org>
- CC: Frédéric WANG <fred.wang@free.fr>
- Message-ID: <SN2PR00MB0176380F1AA1E90FFE9A380487A80@SN2PR00MB0176.namprd00.prod.outlook.com>
FWIW, Microsoft Office math has always used combining marks in the U+0300..U+036F and U+20D0..U+20F1 ranges, but the MathML converters translate them to popular spacing accents. Here’s part of a table used for UnicodeMath:
{0x0300, lsmservbcAccentAbove}, //grave # COMBINING GRAVE ACCENT
{0x0301, lsmservbcAccentAbove}, //acute # COMBINING ACUTE ACCENT
{0x0302, lsmservbcAccentAbove}, //flex # COMBINING CIRCUMFLEX ACCENT
{0x0303, lsmservbcAccentAbove}, //tilde # COMBINING TILDE
{0x0304, lsmservbcAccentAbove}, //macron # COMBINING MACRON
{0x0305, lsmservbcAccentAbove}, //overline # COMBINING OVERLINE
{0x0306, lsmservbcAccentAbove}, //breve # COMBINING BREVE
{0x0307, lsmservbcAccentAbove}, //dot # COMBINING DOT ABOVE
{0x0308, lsmservbcAccentAbove}, // # COMBINING DIAERESIS
{0x0309, lsmservbcAccentAbove}, // # COMBINING HOOK ABOVE
{0x030A, lsmservbcAccentAbove}, // # COMBINING RING ABOVE
{0x030B, lsmservbcAccentAbove}, // # COMBINING DOUBLE ACUTE ACCENT
{0x030C, lsmservbcAccentAbove}, // # COMBINING CARON
{0x030D, lsmservbcAccentAbove}, // # COMBINING VERTICAL LINE ABOVE
{0x030E, lsmservbcAccentAbove}, // # COMBINING DOUBLE VERTICAL LINE ABOVE
{0x030F, lsmservbcAccentAbove}, // # COMBINING DOUBLE GRAVE ACCENT
{0x0310, lsmservbcAccentAbove}, // # COMBINING CANDRABINDU
{0x0311, lsmservbcAccentAbove}, // # COMBINING INVERTED BREVE
{0x0312, lsmservbcAccentAbove}, // # COMBINING TURNED COMMA ABOVE
{0x0313, lsmservbcAccentAbove}, // # COMBINING COMMA ABOVE
{0x0314, lsmservbcAccentAbove}, // # COMBINING REVERSED COMMA ABOVE
{0x0315, lsmservbcAccentAbove}, // # COMBINING COMMA ABOVE RIGHT
{0x0316, lsmservbcAccentBelow}, // # COMBINING GRAVE ACCENT BELOW
{0x0317, lsmservbcAccentBelow}, // # COMBINING ACUTE ACCENT BELOW
{0x0318, lsmservbcAccentBelow}, // # COMBINING LEFT TACK BELOW
{0x0319, lsmservbcAccentBelow}, // # COMBINING RIGHT TACK BELOW
{0x031A, lsmservbcAccentAbove}, // # COMBINING LEFT ANGLE ABOVE
{0x031B, lsmservbcAccentAbove}, // # COMBINING HORN
{0x031C, lsmservbcAccentBelow}, // # COMBINING LEFT HALF RING BELOW
{0x031D, lsmservbcAccentBelow}, // # COMBINING UP TACK BELOW
{0x031E, lsmservbcAccentBelow}, // # COMBINING DOWN TACK BELOW
{0x031F, lsmservbcAccentBelow}, // # COMBINING PLUS SIGN BELOW
{0x0320, lsmservbcAccentBelow}, // # COMBINING MINUS SIGN BELOW
{0x0321, lsmservbcAccentBelow}, // # COMBINING PALATALIZED HOOK BELOW
{0x0322, lsmservbcAccentBelow}, // # COMBINING RETROFLEX HOOK BELOW
{0x0323, lsmservbcAccentBelow}, // # COMBINING DOT BELOW
{0x0324, lsmservbcAccentBelow}, // # COMBINING DIAERESIS BELOW
{0x0325, lsmservbcAccentBelow}, // # COMBINING RING BELOW
{0x0326, lsmservbcAccentBelow}, // # COMBINING COMMA BELOW
{0x0327, lsmservbcAccentBelow}, // # COMBINING CEDILLA
{0x0328, lsmservbcAccentBelow}, // # COMBINING OGONEK
{0x0329, lsmservbcAccentBelow}, // # COMBINING VERTICAL LINE BELOW
{0x032A, lsmservbcAccentBelow}, // # COMBINING BRIDGE BELOW
{0x032B, lsmservbcAccentBelow}, // # COMBINING INVERTED DOUBLE ARCH BELOW
{0x032C, lsmservbcAccentBelow}, // # COMBINING CARON BELOW
{0x032D, lsmservbcAccentBelow}, // # COMBINING CIRCUMFLEX ACCENT BELOW
{0x032E, lsmservbcAccentBelow}, // # COMBINING BREVE BELOW
{0x032F, lsmservbcAccentBelow}, // # COMBINING INVERTED BREVE BELOW
{0x0330, lsmservbcAccentBelow}, // # COMBINING TILDE BELOW
{0x0331, lsmservbcAccentBelow}, // # COMBINING MACRON BELOW
{0x0332, lsmservbcAccentBelow}, // # COMBINING LOW LINE
{0x0333, lsmservbcAccentBelow}, // # COMBINING DOUBLE LOW LINE
{0x0337, lsmservbcAccentAbove}, // # COMBINING SHORT SOLIDUS OVERLAY
{0x0338, lsmservbcAccentAbove}, // # COMBINING LONG SOLIDUS OVERLAY
{0x0339, lsmservbcAccentBelow}, // # COMBINING RIGHT HALF RING BELOW
{0x033A, lsmservbcAccentBelow}, // # COMBINING INVERTED BRIDGE BELOW
{0x033B, lsmservbcAccentBelow}, // # COMBINING SQUARE BELOW
{0x033C, lsmservbcAccentBelow}, // # COMBINING SEAGULL BELOW
{0x033D, lsmservbcAccentAbove}, // # COMBINING X ABOVE
{0x033E, lsmservbcAccentAbove}, // # COMBINING VERTICAL TILDE
{0x033F, lsmservbcAccentAbove}, // # COMBINING DOUBLE OVERLINE
{0x0340, lsmservbcAccentAbove}, // # COMBINING GRAVE TONE MARK
{0x0341, lsmservbcAccentAbove}, // # COMBINING ACUTE TONE MARK
{0x0342, lsmservbcAccentAbove}, // # COMBINING GREEK PERISPOMENI
{0x0343, lsmservbcAccentAbove}, // # COMBINING GREEK KORONIS
{0x0344, lsmservbcAccentAbove}, // # COMBINING GREEK DIALYTIKA TONOS
{0x0345, lsmservbcAccentBelow}, // # COMBINING GREEK YPOGEGRAMMENI
{0x0346, lsmservbcAccentAbove}, // # COMBINING BRIDGE ABOVE
{0x0347, lsmservbcAccentBelow}, // # COMBINING EQUALS SIGN BELOW
{0x0348, lsmservbcAccentBelow}, // # COMBINING DOUBLE VERTICAL LINE BELOW
{0x0349, lsmservbcAccentBelow}, // # COMBINING LEFT ANGLE BELOW
{0x034A, lsmservbcAccentAbove}, // # COMBINING NOT TILDE ABOVE
{0x034B, lsmservbcAccentAbove}, // # COMBINING HOMOTHETIC ABOVE
{0x034C, lsmservbcAccentAbove}, // # COMBINING ALMOST EQUAL TO ABOVE
{0x034D, lsmservbcAccentBelow}, // # COMBINING LEFT RIGHT ARROW BELOW
{0x034E, lsmservbcAccentBelow}, // # COMBINING UPWARDS ARROW BELOW
{0x0350, lsmservbcAccentAbove}, // # COMBINING RIGHT ARROWHEAD ABOVE
{0x0351, lsmservbcAccentAbove}, // # COMBINING LEFT HALF RING ABOVE
{0x0352, lsmservbcAccentAbove}, // # COMBINING FERMATA
{0x0353, lsmservbcAccentBelow}, // # COMBINING X BELOW
{0x0354, lsmservbcAccentBelow}, // # COMBINING LEFT ARROWHEAD BELOW
{0x0355, lsmservbcAccentBelow}, // # COMBINING RIGHT ARROWHEAD BELOW
{0x0356, lsmservbcAccentBelow}, // # COMBINING RIGHT ARROWHEAD AND UP ARROWHEAD BELOW
{0x0357, lsmservbcAccentAbove}, // # COMBINING RIGHT HALF RING ABOVE
{0x0358, lsmservbcAccentAbove}, // # COMBINING DOT ABOVE RIGHT
{0x0359, lsmservbcAccentBelow}, // # COMBINING ASTERISK BELOW
{0x035A, lsmservbcAccentBelow}, // # COMBINING DOUBLE RING BELOW
{0x035B, lsmservbcAccentAbove}, // # COMBINING ZIGZAG ABOVE
{0x0363, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER A
{0x0364, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER E
{0x0365, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER I
{0x0366, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER O
{0x0367, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER U
{0x0368, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER C
{0x0369, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER D
{0x036A, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER H
{0x036B, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER M
{0x036C, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER R
{0x036D, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER T
{0x036E, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER V
{0x036F, lsmservbcAccentAbove}, // # COMBINING LATIN SMALL LETTER X
{0x20D0, lsmservbcAccentAbove}, // # COMBINING LEFT HARPOON ABOVE
{0x20D1, lsmservbcAccentAbove}, // # COMBINING RIGHT HARPOON ABOVE
{0x20D4, lsmservbcAccentAbove}, // # COMBINING ANTICLOCKWISE ARROW ABOVE
{0x20D5, lsmservbcAccentAbove}, // # COMBINING CLOCKWISE ARROW ABOVE
{0x20D6, lsmservbcAccentAbove}, // # COMBINING LEFT ARROW ABOVE
{0x20D7, lsmservbcAccentAbove}, // # COMBINING RIGHT ARROW ABOVE
{0x20DB, lsmservbcAccentAbove}, // # COMBINING THREE DOTS ABOVE
{0x20DC, lsmservbcAccentAbove}, // # COMBINING FOUR DOTS ABOVE
{0x20E1, lsmservbcAccentAbove}, // # COMBINING LEFT RIGHT ARROW ABOVE
{0x20E8, lsmservbcAccentBelow}, // # COMBINING TRIPLE UNDERDOT
{0x20E9, lsmservbcAccentAbove}, // # COMBINING WIDE BRIDGE ABOVE
{0x20EC, lsmservbcAccentBelow}, // # COMBINING RIGHTWARDS HARPOON WITH BARB DOWNWARDS
{0x20ED, lsmservbcAccentBelow}, // # COMBINING LEFTWARDS HARPOON WITH BARB DOWNWARDS
{0x20EE, lsmservbcAccentBelow}, // # COMBINING LEFT ARROW BELOW
{0x20EF, lsmservbcAccentBelow}, // # COMBINING RIGHT ARROW BELOW
{0x20F0, lsmservbcAccentAbove}, // # COMBINING ASTERISK ABOVE
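For readers who want to experiment with the excerpt above, here is a minimal C sketch of how such a table might be declared and queried. Only the enumerator names lsmservbcAccentAbove and lsmservbcAccentBelow come from the table; the enum type, the lsmservbcNone member, the struct layout, and the accent_placement function are assumptions made for illustration and are not taken from the Office sources. Only a handful of rows are copied in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical enum: only the two Accent* names appear in the table;
   lsmservbcNone and the numeric values are assumptions. */
typedef enum {
    lsmservbcNone = 0,
    lsmservbcAccentAbove,
    lsmservbcAccentBelow
} AccentPlacement;

/* Hypothetical row layout matching the {code point, placement} pairs. */
typedef struct {
    uint32_t codepoint;
    AccentPlacement placement;
} AccentEntry;

/* A few rows copied from the table, in ascending code point order. */
static const AccentEntry accent_table[] = {
    {0x0301, lsmservbcAccentAbove}, /* COMBINING ACUTE ACCENT */
    {0x0303, lsmservbcAccentAbove}, /* COMBINING TILDE */
    {0x0323, lsmservbcAccentBelow}, /* COMBINING DOT BELOW */
    {0x0332, lsmservbcAccentBelow}, /* COMBINING LOW LINE */
    {0x20D7, lsmservbcAccentAbove}, /* COMBINING RIGHT ARROW ABOVE */
    {0x20EF, lsmservbcAccentBelow}, /* COMBINING RIGHT ARROW BELOW */
};

/* Binary search needs strictly ascending code points. */
static int table_is_sorted(void) {
    size_t n = sizeof accent_table / sizeof accent_table[0];
    for (size_t i = 1; i < n; i++)
        if (accent_table[i - 1].codepoint >= accent_table[i].codepoint)
            return 0;
    return 1;
}

/* Three-way comparison of two entries by code point, for bsearch. */
static int compare_entries(const void *key, const void *elem) {
    uint32_t k = ((const AccentEntry *)key)->codepoint;
    uint32_t e = ((const AccentEntry *)elem)->codepoint;
    return (k > e) - (k < e);
}

/* Returns the placement for a combining mark,
   or lsmservbcNone if the code point is not listed. */
AccentPlacement accent_placement(uint32_t codepoint) {
    AccentEntry key = {codepoint, lsmservbcNone};
    const AccentEntry *hit;
    assert(table_is_sorted());
    hit = bsearch(&key, accent_table,
                  sizeof accent_table / sizeof accent_table[0],
                  sizeof accent_table[0], compare_entries);
    return hit ? hit->placement : lsmservbcNone;
}
```

With these rows, accent_placement(0x0303) yields lsmservbcAccentAbove and accent_placement(0x0323) yields lsmservbcAccentBelow; a code point that is not in the table, such as 0x0041, yields lsmservbcNone.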
From: David Carlisle <davidc@nag.co.uk>
Sent: Friday, March 23, 2018 6:46 AM
To: www-math@w3.org
Subject: Re: [MathML 4] Add rules to map from non-combining to combining accents
On 23/03/2018 13:38, Frédéric WANG wrote:
>
>>
>> Yes we should say something (as it happens TeX also has difficulties
>> with combining characters)
>>
>> somewhere around
>>
>> https://w3c.github.io/mathml/chapter7.html#chars.comb-chars
>>
>> I guess is the place to add something.
>>
>> As you hint I may need to add some extra data to unicode.xml to specify
>> which characters are related in this way, I don't think the existing
>> Unicode data reliably says which are equivalent combining/non combining
>> forms although obviously taking the character name and deleting
>> "Combining" gives a first approximation of the mapping.
>>
>>
>> David
>
> Hi David,
>
> Indeed, I would prefer an explicit list for better interoperability. Are
> you able to come up with one list, maybe doing the change in the repo of
> unicode.xml? For your information, below are the horizontal
> stretchy constructions available in some popular OpenType MATH fonts. I
> believe other fonts also only provide constructions for combining
> versions of accents.
>
> ...
I'll see what I can do.....
David
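The "first approximation" mentioned in the quoted text (take the character name and delete "Combining") could be sketched in C roughly as follows. The function name strip_combining_prefix and its signature are hypothetical, and the sketch only handles a leading "COMBINING " in upper-case Unicode names; it does not consult unicode.xml or any Unicode data file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `name` into `out` without a leading "COMBINING " prefix.
   Returns 1 if the prefix was present and removed, 0 otherwise.
   The result is truncated if `out_size` is too small and is always
   NUL-terminated when out_size > 0. */
int strip_combining_prefix(const char *name, char *out, size_t out_size) {
    static const char prefix[] = "COMBINING ";
    const size_t prefix_len = sizeof prefix - 1;
    int had_prefix;
    const char *rest;

    assert(name != NULL && out != NULL);

    had_prefix = strncmp(name, prefix, prefix_len) == 0;
    rest = had_prefix ? name + prefix_len : name;
    if (out_size > 0) {
        strncpy(out, rest, out_size - 1);
        out[out_size - 1] = '\0';
    }
    return had_prefix;
}
```

For "COMBINING TILDE" this gives "TILDE", which is the name of U+007E. For "COMBINING RIGHT ARROW ABOVE" it gives "RIGHT ARROW ABOVE", which is not a character name (the spacing arrow U+2192 is RIGHTWARDS ARROW), so the name-based rule is only an approximation and an explicit list is still needed.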
Received on Friday, 23 March 2018 23:06:40 UTC