RE: [MathML 4] Add rules to map from non-combining to combining accents

FWIW, RichEdit’s MathML reader uses the following table to translate from non-combining to combining marks:

static const struct {WCHAR spacing; WCHAR comb_over; WCHAR comb_under;} SpacingToCombiningTable[] = {
       {0x2D8, 0x306, 0x32E}, // BREVE (TeX breve)
       {0xB8,  0x312, 0x327}, // CEDILLA
       {0x60,  0x300, 0x316}, // GRAVE ACCENT (TeX grave)
       {0x2D,  0x305, 0x332}, // HYPHEN-MINUS/OVERLINE (TeX bar)
       {0x2212,0x305, 0x332}, // MINUS SIGN/OVERLINE (TeX bar)
       {0x2E,  0x307, 0x323}, // FULL STOP/DOT ABOVE (TeX dot)
       {0x2D9, 0x307, 0x323}, // DOT ABOVE (TeX dot)
       {0x2DD, 0x30B, 0x2DD}, // DOUBLE ACUTE ACCENT (no "below" form)
       {0xB4,  0x301, 0x317}, // ACUTE ACCENT (TeX acute)
       {0x7E,  0x303, 0x330}, // TILDE (TeX tilde)
       {0x2DC, 0x303, 0x330}, // SMALL TILDE (TeX tilde)
       {0xA8,  0x308, 0x324}, // DIAERESIS (TeX ddot)
       {0x2C7, 0x30C, 0x32C}, // CARON (TeX check)
       {0x5E,  0x302, 0x32D}, // CIRCUMFLEX ACCENT (TeX hat)
       {0xAF,  0x305, 0},     // MACRON
       {0x5F,  0,     0x332}, // LOW LINE
       {0x2192, 0x20D7, 0x20EF}, // RIGHTWARDS ARROW (TeX vec)
       {0x27F6, 0x20D7, 0x20EF}, // LONG RIGHTWARDS ARROW (TeX vec)
};
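
A minimal sketch of how a table like this might be consulted. This is not RichEdit's actual code: the helper name SpacingToCombining, its `under` flag, the WCHAR typedef (a stand-in so it compiles outside Windows), and the choice to return the input unchanged when there is no mapping are all assumptions for illustration. Only a few rows of the table are repeated.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the Windows WCHAR type (all entries are BMP). */
typedef uint16_t WCHAR;

/* Excerpt of the table above; the full table has more rows. */
static const struct {WCHAR spacing; WCHAR comb_over; WCHAR comb_under;} SpacingToCombiningTable[] = {
    {0x5E,   0x302,  0x32D},  // CIRCUMFLEX ACCENT (TeX hat)
    {0x7E,   0x303,  0x330},  // TILDE (TeX tilde)
    {0xAF,   0x305,  0},      // MACRON (no "below" entry)
    {0x5F,   0,      0x332},  // LOW LINE (no "above" entry)
    {0x2192, 0x20D7, 0x20EF}, // RIGHTWARDS ARROW (TeX vec)
};

/* Hypothetical helper: returns the combining mark for ch, taken from the
   "under" column if under is nonzero and from the "over" column otherwise.
   If ch is not in the table, or the chosen column holds 0, ch is returned
   unchanged. */
static WCHAR SpacingToCombining(WCHAR ch, int under)
{
    size_t n = sizeof(SpacingToCombiningTable) / sizeof(SpacingToCombiningTable[0]);

    for (size_t i = 0; i < n; i++) {
        if (SpacingToCombiningTable[i].spacing == ch) {
            WCHAR comb = under ? SpacingToCombiningTable[i].comb_under
                               : SpacingToCombiningTable[i].comb_over;
            return comb ? comb : ch;
        }
    }
    return ch; /* no mapping known */
}
```

With this sketch, SpacingToCombining(0x5E, 0) gives 0x302 and SpacingToCombining(0x5E, 1) gives 0x32D, while SpacingToCombining(0xAF, 1) gives back 0xAF because the MACRON row has no "below" entry.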

PowerPoint and OneNote use RichEdit’s native MathML converters. Word uses XSLT stylesheets to import/export MathML (omml2mml.xsl and mml2omml.xsl).

mml2omml.xsl contains:

    { Non-combining, Upper-combining }
      {U+02D8, U+0306}, // BREVE
      {U+00B8, U+0312}, // CEDILLA
      {U+0060, U+0300}, // GRAVE ACCENT
      {U+002D, U+0305}, // HYPHEN-MINUS/OVERLINE
      {U+2212, U+0305}, // MINUS SIGN/OVERLINE
      {U+002E, U+0307}, // FULL STOP/DOT ABOVE
      {U+02D9, U+0307}, // DOT ABOVE
      {U+02DD, U+030B}, // DOUBLE ACUTE ACCENT
      {U+00B4, U+0301}, // ACUTE ACCENT
      {U+007E, U+0303}, // TILDE
      {U+02DC, U+0303}, // SMALL TILDE
      {U+00A8, U+0308}, // DIAERESIS
      {U+02C7, U+030C}, // CARON
      {U+005E, U+0302}, // CIRCUMFLEX ACCENT
      {U+00AF, U+0305}, // MACRON
      {U+005F, ::::::}, // LOW LINE
      {U+2192, U+20D7}, // RIGHTWARDS ARROW
      {U+27F6, U+20D7}, // LONG RIGHTWARDS ARROW
      {U+2190, U+20D6}, // LEFTWARDS ARROW

This is the same as the "over" column of the RichEdit table, except that it adds a left-arrow mapping (I should probably update the RichEdit table). RichEdit has the comment:

// TODO investigate if we need to translate the following characters
       //0x2190 LEFTWARDS ARROW
       //0x2194 LEFT RIGHT ARROW
       //0x294E LEFT BARB UP RIGHT BARB UP HARPOON
       //0x21BC LEFTWARDS HARPOON WITH BARB UPWARDS
       //0x21C0 RIGHTWARDS HARPOON WITH BARB UPWARDS
       //0x23DE TOP CURLY BRACKET
       //0x23B4 TOP SQUARE BRACKET
       //0x23DC TOP PARENTHESIS
       //0x20DB COMBINING THREE DOTS ABOVE
       //0x23DF BOTTOM CURLY BRACKET
       //0x23B5 BOTTOM SQUARE BRACKET
       //0x23DD BOTTOM PARENTHESIS

Thanks,
Murray

Received on Sunday, 4 November 2018 20:43:53 UTC