# Re: mover vs latin chars with diacriticals

From: Bruce Miller <bruce.miller@nist.gov>
Date: Mon, 01 May 2006 12:00:53 -0400
Message-ID: <445630B5.1000609@nist.gov>


White Lynx wrote:
>>By always using mover, you
>>achieve a more _uniform_ encoding, and you make the markup less
>>ambiguous
>
>
> Consider q-dot. In Unicode there is unique representation q&#x0307;
> (q followed by combining dot above). In MathML it could be
> 	<mover><mi>q</mi><mo>.</mo></mover>
> 	<mover><mi>q</mi><mi>.</mi></mover>
> 	<mover><mrow><mi>q</mi></mrow><mrow><mo>.</mo></mrow></mover>
> 	<munderover><mi>q</mi><mi/><mo>.</mo></munderover>
> 	<mover><mrow><mi>q</mi><mrow/></mrow><mo>.</mo></mover>
> 	<mover><mi>q</mi><mo>&#x00b7;</mo></mover>
> 	<mover><mi>q</mi><mo>&#x0307;</mo></mover>
> 	<mover><mi>q</mi><mo>&#x0387;</mo></mover>
> 	<mover><mi>q</mi><mo>&#x2024;</mo></mover>
> 	<mover><mi>q</mi><mo>&#x22C5;</mo></mover>
> I can't write all possible ways of encoding q-dot in MathML
> explicitly as their number is not finite.

Not finite? I guess you're including an infinite nesting
of mrows, presumably to demonstrate that it is a bad idea
to include a grouping construct in MathML.
Several of your examples are invalid Unicode: combining
characters are not to be used in isolation, which is what I
would consider <mo>&#x0307;</mo> to be.
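
A quick way to see which of the quoted "dots" is a combining character is to ask the Unicode character database. The following is only a minimal sketch, assuming Python's standard `unicodedata` module; the helper names `describe` and `is_combining_mark` are made up for illustration, and "combining" is taken here to mean a general category starting with "M".

```python
import unicodedata

# The "dot" candidates from the quoted list, plus the plain full stop.
DOTS = ["\u002e", "\u00b7", "\u0307", "\u0387", "\u2024", "\u22c5"]

def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

def is_combining_mark(ch):
    """Assumption: treat any character in a Mark category (Mn, Mc, Me) as combining."""
    return unicodedata.category(ch).startswith("M")

for ch in DOTS:
    print(describe(ch), "combining" if is_combining_mark(ch) else "spacing")

# Code points of the candidates that are combining marks.
combining_dots = [f"U+{ord(ch):04X}" for ch in DOTS if is_combining_mark(ch)]
```

Of the candidates listed, only U+0307 comes out as a combining mark; the others are ordinary spacing characters.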

Of the remaining infinitely many representations, several will actually
_look_ different. The author may very well have reasons for
choosing one over the other, although there's nothing inherent
in the markup that tells the reader what distinguishes them.
As has been discussed repeatedly on this list, this is either
a strength or weakness, depending on your point of view.
In any case, it appears that Unicode limits the author to
only one choice, and still doesn't say what he/she _meant_. Hmm...

Your line of reasoning would seem to imply that _any_
math markup language, whether MathML, ISO 12083, or one yet
to be invented, would be fatally flawed unless it forbade
fractions with 1 in the numerator and 2 in the denominator.
After all, Unicode already defines &#x00BD;!
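
To make the comparison concrete, here is a small sketch (again assuming Python's standard `unicodedata` module; the variable names are illustrative only) of what Unicode itself provides in the two cases: one-half is a single code point with a compatibility decomposition, whereas q-dot exists only as a base letter plus a combining mark.

```python
import unicodedata

half = "\u00bd"      # VULGAR FRACTION ONE HALF
q_dot = "q\u0307"    # q followed by COMBINING DOT ABOVE
a_dot = "a\u0307"    # a followed by COMBINING DOT ABOVE, for contrast

# Compatibility decomposition turns the single code point into "1", U+2044, "2".
half_nfkd = unicodedata.normalize("NFKD", half)

# Canonical composition: a-dot has a precomposed form, q-dot does not.
q_dot_nfc = unicodedata.normalize("NFC", q_dot)
a_dot_nfc = unicodedata.normalize("NFC", a_dot)

print(unicodedata.name(half), [f"U+{ord(c):04X}" for c in half_nfkd])
print("q-dot after NFC:", [f"U+{ord(c):04X}" for c in q_dot_nfc])
print("a-dot after NFC:", [f"U+{ord(c):04X}" for c in a_dot_nfc])
```

So even within Unicode, one-half has more than one plain-text spelling, and q-dot remains a two-code-point sequence after normalization.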

After filtering out the red herrings, you do suggest an
interesting possibility with your first two cases:
One could argue that
* with <mo>.</mo> an operation on q is implied,
possibly a time derivative,
whereas
* with <mi>.</mi> a composition is implied,
representing a single composed identifier.
(substitute "." with the glyph of your choice :> )
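
As a rough sketch of how a reader of the markup might apply this suggested reading, the function below inspects the second child of a `<mover>`. It assumes un-namespaced MathML fragments and uses Python's standard `xml.etree.ElementTree`; the name `classify_mover` and the returned strings are hypothetical, not part of any MathML tool or of the specification.

```python
import xml.etree.ElementTree as ET

def classify_mover(markup):
    """Hypothetical reading of a <mover> based on the tag of its second child."""
    root = ET.fromstring(markup)
    if root.tag != "mover" or len(root) != 2:
        raise ValueError("expected a <mover> with exactly two children")
    base, over = root
    # Collect the text of the base even if it is wrapped in <mrow>.
    base_text = "".join(base.itertext()).strip()
    if over.tag == "mo":
        return f"operation on {base_text}"
    if over.tag == "mi":
        return f"composed identifier based on {base_text}"
    # Anything else (mrow wrappers, etc.) gets no suggested reading here.
    return "no suggested reading"

print(classify_mover("<mover><mi>q</mi><mo>.</mo></mover>"))
print(classify_mover("<mover><mi>q</mi><mi>.</mi></mover>"))
```

The sketch deliberately declines to interpret the mrow-wrapped variants, since nothing in the markup says whether they differ in meaning.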

So, if we're taking votes :>
I like Richard Kaye's interpretation that Unicode composition
would represent "weird atomic symbols", but MathML markup
would represent operations.  I would augment that with the
exception that markup like that described above could be used
to compose "weird atomic symbols" that might not have a Unicode
equivalent (e.g., g&COMBINING BULLET OVER; )

Of course, these have to be in the form of "suggested usage",
rather than a requirement, for all the reasons that have been
discussed here.

[...]
> Did you take into account that there are a hundred ways of encoding the base in MathML,
> a hundred ways of encoding the dot (which dot?) and a hundred ways of putting them together?

100^3\ll\infty

--
bruce.miller@nist.gov
http://math.nist.gov/~BMiller/

Received on Monday, 1 May 2006 16:01:28 UTC
