Re: mover vs latin chars with diacriticals (also MathML support)

> I thought that o + combining-diaeresis and ö were two different things in
> Unicode even when both are rendered equal. Of course, both are defined to
> be "canonically equivalent" via "canonical decomposition" but are not
> defined to be "equivalent".

You could not have a mathematical (or text) markup scheme that relied on
both forms being available and inferred different semantics in the two
cases. The set of characters for which precomposed forms are available
is an ad hoc list based mainly on historical accident.
In general, when designing a markup language, you have to assume that
there is no precomposed Unicode character for a given base+diacritic,
and that where none exists today, none will be added to Unicode in
later releases either:


http://www.unicode.org/faq/ligature_digraph.html#3

   At this point, the UTC has a default position: no new characters for
   digraphs or pre-composed diacritic letters should be accepted for
   encoding as individual characters. 

So in practice any argument about the exact relationship between the
precomposed character and the sequence using combining characters is
irrelevant. In the vast majority of cases there is no precomposed
character. The precomposed characters cover a reasonable proportion of
the diacritic-base combinations used in European languages, but if you
are using a dot-above to denote derivatives you need that diacritic
(potentially) on any base letter.
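
A minimal sketch of the point, assuming Python 3 and its standard
unicodedata module. The helper name has_precomposed is hypothetical, and
q and theta are simply example bases picked for illustration. It asks
whether NFC normalization collapses a base+diacritic pair into a single
code point, which is a reasonable proxy for "a precomposed form exists":

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"
COMBINING_DOT_ABOVE = "\u0307"


def has_precomposed(base: str, combining: str) -> bool:
    """Return True if NFC turns base+combining into one code point.

    Hypothetical helper for illustration only. A few precomposed
    characters are excluded from NFC composition, so this is a proxy
    rather than an exhaustive lookup.
    """
    composed = unicodedata.normalize("NFC", base + combining)
    return len(composed) == 1


# o + diaeresis has a precomposed form (U+00F6), and the two spellings
# are canonically equivalent: NFD of U+00F6 gives back the sequence.
print(has_precomposed("o", COMBINING_DIAERESIS))
print(unicodedata.normalize("NFD", "\u00f6") == "o" + COMBINING_DIAERESIS)

# Dot-above as a derivative marker: q-dot and theta-dot stay as
# two-code-point sequences after NFC.
for base in ("q", "\u03b8"):
    print(base, has_precomposed(base, COMBINING_DOT_ABOVE))
```

So markup that has to cope with q-dot or theta-dot cannot lean on a
precomposed character; it has to handle the base plus the diacritic in
any case.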

David


Received on Tuesday, 2 May 2006 17:16:02 UTC