Re: [charmod-norm] Combining Diacritics on Same Side Can Commute

The change solves the problem I raised.  There are two niggles.

Is the term 'diacritic' to be taken as equivalent to the term 'mark'?  I have recently discovered that Unicode defines the property 'diacritic' differently, so that viramas are diacritics but Indic vowel marks are not.  I can't find a W3C definition of 'diacritic'.

Additionally, "doesn't matter to the presentation" should be "doesn't matter to canonical equivalence".  When two sequences are not canonically equivalent, the order of the marks can matter to the reading even though it does not affect the rendering!  A simple but heterodox example is <1A56, 1A6B> /lo/ vs. <1A6B, 1A56> /on/.  (I was startled to find two examples of the latter in a dictionary.)  These marks are above and below.  For marks on the same side, above in this case, I could offer <1A65, 1A7B> and <1A7B, 1A65>, though I believe the possible renderings have different frequencies for the two codepoint orders.
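
Whether two orderings of marks are canonically equivalent can be checked mechanically. The following is only a minimal sketch, assuming Python's standard unicodedata module; the helper names canonically_equivalent and describe are invented here for illustration, and the combining classes reported depend on the Unicode version bundled with the interpreter. It compares NFD forms, first for two familiar Latin cases and then for the Tai Tham pairs above (whose results are printed rather than asserted).

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent iff their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def describe(seq: str) -> list:
    """Return (code point, canonical combining class) for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.combining(c)) for c in seq]


ACUTE, DIAERESIS, DOT_BELOW = "\u0301", "\u0308", "\u0323"

# Marks with different non-zero combining classes (above vs. below) commute.
print(canonically_equivalent("a" + ACUTE + DOT_BELOW, "a" + DOT_BELOW + ACUTE))  # True

# Marks with the same combining class (both above) do not commute.
print(canonically_equivalent("a" + ACUTE + DIAERESIS, "a" + DIAERESIS + ACUTE))  # False

# The Tai Tham pairs from the message: print the combining classes and
# whether swapping the two code points preserves canonical equivalence.
for first, second in [("\u1A56", "\u1A6B"), ("\u1A65", "\u1A7B")]:
    ab, ba = first + second, second + first
    print(describe(ab), canonically_equivalent(ab, ba))
```

Normalization never reorders a character whose combining class is 0, so any pair that includes such a mark is order-sensitive for canonical equivalence regardless of where the marks are rendered.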

-- 
GitHub Notification of comment by Richard57
Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/167#issuecomment-405076945 using your GitHub account

Received on Sunday, 15 July 2018 08:52:17 UTC