- From: Michael Day <mikeday@yeslogic.com>
- Date: Fri, 18 Apr 2003 12:00:38 +1000 (EST)
- To: Ernest Cline <ernestcline@mindspring.com>
- Cc: www-html@w3.org, <Donna.Worby@dardni.gov.uk>
Hi Ernest,

> I strongly doubt that an 'Mc' character will ever be part of Unicode.
> The Unicode view is that 'Mc' is what the standard refers to as a
> grapheme, and as such it should be encoded as two characters 'M' and
> 'c'. Existing multi-letter characters, such as 'Dz', were included in
> Unicode only because they existed in pre-UNICODE character sets and
> were therefore included in Unicode to facilitate conversion between
> those character sets and Unicode on a character for character basis.

That's interesting. So, given that "Mc" is rendered differently and
collated differently from the sequence of two characters "M" and "c",
how should this be handled? Is it in fact an issue of script/language,
in the same way that Spanish collates the character combinations "ll"
and "ch" differently?

Presumably then if the sequence "Mc" is encountered in text with
language en-UK (or some other code?) it should be collated differently
and rendered using a superscript c or other method. Surely there must
be some existing standard for this?

Michael Day
YesLogic Pty. Ltd.
Received on Thursday, 17 April 2003 20:44:26 UTC