- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Tue, 10 Jul 2001 20:17:24 +0800
- To: <xml-dev@lists.xml.org>
- Cc: "www-xml-blueberry-comments" <www-xml-blueberry-comments@w3.org>
In Unicode 3.1 there are added special function characters that allow new characters to be composed positionally from parts. These are intended for very rare or new characters only.

There have been several thousand years of research into what the primitive components of Han ideographs are. It is only now that we have computers and large databases of characters that it is feasible to try out different alternatives. At Academia Sinica, for example, my friend Prof. C.C. Hsieh devised a system with about 600 components and, I think, 16 composition functions (e.g. side-by-side) which can represent about 98% of the Hanyu lexicon.

Unicode went with a simpler set of functions, but at the cost of some ambiguity: there may be more than one way to represent the same character. This may be fine for text, but it is not good for names, where normalization and comparison are their destiny.

(I don't think these function characters are suitable for use in names, by the way.)

Cheers
Rick Jelliffe

From: "Joel Rees" <rees@server.mediafusion.co.jp>

Oh. I thought of another way around this issue. It is not presently a very satisfying solution, but it may be the ultimate solution, if it would work:

Are ideographic sequences allowed in markup (tags and attributes)? I mean sequences of existing characters with the ideographic description characters mixed in to show how they are supposed to combine.

If so, some truly sophisticated editor of the future would be able to build virtually any character that can be built from the current set of radicals, and we would be able to do with Japanese the equivalent of using "mellyfluus" (the misspelling) in an attribute.
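To make the ambiguity point concrete, here is a minimal Python sketch. It assumes, purely for illustration, two plausible descriptions of the same ideograph (U+6A39): one as a left-right pair and one as a left-middle-right triple. The particular decompositions and the helper name `same_after_nfc` are my own choices, not anything defined by Unicode or XML. The sketch only shows that Unicode normalization does not reconcile two different description sequences, which is why they compare badly as names.

```python
import unicodedata

# Ideographic description characters (operators), written as escapes.
IDC_LEFT_TO_RIGHT = "\u2ff0"       # two components, side by side
IDC_LEFT_MIDDLE_RIGHT = "\u2ff2"   # three components, side by side

# Two hypothetical descriptions of the same ideograph (U+6A39).
# Which decomposition is "right" is exactly the ambiguity in question.
two_part = IDC_LEFT_TO_RIGHT + "\u6728\u5c0c"
three_part = IDC_LEFT_MIDDLE_RIGHT + "\u6728\u58f4\u5bf8"


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(unicodedata.name(IDC_LEFT_TO_RIGHT))
    # Normalization leaves both sequences as they are, so they still differ.
    print(same_after_nfc(two_part, three_part))
```

A name-matching process would therefore need its own canonical form for description sequences, which normalization alone does not supply.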
Received on Tuesday, 10 July 2001 06:11:24 UTC