- From: Rick Jelliffe <ricko@allette.com.au>
- Date: Mon, 18 Nov 1996 01:38:52 +1100
- To: "David G. Durand" <dgd@cs.bu.edu>
- CC: W3C SGML Working Group <w3c-sgml-wg@w3.org>
David G. Durand wrote:
> I have yet to hear anyone offer an argument as to why a character
> string describing missing data (an unknown glyph) is inferior to a number
> describing missing data, especially when the web infrastructure does not
> provide a convenient way to make the private arrangements needed for
> private-use to work (funny property of a publishing medium, isn't it?).

OK, let's decree that all names should be written in Chinese, since they are the most numerous people and characters. They have a phonetic system, so we can spell out any English words in bopomofo: we in the West will only have to learn a few dozen characters, which is trivial, and we can figure out what a name means by looking in some (online) list with representative glyphs :-)

Less flippantly, even simple English names are not clear. It took me a long time to realise that Americans seem to mean by "pound" what we call "hash". So I don't have any confidence that English names are much use to most people in the world. That is the first reason why names are inferior.

The second reason is that there are so many characters that giving them identifiers that also describe them means some of those identifiers must get very complicated, unless you adopt a Hungarian notation (semi-acronym contractions) like Microsoft advocates for C code. In which case you don't have a name anyway, since you need to know the contraction. The Omega thread earlier shows how deceptive it can be to use meaningful-looking identifiers.

The third reason is that the way people seem to like to handle lots of characters is to use a "Keycaps" utility applet, like the ones Windows and Macs (and FrameMaker on UNIX) provide. In this case, the user doesn't care what the identifier is. So the main concern is for machine efficiency rather than readability, IMHO, since users increasingly won't edit in dumb text editors. So both our points may be moot (but my point is less moot than yours :-).

You mention 'unknown glyphs', and I think you really mean arbitrary glyphs not specifiable by ISO 10646+markup+stylesheet/DSSSL+locale. I have been assuming that XML was targeted at 'resolved' (closed-system) data, not (open-system) system-independent data. In a closed system, resolution of arbitrary foreign glyphs must be a transparent function of the application, not an XML parser function that requires any user's intervention. That sounds to me like something better handled in place in element markup rather than entity markup, given the rarity of the glyphs, and in the spirit of element-set-less simplicity from HTML (or is that spectre?)

--
Regards
Rick Jelliffe                 email: ricko@allette.com.au
_______________________________________________________________
Allette Systems (Australia)   email: info@allette.com.au
Level 10, 91 York Street      www: http://www.allette.com.au
Sydney 2000 NSW Australia     phone: +61 2 262 4777
                              fax: +61 2 262 4774
_______________________________________________________________
Received on Sunday, 17 November 1996 09:35:35 UTC