- From: Jon Bosak <bosak@atlantic-83.Eng.Sun.COM>
- Date: Tue, 25 Mar 1997 22:29:08 -0800
- To: w3c-sgml-wg@w3.org
[Peter Flynn:] | Remaining question: do we hardwire ISOLat1, ISOLat2, ISOPub and | ISOTech as being the four most widely-used sets of non-ASCII symbols | in general? This is part of the problem. We went down this road in November and withdrew in horror when we saw where it led. (The person who finally blew the whistle was Anders Berglund, who defined all these ISO sets in the first place.) To put it briefly, there is a lot of overlap among these entity sets, they don't map unambiguously to the Unicode character set, and they don't map cleanly to existing fonts (in particular the very widely deployed Adobe-based "symbol" set), in addition to being 100.00 percent Eurocentric. This approach leads into a morass. So where do you draw the line? Including all of them is a nightmare, and including less than that is arbitrary. Some have said that people are too used to < to accept a numeric reference. Well, I have bad news: a whole lot of people are used to é and —, too. Any case for including predefined entities because it's what people are used to is a case for including all of them. And you can't include all of them because you end up with a stinking mess. We have decided to go with Unicode precisely in order to get beyond all this. Let's get it over with. Jon
Received on Wednesday, 26 March 1997 01:29:11 UTC