- From: Mark Crispin <MRC@Panda.COM>
- Date: Thu, 13 Jun 1996 21:36:53 -0700 (PDT)
- To: Larry Masinter <masinter@parc.xerox.com>
- Cc: mohta@necom830.hpcl.titech.ac.jp, glenn@spyglass.com, Lueko.Willms@t-online.de, borka@e5.ijs.si, ietf-charsets@INNOSOFT.COM, iab-charsets@bunyip.com
On Thu, 13 Jun 1996 20:51:45 PDT, Larry Masinter wrote:
> If people don't stop baiting each other, I'm gonna unsubscribe.
>
> Frankly, I don't have a lot of hope for "rough consensus" on the
> document at hand if we continue to bicker about stuff that isn't even
> in the document.

Gentlemen, please!

This message is mostly addressed to Ohta-san.

Ohta-san, you can rest assured that your position is heard at these meetings. I make sure of that, personally, at every meeting which I attend. Larry, Glenn, et al. can attest to that.

I share some of your viewpoints, although not all of them. I also see the other side of this issue. Han unification is more than "fit everything in 16 bits"; there are benefits gained from it as well as capabilities lost.

For this reason, I think that the only way to move forward is to strike a compromise. A compromise requires amelioration of the problems so that we can all get something that works. This does mean sensitizing the Unicoders to some of the real problems that are faced by us in the plaintext world (and doing so repeatedly). But we, in turn, must also be sensitized to their problems.

And remember: ISO 10646 is definitely not the best universal coded character set; it's the *only* universal coded character set.

Unicode has abundant warts. Han unification is only one of them. But whenever you find warts, you will find people who will call those warts "features". Whether something is a "wart" or a "feature" is a matter of opinion; and in 40 years I've learned that opinions are like anuses; everyone has one.

It's pointless to discuss something (1) that is a matter of opinion, (2) on which no consensus is possible, and (3) on which consensus is completely unnecessary. We can all keep our opinions, and still come up with a technical spec that works.

In your paper at Tsukuba, you defined plaintext as "text with finite state structure".
I'm not convinced that ISO 2022 based solutions, such as your proposal (ISO-2022-JP2 or ISO-2022-INT), are any more "finite state" than an ISO 10646 based solution with "language tags" (I'm putting this term in quotes deliberately, see below).

I am, however, convinced that an ISO 10646 based solution must have a reversible mapping with ISO-2022-JP2. It is obvious that raw Unicode is not sufficient for this purpose.

"Language tags" is a misnomer; we actually are not shifting languages, we are shifting through various alternate glyphs for a particular character. Kobayashi-san from Justsystem suggests "source ID", although this too may have nomenclature problems. Let's not bother with what they are called, and for the time being let's call them "blurdybloops" instead.

So, what is the definition of a "blurdybloop"? This is not a trivial problem. Kobayashi-san proposed three to start (one for each of C, J, and K). I think that we need more than three, but I also think that the total number of blurdybloops is probably fewer than 25.

Do you think that you can help us with this definition? You have considerable talent and knowledge of East Asian language issues, and I think that you could help enormously if you wanted to.

Would you please help?
Received on Thursday, 13 June 1996 22:15:06 UTC