Re: Call for proposals from Mark Crispin on 1996-06-14 (ietf-charsets@w3.org from April to June 1996)

From: Mark Crispin <MRC@Panda.COM>
Date: Thu, 13 Jun 1996 21:36:53 -0700 (PDT)
To: Larry Masinter <masinter@parc.xerox.com>
Cc: mohta@necom830.hpcl.titech.ac.jp, glenn@spyglass.com, Lueko.Willms@t-online.de, borka@e5.ijs.si, ietf-charsets@INNOSOFT.COM, iab-charsets@bunyip.com
Message-id: <MailManager.834727013.24422.mrc@Tomobiki-Cho.CAC.Washington.EDU>

On Thu, 13 Jun 1996 20:51:45 PDT, Larry Masinter wrote:
> If people don't stop baiting each other, I'm gonna unsubscribe.
>
> Frankly, I don't have a lot of hope for "rough consensus" on the
> document at hand if we continue to bicker about stuff that isn't even
> in the document.

Gentlemen, please!

This message is mostly addressed to Ohta-san.

Ohta-san, you can rest assured that your position is heard at these meetings.
I make sure of that, personally, at every meeting which I attend.  Larry,
Glenn, et al can attest to that.  I share some of your viewpoints, although
not all of them.

I also see the other side of this issue.  Han unification is more than "fit
everything in 16 bits"; there are benefits gained from it as well as
capabilities lost.  For this reason, I think that the only way to move forward
is to strike a compromise.

A compromise requires amelioration of the problems so that we can all get
something that works.  This does mean sensitizing the Unicoders to some of the
real problems that are faced by us in the plaintext world (and doing so
repeatedly).  But we, in turn, must also be sensitized to their problems.

And remember:
	ISO 10646 is definitely not the best universal coded character set;
	it's the *only* universal coded character set.

Unicode has abundant warts.  Han unification is only one of them.  But
whenever you find warts, you will find people who will call those warts
"features".  Whether something is a "wart" or a "feature" is a matter of
opinion; and in 40 years I've learned that opinions are like anuses; everyone
has one.

It's pointless to discuss something that
	(1) is a matter of opinion
	(2) no consensus is possible
	(3) is completely unnecessary to come to consensus.
We can all keep our opinions, and still come up with a technical spec that
works.

In your paper at Tsukuba, you defined plaintext as "text with finite state
structure".  I'm not convinced that ISO 2022 based solutions, such as your
proposal (ISO-2022-JP2 or ISO-2022-INT), are any more "finite state" than an
ISO 10646 based solution with "language tags" (I'm putting this term in quotes
deliberately, see below).

I am, however, convinced that an ISO 10646 based solution must have a
reversible mapping with ISO-2022-JP2.  It is obvious that raw Unicode is not
sufficient for this purpose.  "Language tags" is a misnomer; we actually are
not shifting languages, we are shifting through various alternate glyphs for a
particular character.  Kobayashi-san from Justsystem suggests "source ID",
although this too may have nomenclature problems.
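
[Illustration, not part of the original message.]  The reversibility concern
above can be sketched with Python's built-in "iso2022_jp_2" codec.  The choice
of the sample character (U+76F4, present in both JIS X 0208 and GB 2312) and
the helper names seven_bit and wrap are assumptions made for this sketch.  It
only shows that two distinct ISO-2022-JP-2 byte sequences can decode to the
same ISO 10646 code point, so that at most one of them can survive a round
trip without some extra "source" information.

```python
# Sketch only: sample character and helper names are assumptions.
ESC_JIS = b"\x1b$B"    # designate JIS X 0208-1983 to G0
ESC_GB = b"\x1b$A"     # designate GB 2312-1980 to G0
ESC_ASCII = b"\x1b(B"  # return to ASCII


def seven_bit(ch, euc_codec):
    """Derive the 7-bit two-byte code by clearing the high bit of the EUC form."""
    return bytes(b & 0x7F for b in ch.encode(euc_codec))


def wrap(designator, code):
    """Build a complete ISO-2022-JP-2 byte string for one two-byte character."""
    return designator + code + ESC_ASCII


ch = "\u76f4"  # a unified Han character found in both source sets

as_jis = wrap(ESC_JIS, seven_bit(ch, "euc_jp"))
as_gb = wrap(ESC_GB, seven_bit(ch, "gb2312"))

# Two different byte sequences on the ISO 2022 side ...
from_jis = as_jis.decode("iso2022_jp_2")
from_gb = as_gb.decode("iso2022_jp_2")

# ... collapse to a single code point on the ISO 10646 side.
same_code_point = from_jis == from_gb == ch

# Re-encoding cannot know which source set the text came from.
back_jis = from_jis.encode("iso2022_jp_2")
back_gb = from_gb.encode("iso2022_jp_2")
both_round_trip = back_jis == as_jis and back_gb == as_gb

print("distinct ISO 2022 inputs:", as_jis != as_gb)
print("same code point after decoding:", same_code_point)
print("both inputs survive the round trip:", both_round_trip)
```

Whatever the extra information ends up being called, the sketch suggests it
has to record at least which source set a character came from.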

Let's not bother with what they are called, and for the time being let's call
them "blurdybloops" instead.

So, what is the definition of a "blurdybloop"?  This is not a trivial problem.
Kobayashi-san proposed three to start (one for each of C, J, and K).  I think
that we need more than three, but I also think that the total number of blurdybloops
is probably less than 25.

Do you think that you can help us with this definition?  You have considerable
talent and knowledge of East Asian language issues, and I think that you could
help enormously if you wanted to.

Would you please help?


Received on Thursday, 13 June 1996 22:15:06 UTC