Re: Proposals for 10646/Unicode in MIME from Masataka Ohta on 1993-12-21 (ietf-charsets@w3.org from October to December 1993)

From: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
Date: Tue, 21 Dec 1993 21:04:16 +0900 (JST)
To: jerman-blazic@ijs.si (Borka Jerman-Blazic)
Cc: ietf-charsets@INNOSOFT.COM, dcrocker@mordor.stanford.edu, David_Goldsmith@taligent.com
Message-id: <9312211204.AA02201@necom830.cc.titech.ac.jp>
> >> I note here that Masataka's proposal for ISO-2022-JP-2 demonstrates what
> >> we've been arguing all along: it is not enough to just have a character
> >> encoding.
> 
> Yes!

Completely wrong. We don't need "character" encoding at all.

> >Recently I have been avoiding the word "character" as much as possible
> >and using the phrase "text encoding", because the concept of "character"
> >beyond ASCII cannot be well defined. Various units of text encoding
> >are necessary for different purposes.
> 
> As long as you speak and write about ISO 2022 and UCS you have to speak
> about characters and character sets to avoid the mess!

ISO 2022 and UCS are the mess!

So, as long as we speak and write about ISO 2022 and UCS, we can't
avoid the mess. :-)

> What a character is
> is well defined in these documents, and that is why you have to keep the
> meaning of the terminology clear.

The definitions are like political speeches and scientifically useless.

> >It is and its successors will be as stateless as practically possible
> >with ISO 2022.
> 
> ISO 2022 has no successors and will not have any.

While ISO 10646 is the much messier successor of ISO 2022, the successor
of ISO-2022-JP-2 shall be ISO-2022-INT-1.

> What you have are just derivatives,
> which are not legal if ISO 2022 is considered or followed (i.e., the use of G0).

I have no interest in being legal with ISO's definition of "legal".

> >That is, at the beginning of a line, the state can be assumed to be unique.
> 
> Not always!

We are talking about ISO-2022-JP-2 and its successors, not ISO 2022.

   Applications such as pagers and editors which randomly seek within a
   text file encoded with "ISO-2022-JP-2" may assume that all the lines
   begin with ASCII, not with JIS X 0201-Roman.
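
The random-seek property quoted above can be sketched in Python. This is
only an illustration under stated assumptions: it relies on Python's
standard "iso2022_jp_2" codec, and decode_line_at is a hypothetical helper
name, not something defined by the ISO-2022-JP-2 draft. It assumes each
line starts in the ASCII state, so a line can be decoded without reading
any earlier bytes.

```python
# Sketch only: decode_line_at is a hypothetical helper for illustration.
ENCODING = "iso2022_jp_2"  # Python's standard codec name for ISO-2022-JP-2


def decode_line_at(data: bytes, offset: int) -> str:
    """Decode the single line that contains byte `offset`."""
    # 0x0A never occurs inside an escape sequence or a 7-bit multibyte
    # character, so scanning for raw newline bytes is safe.
    start = data.rfind(b"\n", 0, offset) + 1
    end = data.find(b"\n", offset)
    if end == -1:
        end = len(data)
    # Assumption: the line starts in the ASCII state, so no earlier
    # bytes are needed to decode it.
    return data[start:end].decode(ENCODING)


text = "abc 日本語\nテスト def\n한국어 xyz"
data = text.encode(ENCODING)
lines = text.split("\n")

# Whatever byte a pager seeks to, it recovers exactly one original line.
for offset in range(len(data)):
    assert decode_line_at(data, offset) in lines
```

A pager built this way needs only the nearest preceding newline, not a scan
from the start of the file, which is the point of the per-line reset.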

> Yes, of course!

> Not at all!

> Yes !

Your responses here are good examples of untechnical reactions, which
should be avoided if you are serious about technically meaningful
internationalization.

> >> letting local conventions set the default language.

> UNICODE was developed for an internationalised environment and is implemented
> in internationalised products!

As the product assumes local conventions, it failed to be
internationalized.

It occurs too often, especially in ISO, that the final/intermediate
products do not satisfy the initial intention.

> >Then, you can see nothing.
> 
> Could you please be more polite in your mailings?

Not being so polite is a good protection against being swallowed in ISOish
untechnical debates on nothing.

> >ISO-2022-JP-2 is produced from long and extensive

> >Next, it is 7 bit.
> 
> This is just a Japanese derivative which uses the ISO 2022 extension technique

We are talking about ISO-2022-JP-2, whose to-be-RFC I posted just recently.

Right?

OK?

Are you sure?

> UNICODE is NOT the future? What, then, is the future?

ISO 2022 is better, at least.

> >We do know that having two or more uninteroperable encodings such
> >as EUC and SJIS or ASCII and 16bit-UNICODE is a real pain.
> 
> Why?

Because you must then know file types, which was the nightmare of the
mainframe era.

> >> A specious argument at best, since the rest
> >> of the world does need special software to view ISO-2022-JP-2 anyway.

> Exactly!

I'm afraid that you have been working on some special software to display
some European characters.

> >On the other hand, both ISO 2022 and ISO 10646/UNICODE lack a unified
> >semantics to mix multilingual characters in the world. ISO 10646/UNICODE
> >inherits the policy of ISO 2022 to treat characters in different languages
> >differently. Thus, it is impossible to write a unified text processing
> >library or application of meaningfully rich functionality.
> 
> This is not true!

See the correspondence between ISO 2022 and ISO 10646/UNICODE on:

	Latin-1

	Thai

	Devanagari

	Hangul

Why, unlike Thai and Devanagari, are all the modern Hangul precomposed?

Why was a special concept, the "conjoining character", introduced at the
last stage of the standardization process only for (ancient) Hangul?
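
The asymmetry between these scripts can be observed with Python's
unicodedata module. This is a minimal sketch, and it reflects the
present-day Unicode tables bundled with Python, not necessarily the 1993
drafts under discussion. The sample strings are chosen only for
illustration.

```python
import unicodedata

hangul = "\ud55c"            # HANGUL SYLLABLE HAN, one precomposed code point
thai = "\u0e01\u0e34"        # THAI KO KAI + SARA I
devanagari = "\u0915\u093f"  # DEVANAGARI KA + VOWEL SIGN I

# The precomposed Hangul syllable decomposes into three conjoining jamo
# and composes back to the single code point.
jamo = unicodedata.normalize("NFD", hangul)
assert [hex(ord(c)) for c in jamo] == ["0x1112", "0x1161", "0x11ab"]
assert unicodedata.normalize("NFC", jamo) == hangul

# Thai and Devanagari have no precomposed form for these sequences:
# even after composition they stay as two code points.
for s in (thai, devanagari):
    assert len(unicodedata.normalize("NFC", s)) == 2
```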

> >Thus, for the time being, our solution must be 7 bit ISO 2022.
> 
> To whom does this "MUST BE 7 bit ISO 2022" apply?

To everybody in the world.

> ICODE was rejected by the IETF BOF on UCS in Amsterdam! You can read the
> minutes in the Proceedings and find out why.

Laugh.

Though I haven't read the proceedings at all, do you think any minutes
not approved by participants are meaningful? I thought ISOish
people were a little more clever in dealing with politics.

Instead, in the BOF, 16 bit UNICODE was mostly rejected.

UTF (not necessarily UTF2) encoding was chosen, instead.

Anyway, the BOF is not the place to make the final decision.

You should have spent your time on something other than forging fake
proceedings.

> >I have *ABSOLUTELY* *NO* interest in text/enriched from the beginning.
> 
> You are interested in just one thing: how to distinguish the C/J/K
> character sets when using UNICODE. Such a restricted interest cannot lead
> to an international solution. All attempts in that direction will result in
> a local solution for a restricted region, as 2022 JP is.

Apparently, you don't understand what and how ICODE/IUTF is.

Now, it is understandable that you said:

	ICODE was rejected by the IETF BOF on UCS in Amsterdam!

Unlike ISO 2022 and ISO 10646/UNICODE, ICODE/IUTF is NOT a mere
collection of mutually inconsistent local solutions. Understand
my paper, if you can understand anything at all.

> >You can't force us give up plain text.
> 
> No one can do it!

Good. Don't forget that MIME "charset", for example, is mainly for
plain text.

						Masataka Ohta

Received on Tuesday, 21 December 1993 04:10:18 UTC