RE: BOF from Masataka Ohta on 1993-08-25 (ietf-charsets@w3.org from July to September 1993)

From: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
Date: Wed, 25 Aug 1993 13:03:02 +0900 (JST)
To: jerman-blazic@ijs.si (Borka Jerman-Blazic)
Cc: wg-char@rare.nl, ietf-charsets@INNOSOFT.COM
Message-id: <9308250403.AA12184@necom830.cc.titech.ac.jp>
> >> I went very quickly over Ohta san comments on the BOF Minutes and find out 
> >> that even he agree on the common issues to be worked out.
> 
> >Even me? Good joke. Anyway,
> 
> But you are arguing all the time about everything and I still don't have
> clear mind what you really want to happen.

I want to clarify the purpose.

It's much better to argue first about what we are working on before we
start working.

> I got the impression that Harald's proposal to follow one of the existing
> streams (ISO 2022 or ISO 10 646/UNICODE)
> for support of character sets required for different languages
> was accepted as the stream to be followed. Harald presented two streams:
> 16-bits and ISO 2022 stream. The last appeared to be too complex and more
> difficult. It was not precisely elaborated but I understood that the other
> one - i.e 16 bits was preferred by the attendees.

I'm afraid you don't understand what UTF is. YOU said:

>> - a  document defining how UCS  can  be  used  in  a  uniform  way  in
>> Internet  protocols,  especially  taking  in  consideration  the UTF-2
>> encoding  of  UCS.

UTF-2 is an ASCII-compatible, 8-bit-oriented encoding scheme for 32-bit UCS4.

It is absolutely not 16 bit.

OK?

What some people are doing with 16 bits is using a bare, ASCII-incompatible
16-bit encoding, where ASCII 'A' is, for example, represented by
two octets: NUL and 'A'.

OK?
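
The difference can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: Python has no UTF-2 codec, so its
"utf-8" codec (the later successor of UTF-2) stands in for UTF-2, and
"utf-16-be" stands in for a bare 16-bit encoding (the two agree for
characters such as 'A'). The function names are hypothetical.

```python
def encode_bare_16bit(text: str) -> bytes:
    """Bare 16-bit style: two octets per character, high octet first.

    Assumption: Python's "utf-16-be" codec is used as a stand-in.
    """
    return text.encode("utf-16-be")


def encode_ascii_compatible(text: str) -> bytes:
    """ASCII-compatible, 8-bit-oriented style.

    Assumption: Python's "utf-8" codec is used as a stand-in for UTF-2.
    """
    return text.encode("utf-8")


sample = "A"
print(encode_bare_16bit(sample))        # b'\x00A'  (NUL followed by 'A')
print(encode_ascii_compatible(sample))  # b'A'      (same octet as ASCII)
```

An ASCII-only protocol element passes through the second function
unchanged, which is why no extra labeling is needed in that case.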

In the BOF, I explicitly asked whether we need ASCII compatibility or not,
explaining that ASCII incompatibility means we must label protocols
(existing ones and upcoming ones) as to whether they are using ASCII or
something else.

After that, no one said 16 bit.

OK?

The attendees agreed to develop something based on 10646. My impression
is that the attendees preferred an ASCII-compatible approach.

> If we can not agree on neither of them then what do you propose.
> Supporting both is maybe not the best approach.

Supporting both is the worst approach.

> >Also, with charset scheme of MIME, various non-2022 codes are defined
> >in RFC1345 (they are not currently valid MIME names, but it could be if
> >some desires so). Several other are developped in various countries.
> 
> RFC 1345 in MIME context was heavily discussed on 822ext list and final
> standpoint was issued by IESG a few months ago, so I do not want 
> to rise again the RFC 1345 and MIME issue again. Please don't ask for that.

The point here is that the MIME approach is not the existing stream. Though
I don't think it is so good for general purposes, it ratifies existing
practices in Japan, Korea and other countries.

> And again RFC 1345 is ISO 10 646 based!!

Could you read it?

Its character mnemonics, except for Han characters, are 10646 based.

But it is a collection of MIME charsets (sorry, Keld, for having
misunderstood that they are not registered), including 2022 and EBCDIC.

> >I don't know precisely what X/Open is doing but they should be using
> >UTF2.
> 
> To my knowledge X/Open is using UTF2 and is working on its promotion.

The problem is that just saying UTF2 is not precise enough.

10646 is too vague.

> O.K. what is your proposal?

My proposal, on what we should work on, is to have a general purpose
text encoding scheme based on 10646 for international information exchange.
With the scheme, all the languages in the world can be encoded/decoded
even if dozens of languages are mixed freely within a single text. The
scheme should not have long term state or initial labeling, because that
makes the scheme just as complex as 2022.
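
A rough Python sketch of why long term state matters. It is an illustration
only, under these assumptions: Python's "iso2022_jp" codec stands in for a
2022-style stateful scheme, "utf-8" stands in for a stateless 10646-based
scheme such as UTF-2, and the helper names are hypothetical.

```python
import re

text = "日本"

stateful = text.encode("iso2022_jp")  # escape sequences switch character sets
stateless = text.encode("utf-8")      # no escape sequences, no shift state

# Simulate a receiver that missed the designating escape sequences,
# for example because it started reading in the middle of the stream.
payload = re.sub(rb"\x1b[$(][@-Z]", b"", stateful)
misread = payload.decode("ascii")  # decodes without error, but as other text


def resync(data: bytes) -> bytes:
    """Skip continuation octets (bit pattern 10xxxxxx) at the start of data."""
    start = 0
    while start < len(data) and (data[start] & 0xC0) == 0x80:
        start += 1
    return data[start:]


# Starting one octet into the stateless stream loses only the first
# character; the remainder is still decodable without any earlier context.
recovered = resync(stateless[1:]).decode("utf-8")

print(misread != text)  # True
print(recovered)        # 本
```

With the stateful form, the meaning of each octet depends on an escape
sequence that may be arbitrarily far back in the text; with the stateless
form, a decoder can recover from any starting point by local inspection.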

The scheme could be better if it had several other properties, but I
think that should be discussed later, after we agree on the general policy.

> >The problem (full bidirectionality support can not be done with finite
> >state and, thus, not plain text) is identified by me and then discussed,
> >I think. But, if you think there is other bidirectionality problems
> >identified, it's OK.
> 
> Have you seen the latest RFC draft on handling bi-directional 
> text in MIME submitted by Nussbacher?

Yes, but it has nothing to do with the current discussion.

> The services concerned were identified by Harald already (GOPHER, FTP 
> FILE NAMES, WHOIS, WWW, DNS). The idea was the ch.sets issues (I think that
> was mentioned by Simon Spero but I am not sure) to be handled "in general"
> as security issues are handled in the emerging RFCs.

If we create something so ambiguous that it can not be used without
protocol-wise profiling, we do, indeed, need such protocol-wise description.

> Yes, and we are working within C3 project on that problem and in the
> same time in the CEN TC 304 committee. Paper and sw will be submitted
> soon.

What is the C3 project? I'm interested in it.

> I agree with you that the problem is not technically difficult. It is
> more "political", but there are people thinking about it.

> No one is expecting from you to do it yourself but you have a proposal
> don't you??

So, if we, like ISO, succeed in creating a chimeric monster, I will
actively agree that protocol-wise specification is necessary.

						Masataka Ohta

Received on Tuesday, 24 August 1993 21:08:07 UTC