BOF minutes from Borka Jerman-Blazic on 1993-08-10 (ietf-charsets@w3.org from July to September 1993)

From: Borka Jerman-Blazic <jerman-blazic@ijs.si>
Date: Tue, 10 Aug 1993 22:13:46 +0200
To: ietf-charsets <ietf-charsets@INNOSOFT.COM>
Message-id: <16*/S=jerman-blazic/O=ijs/PRMD=ac/ADMD=mail/C=si/@MHS>

The BOF minutes were mailed to Erik and to the RARE WG-CHAR list. I have
been away from my office for more than three weeks and read some of my
mail only today. I noticed some arguing about the BOF summary, which is
why I am resending the minutes to this list.

Regards,

Borka

==================

From <S=jerman-blazic;O=ijs;P=ac;A=mail;C=si>  Wed Oct 14 14:50:58 1998
Delivery-date: Saturday, July 17, 1993 at 13:11 GMT+0200
From:Borka Jerman-Blazic <S=jerman-blazic;O=ijs;P=ac;A=mail;C=si>
To: <S=huizer;G=erik;O=surfnet;P=surf;A=400net;C=nl>
Message-ID:inbox:22
Subject:BOF Minutes

Please find enclosed the BOF Minutes,

Cheers,

Borka




                                                             17-Jul-93




Minutes of the UCS BOF



The BOF took place at the RAI, Amsterdam, at the 27th IETF, on July 14,
from 16:00 to 18:00.


The  BOF was chaired by Borka Jerman-Blazic.  The list of attendees will be
included.

A brief introductory tutorial was given by Borka Jerman-Blazic. She
pointed out some of the problems that appear on the network due to the
lack of support for the national character sets used for input, output,
processing and display of text written in languages used all over the
world. She stressed the need to preserve character integrity across
the network. The requirement to process and interchange different
character sets correctly is especially relevant for Internet services
that deal with names of persons or organizations.

Peter Svanberg gave a short overview of the level of support for
non-ASCII character sets in different Internet protocols. Some of the
protocols were identified as hostile to 8-bit characters, among them
DNS, SMTP, FTP, NNTP, WAIS, MIME Text/Enhanced, NFS, AFS, Whois, URN,
Gopher etc. More recently developed protocols, such as MIME part 1
and part 2, and some ongoing projects, such as Whois++ (as mentioned
by Simon Spero), support 16-bit coding and the repertoires provided by
such coding. He also mentioned that several IETF groups developing new
protocols and services recognize the importance of proper character
set support.

The next speaker was Mr. Masataka Ohta. He presented his view on the
idea of recommending the International Universal Coding system for use
over the Internet. He identified 5 properties which the recommended
coding system is required to have. These are:

Identity of encoding and decoding, which he understands as a unique
mapping between a particular graphic character and its code (bit
combination),

Causality, understood as the independence of a processed coded
character from the other incoming characters in the data stream,

Finite State Recognition, the state dependence of the code required
for presentation/display of multi-octet coded data,

Finite resynchronizability, which means that the state of the
automaton can be determined uniquely by reading a fixed finite number
of octets,

Equality, the requirement that a character coded with different coding
systems can always be recognized as the same character.
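
As an illustration of finite resynchronizability only, and not of
Mr. Ohta's own analysis, the following Python sketch assumes the octet
layout of today's UTF-8 (assumed here to correspond to the encoding
these minutes call UTF-2), in which continuation octets always match
the bit pattern 10xxxxxx. The function name resync is hypothetical.

```python
def resync(data: bytes, pos: int) -> int:
    """Return the index of the first character boundary at or after pos.

    Assumption: data is UTF-8, so continuation octets look like 10xxxxxx.
    A reader dropped at an arbitrary offset only has to skip a small,
    bounded number of such octets to find the start of a character.
    """
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


data = "Blažič".encode("utf-8")

# Offset 4 is the second octet of the two-octet character 'ž'.
boundary = resync(data, 4)
tail = data[boundary:].decode("utf-8")
```

Starting from the middle of a character, the reader skips one octet
and decodes the rest of the stream without looking backwards.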

Mr. Ohta looked for the required properties in ISO 10646 and found
that Causality and Finite resynchronizability are not satisfied.
Equality is not yet worked out. He proposed an extension to the
existing UCS code system consisting of 5 additional bits, which would
enable the deficiencies of the UCS coding system to be overcome. The
discussion showed that the proposed solution is not in the mainstream
of the development of standard character set codes and their
applications in computing systems. One possible solution to the
problems identified by Mr. Ohta could be to use the whole UCS model,
i.e. the 4 envisaged octets which define, besides the cell and row
position of a character in the Basic Multilingual Plane of ISO 10646,
additional planes and groups. There was a proposal that the required
5 additional bits be coded as a private plane in the UCS scheme. John
Klensin noted that such an approach could clash with the reassignment
of such a plane in the further standardization process of ISO
JTC1/SC2. In the discussion the problem of handling bidirectional
text was also identified.
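
The four-octet model mentioned above can be sketched as follows. This
is a minimal illustration, assuming the canonical four-octet form of
ISO 10646 with the octets ordered group, plane, row, cell; the function
name ucs4_octets is hypothetical.

```python
def ucs4_octets(code: int) -> dict:
    """Split a four-octet UCS code value into its four octets.

    Assumption: the value fits in 31 bits, with the most significant
    octet naming the group and the least significant octet the cell.
    """
    if not 0 <= code <= 0x7FFFFFFF:
        raise ValueError("code value outside the 31-bit UCS range")
    return {
        "group": (code >> 24) & 0xFF,
        "plane": (code >> 16) & 0xFF,
        "row": (code >> 8) & 0xFF,
        "cell": code & 0xFF,
    }


# A character in group 0, plane 0 is addressed by row and cell alone.
bmp_char = ucs4_octets(0x017E)

# The same row and cell in another plane is a different code position.
other_plane = ucs4_octets(0x0001017E)
```

A two-octet coding carries only the row and cell octets, which is why
additional planes and groups need the full four-octet form.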

Harald Alvestrand pointed out that what is happening now is a sort of
transition period between 8-bit coding and the 16-bit coding provided
by UCS. Another parallel stream for the support of different national
character sets is "character switching", which is enabled by the code
extension technique of ISO 2022. It was obvious that this scheme is
not of practical use for the Internet except in special cases such as
the Japanese e-mail solution.
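
As a rough illustration of character switching, the sketch below uses
Python's standard iso2022_jp codec. It assumes that the Japanese e-mail
solution referred to is the ISO-2022-JP encoding; the constant names
and the helper switch_points are hypothetical.

```python
ESC_JISX0208 = b"\x1b$B"  # escape sequence: switch to JIS X 0208
ESC_ASCII = b"\x1b(B"     # escape sequence: switch back to ASCII


def switch_points(data: bytes) -> list:
    """Return (offset, escape sequence) pairs found in the data."""
    found = []
    i = 0
    while i < len(data):
        for esc in (ESC_JISX0208, ESC_ASCII):
            if data.startswith(esc, i):
                found.append((i, esc))
                i += len(esc) - 1  # skip the rest of the sequence
                break
        i += 1
    return found


text = "abc 日本語 xyz"
encoded = text.encode("iso2022_jp")

# All octets stay in the 7-bit range, so 7-bit transports can carry them.
seven_bit = all(octet < 0x80 for octet in encoded)

# What an octet means depends on the most recent escape sequence.
switches = [esc for _, esc in switch_points(encoded)]
```

The receiver has to track the current state to interpret each octet,
which is the state dependence the minutes describe.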

The BOF then discussed the possible work items if the IESG approves
the formation of a working group. The chair identified several papers
which are Internet drafts dealing with character set problems, such
as: RFC 1345, "X400 use of the international character sets",
"Character Sets and Languages". Other items were discussed and
proposed by the BOF attendees. They are summarized below. John
Klensin pointed out that special precautions have to be taken when
recommending UTF 2 as a data interchange method over the Internet, in
connection with the possible assignment of additional coding planes
by JTC1 SC2. He also recommended the use of mailing lists already
working within the IETF. They are: <ietf-charsets@innosoft.com> and
two others working on mailing issues (822ext and 821).

In summary, the BOF decided to propose that the IESG consider the
possibility of setting up a working group to work on the following
work items:

- a document defining how UCS can be used in a uniform way in
Internet protocols, especially taking into consideration the UTF-2
encoding of UCS. The document will provide guidance to other
protocols which have to deal with these items over the Internet,

- a document identifying the languages and the characters required for
coding text written in a particular natural language (a sort of
guidelines for services dealing with multilinguality, such as NIR
services based on the use of plain text),

- a document defining a tool for coded character set conversion, to be
provided within services such as e-mail user agents, including
fall-back representation of incoming characters that are outside the
supported character repertoire of the receiver,

- a proposal for extending the mandatory issues which have to be
covered in the RFC standardization process to include character set
consideration/support.
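
The fall-back representation named in the third item could look
roughly like the following Python sketch. It is one possible policy
and not something the BOF specified: the function names, the use of
Unicode decomposition to find a base letter, and the <U+XXXX> notation
for characters with no substitute are all assumptions.

```python
import unicodedata


def _fits(text: str, charset: str) -> bool:
    """Report whether text can be encoded in the receiver's charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def to_repertoire(text: str, charset: str = "ascii") -> str:
    """Rewrite text so that it fits the receiver's repertoire."""
    out = []
    for ch in text:
        if _fits(ch, charset):
            out.append(ch)
            continue
        # First fall-back: decompose and drop the combining marks.
        base = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        if base and _fits(base, charset):
            out.append(base)
        else:
            # Last resort: a visible notation that keeps the code value.
            out.append(f"<U+{ord(ch):04X}>")
    return "".join(out)


ascii_name = to_repertoire("Jerman-Blažič")
latin1_text = to_repertoire("café Blažič", "latin-1")
no_substitute = to_repertoire("日")
```

Keeping the code value in the last-resort notation lets the reader
recover the original character even when no substitute exists.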




Received on Tuesday, 10 August 1993 13:14:25 UTC