- From: Kohji SHIBANO <shibano@tiu.ac.jp>
- Date: Mon, 01 Sep 1997 02:32:50 +0900
- To: "Martin J. Dürst" <mduerst@ifi.unizh.ch>
- Cc: ietf-charsets@INNOSOFT.COM, Harald.T.Alvestrand@uninett.no, Kohji SHIBANO <shibano@tiu.ac.jp>, Masataka Ohta <mohta@necom830.hpcl.titech.ac.jp>, jcs@tiu.ac.jp
Martin,

As chairman of ISO SC2 and of the JIS X 0208 committee, I would like to give you some information.

At 7:46 PM 97.8.30, Martin J. Dürst wrote:
> Hello everybody,
>
> In the charset policy BOF at the recent IETF meeting in Munich,
> chaired by Harald Alvestrand, he showed a slide with variants
> of Han characters (Kanji) that are unified in Unicode/ISO 10646,
> but which may be problematic. He also showed this list in his
> plenary talk presenting the planned IETF charset policy.
> This list has been published on page 885 (explanatory page 7),
> bottom, of JIS X 0221-1995, the Japanese translation of ISO
> 10646 (explanatory material not contained in the original),
> and probably elsewhere.
>

The JIS description of the unification rules found in JIS X 0221 was based on a CJK-JRG standing document. CJK-JRG was the committee that did the Han unification. The ISO version of the rules, which is an updated version of the JIS description, is available as Amendment 8 of ISO/IEC 10646-1.

> In the BOF, I commented on this. I said that these were indeed
> mostly character components that turned up in many characters,
> and that a high percentage of them was explicitly unified by
> the new version of the base Japanese Kanji standard,
> JIS X 0208:1997. I mentioned a figure of something like 90
> or 95%, which turns out to be too high if one counts cases,
> but probably correct if one counts the characters affected
> (see below).
>

Since we do not have sufficient information to identify each Kanji from China, Taiwan, and Korea, it is very difficult to compare the 10646 unification rules with the JIS X 0208:1997 unification rules and to evaluate the compatibility between the two. At the time CJK-JRG did the unification, Japan also could not provide sufficient identification information on each Kanji. I do not know about the availability of some of the GB standards. For example, Dr. Yasuoka of Kyoto University unveiled a mystery behind the GB standards.

As far as I understand, the CJK-JRG work used only 24-dot fonts, which are not sufficient for real unification consideration. Real consideration of unification rules requires identification information and very high quality Kanji shape information. It is obvious that a complex Kanji shape cannot be represented in 24 dots. During the course of the JIS X 0208 revision, we sometimes used 300-dot scanned images. For example, the list of variant implementation shapes found in JIS X 0208:1997, starting from 401 to 490, is based on 300-dot scanned Kanji shapes.

> To this, Masataka Ohta strongly protested, saying something
> to the effect that he had been on the committee developing
> that standard. I have now had time to look at JIS X 208:1997
> again. On page 399 (explanatory page 25), it lists the members
> of the two committees involved. On the following page, it gives
> additional acknowledgements. Whatever that may mean, I have
> not been able to find the name Masataka Ohta on these pages.
> [my name turns up at the end of the text on page 400, as one
> of the contributors to the public review done by the committee,
> in the form Duerst, Martin J.]
>
> In the case that I have missed Masataka Ohta's name somewhere
> in JIS X 208:1997, I would like him to give us the exact page,
> and if necessary line number, to verify. In the case he has
> indeed participated, but has for some reason been forgotten,
> I ask the chair of both committees listed on page 399, Prof.
> Shibano, to tell us how Masataka Ohta has been involved.
>

Masataka Ohta really is a member of the JIS X 0208 committee and is recorded as a member of ***WG2***, found in the middle of page 399 of JIS X 0208:1997, 6 lines below my name.

However, he is not officially representing the JIS committee, and most of his opinions and interpretations contradict committee positions.

>
>
> Now for the list that Harald has shown. This list has 8 lines,
> with four groups that each contain 2 or three variants.
> For these, I give the item number of Section 6.6.3.2 of JIS
> X 208:1997 (p. 12,...) which gives examples of unification,
> and comments if necessary.
>

The list is not a set of examples but normative rules of unification. ISO/IEC 10646-1 AMD 8 only lists examples; it does not cover the complete list.

> Note that JIS 208 also contains and lists exceptions, but
> that these are carried over to Unicode/ISO 10646 as being
> separated by the source separation rule.
>
>
> Line 1
> case 1 (3 variants)    128 (2 variants, third is
>                             handwriting and not
>                             covered by JIS 208)

The third variant found in JIS X 0221 does not belong to the same font family, so we omitted it. The JIS Kanji Dictionary, which will be published in November, includes the shape.

> case 2 (3 variants)    161 (2 variants, third is
>                             the single-character
>                             shape which is not listed
>                             in JIS 208 section 6.6.3.2)

Basically, this is an error in the first edition of JIS X 0208. This rule exists mainly for compatibility purposes.

> case 3 (3 variants)    153 (JIS 208 lists one more variant)

This rule comes from a well-known Kanji shape design error in the Kangxi dictionary.

> case 4 (3 variants)    155 (2 variants, middle is
>                             the single-character
>                             shape which is not listed
>                             in JIS 208 section 6.6.3.2)

The separation of 61-27 from 16-91 is an error in the first edition of JIS X 0208.

. . .

>
> With all the comments, it's difficult to exactly say what percentage
> this would amount to. But counting each case as one item, it's around
> 66%. If one counts characters affected, and not cases as such, however,
> the percentage is much higher, because the cases with the most characters
> (line 1: case 1, 2, 4; line 8: case 4) all are included in JIS 208.
>

As far as I understand, CJK-JRG did a good job, even without sufficient information on each Kanji and its shape. Even though they based their work on the explanatory pages of JIS X 0208:1990, ISO/IEC 10646-1 has a better specification of unification than JIS X 0208:1990.

Regards.

Kohji Shibano

+--Kohji SHIBANO, Professor of Systems Programming---+
| Tokyo International University, shibano@tiu.ac.jp  |
| Office Tel:+81-492-32-1111, -1119 (fax),           |
+-kshibano@mix.or.jp, Home Tel & Fax:+81-44-954-7337-+

Received on Monday, 1 September 1997 13:02:59 UTC