- From: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
- Date: Thu, 29 Jul 1993 12:43:14 +0900 (JST)
- To: lwj@cs.kun.nl (Luc Rooijakkers)
- Cc: ietf-charsets@INNOSOFT.COM
> Masataka Ohta writes:
>
> > For JIS, for example, Hiragana, Katakana and some frequently used
> > punctuations, at least, and some frequently used Japanese Hans (about
> > 1000, at most), optionally, should be encoded with two octets.
>
> Is there an easy criterion to distinguish about 1000 characters
> (preferably based on their code point), or do you have to use usage
> statistics?

The Ministry of Education compiles a list of the Han characters to be
taught in each grade of elementary school in Japan.

	grade	# of characters	cumulative percentage of use
	1	80		21
	2	160		43
	3	200		61
	4	200		73
	5	185		84
	6	181		89

The cumulative percentage is my own measurement on newspaper articles.

I think other Han-using countries should also have such lists.

						Masataka Ohta
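As a rough sketch of the arithmetic behind the table above, the following Python tallies the running character total per grade, using only the per-grade counts and coverage figures quoted in the message. The names `GRADES`, `cumulative_counts` and `smallest_grade_for` are illustrative and do not come from the original post.

```python
# Figures copied from the table in the message:
# (grade, number of characters, cumulative percentage of use)
GRADES = [
    (1, 80, 21),
    (2, 160, 43),
    (3, 200, 61),
    (4, 200, 73),
    (5, 185, 84),
    (6, 181, 89),
]


def cumulative_counts(grades=GRADES):
    """Return (grade, running character total, cumulative coverage) rows."""
    total = 0
    rows = []
    for grade, count, coverage in grades:
        total += count
        rows.append((grade, total, coverage))
    return rows


def smallest_grade_for(target_coverage, grades=GRADES):
    """Return (grade, running total) for the first grade whose cumulative
    coverage reaches target_coverage, or None if no grade reaches it."""
    for grade, total, coverage in cumulative_counts(grades):
        if coverage >= target_coverage:
            return grade, total
    return None


if __name__ == "__main__":
    for grade, total, coverage in cumulative_counts():
        print(f"through grade {grade}: {total} characters, {coverage}% of use")
```

Under these figures the six grades together come to 1006 characters, which is in line with the "about 1000" mentioned in the quoted text.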
Received on Wednesday, 28 July 1993 20:47:17 UTC