RE: Thoughts about characters transmission

> Masataka Ohta writes:
> 
> > For JIS, for example, Hiragana, Katakana and some frequently used
> > punctuation, at least, and optionally some frequently used Japanese
> > Han characters (about 1000, at most), should be encoded with two octets.
> 
> Is there an easy criterion for picking out those 1000 or so characters
> (preferably based on their code point), or do you have to use usage
> statistics?

The Ministry of Education has compiled a list of the Han characters to be
taught in each grade of elementary school in Japan.

	grade	# of characters		cumulative percentage of use
	1	80			21
	2	160			43
	3	200			61
	4	200			73
	5	185			84
	6	181			89

The cumulative percentage is my own measurement on newspaper
articles.
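
To make the arithmetic explicit, here is a small Python sketch that
accumulates the per-grade counts from the table above. The figures are
copied from the table; the names GRADES, cumulative_counts and
smallest_grade_for are made up for illustration only.

```python
# Per-grade figures copied from the table above:
# (grade, characters introduced in that grade, cumulative percentage of use)
GRADES = [
    (1, 80, 21),
    (2, 160, 43),
    (3, 200, 61),
    (4, 200, 73),
    (5, 185, 84),
    (6, 181, 89),
]


def cumulative_counts(grades):
    """Return (grade, cumulative characters, cumulative percentage) rows."""
    rows = []
    total = 0
    for grade, count, percent in grades:
        total += count  # running total of characters taught so far
        rows.append((grade, total, percent))
    return rows


def smallest_grade_for(coverage, grades=GRADES):
    """Return (grade, cumulative characters) for the first grade whose
    cumulative percentage reaches `coverage`, or None if none does."""
    for grade, total, percent in cumulative_counts(grades):
        if percent >= coverage:
            return grade, total
    return None


if __name__ == "__main__":
    for grade, total, percent in cumulative_counts(GRADES):
        print(f"through grade {grade}: {total:4d} characters, {percent}% of use")
```

The six grades add up to 1006 characters, which is where the "about
1000" figure comes from, and by my measurement they cover 89% of use.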

I think other Han-using countries should also have such lists.

						Masataka Ohta


Received on Wednesday, 28 July 1993 20:47:17 UTC