Re: 異体字の使用例 (Examples of itaiji usage) from 木田泰夫 on 2024-06-02 (public-i18n-japanese@w3.org from April to June 2024)

From: 木田泰夫 <kida@mac.com>
Date: Sun, 2 Jun 2024 09:50:11 +0900
To: Nat McCully <nmccully@adobe.com>
Cc: Yamamoto Taro <tyamamot@adobe.com>, Kobayashi Toshi <binn@k.email.ne.jp>, JLReq TF 日本語 <public-i18n-japanese@w3.org>
Message-Id: <10851ACC-7826-4AD7-A3BF-A80FDAE3230B@mac.com>
Hello Nat,

I have created a discussion board to capture the issue of whether fonts and encoding adequately support Itaiji. As you mentioned, the issue seems somewhat obscure, and I must admit that I do not fully understand it. I would appreciate it if you could help clarify the issue and make it more obvious to us.

https://github.com/w3c/jlreq-d/discussions/58


- kida

> On 2024/06/02 at 9:33, 木田泰夫 <kida@mac.com> wrote:
> 
> Hello Nat,
> 
> The scope of jlreq-d is text layout and enablement. The actual content of the text or its artistic expression is out of this scope. When I or others mention that we would not cover certain topics in jlreq-d, it does not imply that those topics are unimportant. In fact, as I mentioned earlier, I agree that the capability of supporting various Itaiji is an important aspect of the power and value of digital text.
> 
> I believe most Itaiji issues fall within the area of content, except for the issue that Bin-sensei mentioned, which is likely on the borderline. If we were to cover it, we would need to explain Itaiji, albeit very briefly.
> 
> You mentioned that fonts are currently a mess and that Unicode is inadequate. Could you please elaborate on these points? I would like to understand.
> 
> - kida
> 
>> On 2024/06/02 at 1:07, Nat McCully <nmccully@adobe.com> wrote:
>> 
>> I actually disagree here with limiting our documentation to what is possible only with plain text encoding. The reality is that today, to support Japanese text composition fully and to respect the author's specific character/glyph choices faithfully, we must include the tangled and confused world of fonts. Fonts are, today anyway, a mess. Unicode is inadequate. These considerations are largely unknown to most, and thus bad decisions continue to be made by standards bodies and by implementers because of such obscurity.
>> Although to some a character is a character is a character (my interpretation of Kida-san’s stance), I believe that we as implementers must empower authors and users to set the exact character design freely if they wish. When that design is not encoded, or when the font they wish to use is only partially implemented or is built so that they can get what they want only by using a glyph ID (GID), we should still support them. I think old character forms are material to the content and important to preserve, even when rendering digitally. In fact, allowing everyone to choose the original form of old or the new form of standardized writing is part of the power and value of digital content.
>> 
>> —Nat
>> From: Taro Yamamoto <tyamamot@adobe.com>
>> Sent: Saturday, June 1, 2024 8:13:37 AM
>> To: 木田泰夫 <kida@mac.com>; Kobayashi Toshi <binn@k.email.ne.jp>
>> Cc: JLReq TF 日本語 <public-i18n-japanese@w3.org>
>> Subject: Re: 異体字の使用例
>>  
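>> On itaiji (異体字, variant character forms):
>> Let me add a brief comment.
>> 
>> I think it would be better to limit what we write to the creation of digital text encoded in Unicode.
>> The reason is that, whatever categories of itaiji there may be, an itaiji that has been encoded is a character. An itaiji that has not been given its own code point cannot exist as a character, and the only way to specify its glyph from character codes is to use an IVS. Conversely, where several itaiji have each been given their own code point, specifying them with base characters alone specifies each one as an independent character. That being so, there is no need to subdivide and explain itaiji themselves in fine detail. Would it not be better to limit the description to something like "itaiji that cannot be distinguished by character code can in some cases be accessed using an IVS" and "differences in glyph shape that cannot be distinguished even by an IVS are normally regarded as design differences, and one needs to select a font that has the design of that itaiji"?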
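As a rough illustration of the IVS mechanism mentioned above, here is a minimal Python sketch. It assumes U+845B (葛) as the base character and VARIATION SELECTOR-17 and -18 as the selectors; the helper names `code_points` and `strip_variation_selectors` are hypothetical. It only shows how the sequences are built in plain text and does not check whether a sequence is registered in the Ideographic Variation Database or whether a given font maps it to a distinct glyph.

```python
# Illustrative only: a base ideograph followed by an ideographic variation
# selector forms an IVS. Which glyph (if any) a font shows for the sequence
# is not something this script can check.
BASE = "\u845B"       # 葛
VS17 = "\U000E0100"   # VARIATION SELECTOR-17
VS18 = "\U000E0101"   # VARIATION SELECTOR-18


def code_points(text):
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def strip_variation_selectors(text):
    """Remove variation selectors, leaving only the base characters."""
    return "".join(
        ch for ch in text
        if not (0xFE00 <= ord(ch) <= 0xFE0F or 0xE0100 <= ord(ch) <= 0xE01EF)
    )


seq_a = BASE + VS17
seq_b = BASE + VS18

print(code_points(seq_a))  # ['U+845B', 'U+E0100']
print(code_points(seq_b))  # ['U+845B', 'U+E0101']
print(seq_a == seq_b)      # False: the two sequences differ as plain text
# Without the selectors, both collapse to the same base character.
print(strip_variation_selectors(seq_a) == strip_variation_selectors(seq_b))  # True
```

The last line is the practical risk: any processing step that drops variation selectors silently loses the author's choice of form.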
>> 
>> Just my personal view.
>> 
>> 山本太郎
>> 
>> From: 木田泰夫 <kida@mac.com>
>> Sent: June 1, 2024 21:57
>> To: Kobayashi Toshi <binn@k.email.ne.jp>
>> CC: JLReq TF 日本語 <public-i18n-japanese@w3.org>
>> Subject: Re: 異体字の使用例
>>  
>> 
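>> Bin-sensei,
>> 
>> I see. As you say, if we decide, for example, to align the orthography of jlreq-d with JIS X 0213:2004 or with the 表外漢字字体表 (the table of character forms for kanji outside the Joyo list), there are kanji that need care because the two forms have different code points, such as 繫 vs 繋.
>> 
>> Most of the characters whose glyph shape changed in JIS2004 keep the same code point, and those change with the platform or with the reader's font settings, so whatever we do on the document side can only be a half measure. That said, almost every platform is probably on JIS2004 by now, so perhaps aligning with JIS2004 is the way to go. This can be changed in one pass in post-processing, so I think we can decide what to do about jlreq-d in due course.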
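The two cases distinguished above (forms with different code points versus forms sharing one code point) can be checked mechanically. The following is a minimal Python sketch; the pairs are taken from the examples quoted further down in this thread, the code points are written as escapes with the characters in comments, and the function name `classify` is hypothetical.

```python
# Pairs of forms taken from the examples quoted in this thread.
PAIRS = [
    ("\u7E6B", "\u7E4B"),  # 繫 / 繋
    ("\u66FE", "\u66FD"),  # 曾 / 曽
    ("\u845B", "\u845B"),  # 葛 / 葛 (both forms share one code point)
]


def classify(first, second):
    """Say whether plain text alone can tell the two forms apart."""
    if first == second:
        # The displayed form depends on the font (or on an IVS, if one exists).
        return "same code point"
    # The choice of form is recorded in the text itself.
    return "different code points"


for first, second in PAIRS:
    print(f"U+{ord(first):04X} U+{ord(second):04X}: {classify(first, second)}")
```

For pairs reported as "different code points", a document can be normalized to one form by simple replacement in post-processing; for pairs reported as "same code point", the text alone cannot fix the displayed form.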
>> 
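>> Separately, we need to decide whether to explain this issue in jlreq-d. Perhaps we should include it. I am rather easygoing about this, so I tend to think there is no need to insist on that kind of consistency, and that even if character forms are mixed within a single document it is not worth worrying about much. Still, I suppose we want readers at least to be aware of it. And if we are going to explain it, we will need to explain what itaiji are.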
>> 
>> 木田
>> 
>> > On 2024/06/01 at 19:45, Kobayashi Toshi <binn@k.email.ne.jp> wrote:
>> >
>> > Dear Kida-san,
>> > Dear all,
>> >
>> > This is Kobayashi Toshi.
>> >
>> > Kobayashi Toshi wrote:
>> >
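>> >> Also, as a matter of the Joyo kanji and of how the character forms (字体) of non-Joyo kanji (表外漢字) are handled, I do not think this is a problem limited to print.
>> >>
>> >> For pairs like the following, where one must choose between the first form and the second, I see both on the Web as well, so I think there are quite a few cases on the Web where a character has more than one form.
>> >> Joyo kanji examples: 曾・曽 瘦・痩 麵・麺 頰・頬 塡・填 葛・葛
>> >> Non-Joyo kanji examples: 摑・掴 噓・嘘 繫・繋 藪・薮 鶯・鴬 壺・壷 攪・撹 賤・賎 諫・諌 頸・頚 嚙・噛 瀆・涜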
>> >
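>> > To put this another way:
>> >
>> > If one writes in accordance with the “常用漢字表” (Joyo Kanji Table) with respect to character set (字種), character form (字体), and readings (音訓), then up to 2010, leaving proper nouns aside, one could say that there was “no problem” of character forms, that is, of having to choose a form. (The issue of design differences remains, but I do not think it ever actually became a problem in practice.)
>> >
>> > Since the 2010 revision of the “常用漢字表”, however, even when the characters used are within the scope of the table, the question of which form to choose arises, whether in print or on the Web, whenever one uses the kanji given as examples above.
>> >
>> > All the more so with non-Joyo kanji: even before 2010, using any of the affected kanji meant having to choose a character form. In books, because of past history, people were aware of the problem, and so it was treated as a particular issue (naturally, some publishers did not treat it as one). In the world of digital text, I think it is simply that few people raised it as an issue.
>> >
>> > For example, an earlier version of “2. 日本語デジタルテキストの作り方” (How to create Japanese digital text), written by Kida-san, used the spelling “繋げて”. Before final publication, I might well have said that “つなげて” or “繫げて” would probably be better.
>> >
>> > In short, when using kanji included in the “常用漢字表”, or non-Joyo kanji, some of the kanji used will require a choice of character form.
>> >
>> > If I were the one writing, I think I would use the first form of each pair in the examples above, but I do not reject the use of the second form. There are various circumstances, and each case is what it is (the “常用漢字表” says as much, though in different words).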
>> >
>
Received on Sunday, 2 June 2024 00:50:32 UTC