- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Thu, 9 Jan 1997 15:07:57 +0100 (MET)
- To: Bert Bos <bert@www10.w3.org>
- cc: www-international@www10.w3.org
On Wed, 8 Jan 1997, Bert Bos wrote:

> RFC 2070 (html-i18n) says that the LANG attribute is only for natural
> languages, not for computer languages, but recently I've started
> wondering why.

It's taken from RFC 1766. Loosening that restriction might open a can of worms. There is not even that much experience with the current language tags yet.

> It may happen in a text that there is a word or phrase that is not in
> any human language, such as the name of somebody, or some code.

Names are in some language. For high-quality rendering of Han names, it's good to know whether it's Chinese or Japanese or Korean or Vietnamese. For text-to-speech rendering, it's also important. Take your full first name as an example :-). If you don't tag that as Dutch, the pronunciation will be very far from the real one.

> HTML has some mark-up for the computer code: it can be put inside
> <CODE>, but there is no element for the name of a person.
>
> Maybe LANG should be extended to cover
>
> - computer languages (Pascal, C, HTML, CSS,...)
> - proper names (language "none"?)
> - "unknown" and "any" languages
>
> The last two would be useful, resp., for a text that is in some
> language, but the author doesn't know which, and for a text that is the
> same in every language. An example would be the SI units mm, s, etc.

The last one might be useful, but I am not really convinced. For units, there may be language-dependent rendering conventions. The example of Hebrew or Arabic is not very relevant, because in these cases, the script is decisive. The question would be whether "mm" in Hebrew letters can be thought of, without bad consequences, as Hebrew or Yiddish or so, or whether it is necessary to tag it as something neutral. Same for Arabic and the many languages it is used with. In that case, I guess at least for Urdu, it would be important to tag it as Urdu so that a "falling" font style is chosen.

Regards, Martin.
Received on Thursday, 9 January 1997 09:07:52 UTC