- From: Misha Wolf <misha.wolf@reuters.com>
- Date: Tue, 25 Mar 1997 16:12:18 +0000 (GMT)
- To: meta2 <meta2@mrrl.lut.ac.uk>, www-international <www-international@w3.org>, Unicode Discussion <unicode@unicode.org>
Renato Iannella wrote (privately):

>Hi Misha, at DC4 you mentioned a standard
>that listed the different types of languages
>such as:
>   en-uk
>   en-us
>   ...
>
>Which one was it?

My reply may be of interest to others. The relevant standards are:

   ISO 639  - Codes for the representation of names of languages
   ISO 3166 - Codes for the representation of names of countries
   RFC 1766 - Tags for the Identification of Languages
   RFC 2070 - Internationalization of the Hypertext Markup Language

Some brief notes:

1. ISO 639 allows the use of ISO 3166 codes to qualify ISO 639 codes.

2. RFC 1766 defines a more general structure, of which ISO 639 and
   ISO 3166 are parts.

3. RFC 1766 defines the linking character to be a "-", as in "en-us".

4. RFC 1766 defines a registry for additional sub-tags. Two have been
   registered to date: "no-nyn" and "no-bok".

5. RFC 2070 redefines the interpretation of RFC 1766 tags, making the
   hierarchy meaningful (rather than just a mechanism for tag
   construction):

   --- start of quote from RFC 2070 -----------------------------------
   In the context of HTML, a language tag is not to be interpreted as
   a single token, as per RFC 1766, but as a hierarchy. For example, a
   user agent that adjusts rendering according to language should
   consider that it has a match when a language tag in a style sheet
   entry matches the initial portion of the language tag of an
   element. An exact match should be preferred. This interpretation
   allows an element marked up as, for instance, "en-US" to trigger
   styles corresponding to, in order of preference, US-English
   ("en-US") or 'plain' or 'international' English ("en").
   --- end of quote from RFC 2070 -------------------------------------

6. ISO 639 was last published in 1988. A few changes were made in 1989:

   --- start of quote from RFC 1766 -----------------------------------
   The following codes have been added in 1989 (nothing later): ug
   (Uigur), iu (Inuktitut, also called Eskimo), za (Zhuang), he
   (Hebrew, replacing iw), yi (Yiddish, replacing ji), and id
   (Indonesian, replacing in).
   --- end of quote from RFC 1766 -------------------------------------

7. As some people seem very keen on the US LoC-style 3-char language
   tags, someone could decide to issue an RFC updating RFC 1766 to
   include support for an updated ISO 639, incorporating these 3-char
   tags. [I believe that voting on such a change to ISO 639 is
   currently in progress.] This idea was aired at Canberra. My
   mentioning it here should not be taken as an expression of support,
   but rather as noting a possible way to reconcile these two schemes.

8. Do read the two RFCs mentioned above.

>Cheers... Renato
>. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>Dr Renato Iannella           http://www.dstc.edu.au/RDU/staff/ri/
>DSTC Pty Ltd                 phone://61/7-3365-4310
>Gehrmann Labs, QLD, 4067, AUSTRALIA   fax://61/7-3365-4311
>. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>Australian WWW Technical Conference '97 -> http://www.dstc.edu.au/aw3tc

Misha
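
P.S. For anyone who wants to experiment with the hierarchical
interpretation quoted in note 5, here is a minimal Python sketch. It is
not taken from either RFC. The function names (matches, best_match) are
hypothetical, and comparing whole sub-tags rather than raw characters
is my own assumption about what "initial portion" should mean. Tags
are compared case-insensitively, as RFC 1766 specifies.

```python
from typing import Iterable, Optional


def matches(sheet_tag: str, element_tag: str) -> bool:
    """Return True if sheet_tag equals the initial portion of element_tag.

    Assumption: the comparison is made sub-tag by sub-tag (split on "-"),
    so "en" matches "en-US" but "e" does not match "en".
    """
    sheet = sheet_tag.lower().split("-")
    element = element_tag.lower().split("-")
    return element[:len(sheet)] == sheet


def best_match(element_tag: str, sheet_tags: Iterable[str]) -> Optional[str]:
    """Pick the style-sheet tag that best matches element_tag.

    The candidate with the most sub-tags wins, which means an exact
    match is preferred over a more general one. Returns None when no
    style-sheet tag matches at all.
    """
    candidates = [tag for tag in sheet_tags if matches(tag, element_tag)]
    if not candidates:
        return None
    return max(candidates, key=lambda tag: len(tag.split("-")))
```

With style-sheet entries for "en", "en-us" and "fr", an element marked
"en-US" selects "en-us", while an element marked "en-GB" falls back to
plain "en".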
Received on Tuesday, 25 March 1997 11:22:29 UTC