- From: Mark Davis ⌛ <mark@macchiato.com>
- Date: Tue, 1 Sep 2009 14:54:05 -0700
- To: Richard Ishida <ishida@w3.org>
- Cc: public-i18n-core@w3.org
- Message-ID: <30b660a20909011454k7c16f2bbpd89995ea2d75b531@mail.gmail.com>
A quick pass.

> If you want to know how to create a language subtag, you should read Choosing a language tag <http://www.w3.org/International/articles/language-tags/qa-choosing-language-tags>. This article provides an overview of the syntax for language tags as described in BCP 47.

link fails.

> Note that the HTML specification still recommends <http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1.1> the use

Add version; although still being worked on, a lot of people are implementing HTML5.

> language-extlang-script-region-variant-extension-privateuse

variants-extensions

In the tool you point to, <http://people.w3.org/rishida/utils/subtags/index.php?find=dimli&submit=Find&lookup=&list=0&check=>, you say:

> "[image: Note.] zza is a macrolanguage. You should consider whether you can find a more specific language subtag for your purposes. This macrolanguage encompasses diq kiu <http://people.w3.org/rishida/utils/subtags/index.php?lookup=diq%20kiu%20&submit=Look+up>."

If you have that warning, you need to have the other warning on diq (arb, ...). That is:

> "[image: Note.] diq is encompassed by the macrolanguage 'zza'. You should consider whether you can use the more general language subtag for your purposes."

> The language tag can be used on its own, but unless there is some convention about its meaning in the context where it is used, it is not necessarily precise enough. For example, zh means Chinese, but it covers many Chinese dialects, often mutually incomprehensible. It is only where a convention is applied that zh or zh-CN can be considered to represent the Mandarin form of Chinese.

=>

The macrolanguage tag can be used on its own, but note that it may not be sufficiently precise in some environments. In some circumstances you will want to use a more precise code. For example, zh means Chinese, and in theory it covers many Chinese dialects, often mutually incomprehensible. In practice, most implementations will interpret it as simply the predominant form: Mandarin. If you are using "zh" to represent a language which is *not* Mandarin, such as Hakka Chinese, you are better off using the explicit code "hak".

> As RFC 4646 co-author, Addison Phillips, writes, "For virtually any content that does not use a script tag today, it remains the best practice not to use one in the future".

I disagree with that. The better advice is:

You should not use a script code if the predominant usage of the language is with a single script, and you don't need a contrast to remove ambiguity. For example, either Latin or Cyrillic is appropriate for use with uz, because of 'a'. As another example, where audio and written content need to be distinguished, one can use "en-Latn" for written content and "en-Zxxx" for audio content.

On Tue, Sep 1, 2009 at 12:53, Richard Ishida <ishida@w3.org> wrote:

> Chaps,
>
> I've been working on a new version that reflects the changes in RFC 5646.
>
> http://www.w3.org/International/articles/language-tags/temp.php
>
> Please take a look and let me know if you have any comments so far.
>
> Addison, we should probably discuss this on Wednesday, and any comments on
> the choosing language tags article too.
>
> Thanks,
> RI
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/International/
> http://rishida.net/

Received on Tuesday, 1 September 2009 21:54:43 UTC