- From: Richard Ishida <ishida@w3.org>
- Date: Wed, 7 Apr 2004 19:14:41 +0100
- To: <aphillips@webmethods.com>, <www-international@w3.org>
> From: Addison Phillips [wM] [mailto:aphillips@webmethods.com]
>
> This debate about where to position script tags took place
> some months ago on the ietf-languages list. You might want to
> review the archives of that list.

I'd much prefer to get a summary of the conclusions, for which many
thanks ;-)

> Basically the subtags of a language tag modify the base
> language and not each other (insofar as that is possible). It
> is the case that they form an implied hierarchy, so
> 'zh-Hant-TW' is more specific than 'zh-Hant', although
> Rfc3066bis, like RFCs 3066 and 1766 before it, makes clear
> that this hierarchy may not have intrinsic meaning (that the
> 'next step up' may not be mutually intelligible).
>
> So the basic question about where to put the script tag in
> the language tag revolved around whether script or region was
> more general as a classification. That is, are "texts written
> in Traditional Chinese" a superset of "texts written in/for
> Taiwan", or the other way around?

Hmm. I tend to see it as somewhat orthogonal, which is why in the
format I proposed way back I separated it out as, for example,
"zh-TW/Hant" - ie. "<lang+dialect>/<script>" - which btw would also
allow you to say "/Hant" (ie. 'I know it uses the Traditional Chinese
script, but I don't know what language') as well as match easily with
existing zh-TW. This also has the merit of not claiming that script is
an aspect of language.

> Ultimately, language tags and their limited hierarchy are an
> approximation of language. They cannot possibly capture all
> of the nuance implied by something as organic as human
> language. But the question is whether they can capture the
> details closely enough to suit the vast majority of applications.
>
> In other words, the addition of script codes allows us to
> identify languages in the corner case where writing system
> variations enter into the equation, long a problem in
> identifying Chinese among other languages. It was clear
> early on that in order to have deterministic parsing, the
> position and length of each subtag must be fixed. Furthermore
> we didn't want to create a whole messy nimbus of tags with
> very very similar semantics ('zh-TW-Hant' vs. 'zh-Hant-TW')
> and no way to choose between them.
>
> RFC3066bis adds additional slots (two of them) to the
> existing ontology, but does so in the most conservative way possible.
>
> Best Regards,
>
> Addison
>
> Addison P. Phillips
> Director, Globalization Architecture
> webMethods | Delivering Global Business Visibility
> http://www.webMethods.com
> Chair, W3C Internationalization (I18N) Working Group
> Chair, W3C-I18N-WG, Web Services Task Force
> http://www.w3.org/International
>
> Internationalization is an architecture.
> It is not a feature.
>
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org]On Behalf Of Richard Ishida
> > Sent: Wednesday 7 April 2004 07:48
> > To: www-international@w3.org
> > Subject: Traditional Chinese in RFC3066 bis
> >
> > Tex raised a question on IRC about how to represent Traditional
> > Chinese in RFC3066 bis[1]:
> >
> > "I thought zh-TW would become zh-hant-TW, not just zh-hant"
> >
> > My understanding is that if you want to say just Traditional Chinese
> > (without specifying which specific language you are
> > representing) you would continue to use zh-Hant. This would probably
> > apply for most web pages or translations.
> >
> > The meaning of zh-Hant-TW is not clear to me. Does it mean a
> > Taiwanese version of the script (as opposed, say, to a Hong Kong
> > version of Traditional Chinese which may include different
> > characters); or does it mean the language of Chinese as spoken in
> > Taiwan and written with Traditional Chinese characters? If so, how
> > would that differ from zh-TW?
> >
> > I could imagine the latter scenario being represented more intuitively
> > as zh-TW-Hant, although I think that is not legal according to RFC
> > 3066 bis.
> >
> > Addison, Mark, help!
> >
> > RI
> >
> > ============
> > Richard Ishida
> > W3C
> >
> > contact info:
> > http://www.w3.org/People/Ishida/
> >
> > W3C Internationalization:
> > http://www.w3.org/International/
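
To make the "fixed position and length" point above concrete, here is a
minimal Python sketch of how a parser might tell the subtags apart by
length and order alone. It is an illustration only, not the grammar from
the draft: it assumes a 2-3 letter language subtag, an optional 4-letter
script subtag, and an optional region subtag (2 letters or 3 digits), and
it ignores variants, extensions, private-use and grandfathered tags. The
names parse_tag, LangTag and fallback_chain are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Simplified, assumed shape: language[-script][-region], in that order only.
_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_tag(tag: str) -> Optional[LangTag]:
    """Split a tag into (language, script, region), or return None if the
    tag does not fit the simplified shape assumed above."""
    m = _TAG.fullmatch(tag)
    if m is None:
        return None
    script = m.group("script")
    region = m.group("region")
    return LangTag(
        m.group("language").lower(),        # conventional casing: zh
        script.title() if script else None,  # conventional casing: Hant
        region.upper() if region else None,  # conventional casing: TW
    )


def fallback_chain(tag: str) -> List[str]:
    """List the tag followed by each shorter prefix, most specific first.
    This is only the implied hierarchy; a shorter tag is not guaranteed
    to be mutually intelligible with the longer one."""
    parts = tag.split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


if __name__ == "__main__":
    for t in ("zh-Hant-TW", "zh-Hant", "zh-TW", "zh-TW-Hant"):
        print(t, "->", parse_tag(t))
    print(fallback_chain("zh-Hant-TW"))
```

Because script is recognised purely by being four letters in the second
slot, 'zh-TW' and 'zh-Hant-TW' both parse unambiguously, while
'zh-TW-Hant' is rejected under this sketch rather than becoming a second
spelling of the same thing.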
Received on Wednesday, 7 April 2004 14:14:39 UTC