
RE: Language Identifier List up for comments

From: Peter Constable <petercon@microsoft.com>
Date: Tue, 14 Dec 2004 14:22:38 -0800
Message-ID: <F8ACB1B494D9734783AAB114D0CE68FE04856CAD@RED-MSG-52.redmond.corp.microsoft.com>
To: <www-international@w3.org>

> From: www-international-request@w3.org [mailto:www-international-request@w3.org]
> On Behalf Of Elizabeth J. Pyatt

> Previously, the language codes have been used to encode both script
> and language. I was assuming the characters embedded would convey
> which script is being used.

ISO 639 language IDs have never been used to identify script
distinctions. It is true that some users or implementations of RFC 1766
or RFC 3066 have sometimes used the region subtag to imply script
distinctions (e.g. zh-TW for Traditional Chinese), but this is not
recommended.

It is not sufficient to suppose that the characters contained in text
content can be relied upon to identify the writing system of the text:
that does not provide any means to specify in a query what writing
system is desired.
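
To illustrate the point about queries, here is a minimal sketch in
Python. It assumes a hypothetical tag convention in which a four-letter
alphabetic subtag names the script (e.g. "Hant") and a two-letter
subtag names the region; that convention is an assumption made for this
example and is not defined by RFC 1766 or RFC 3066. The function names
split_tag and wants_script are likewise invented for illustration.

```python
def split_tag(tag):
    """Split a hyphen-separated tag into language, script, and region.

    Assumption for this sketch only: after the first subtag, a
    four-letter alphabetic subtag is a script and a two-letter
    alphabetic subtag is a region. Anything else is ignored.
    """
    subtags = tag.split("-")
    parts = {"language": subtags[0].lower(), "script": None, "region": None}
    for sub in subtags[1:]:
        if len(sub) == 4 and sub.isalpha() and parts["script"] is None:
            parts["script"] = sub.title()
        elif len(sub) == 2 and sub.isalpha() and parts["region"] is None:
            parts["region"] = sub.upper()
    return parts


def wants_script(tag, script):
    """Return True only if the tag states the requested script explicitly."""
    return split_tag(tag)["script"] == script.title()
```

Under these assumptions, wants_script("zh-TW", "Hant") is False, because
the tag names only a region and any script would have to be guessed,
whereas wants_script("zh-Hant-TW", "Hant") is True because the request
states the script outright.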



Peter Constable
Microsoft Corporation
Received on Tuesday, 14 December 2004 22:22:52 GMT
