RE: Language Identifier List up for comments

> From:
> On Behalf Of Elizabeth J. Pyatt

> Previously, the language codes have been used to encode both script
> and language. I was assuming the characters embedded would convey
> which script is being used.

ISO 639 language IDs have never been used to identify script
distinctions. It is true that some users or implementations of RFC 1766
or RFC 3066 have sometimes used the region subtag to infer script
distinctions (e.g. zh-TW for Traditional Chinese), but this is not
recommended.

It is not sufficient to suppose that the characters contained in text
content can be relied upon to identify the writing system of the text:
that does not provide any means to specify in a query what writing
system is desired.
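
To illustrate the point about queries, here is a minimal Python sketch.
It assumes tags that carry an explicit four-letter ISO 15924 script
subtag (for example zh-Hant), which RFC 1766 and RFC 3066 do not
themselves define. The function names parse_tag and matches are
hypothetical, and the parsing is deliberately simplified; it is not a
full tag grammar.

```python
def parse_tag(tag):
    """Split a language tag into language, script, and region parts.

    Simplified assumption: a 4-letter alphabetic subtag is a script,
    a 2-letter alphabetic or 3-digit subtag is a region. Anything else
    (variants, extensions) is ignored in this sketch.
    """
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha():
            if result["script"] is None:
                result["script"] = sub.title()
        elif (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            if result["region"] is None:
                result["region"] = sub.upper()
    return result


def matches(query, content_tag):
    """Return True if every part the query specifies is present and equal
    in the content tag. Nothing is inferred from the region."""
    q = parse_tag(query)
    c = parse_tag(content_tag)
    for key in ("language", "script", "region"):
        if q[key] is not None and q[key] != c[key]:
            return False
    return True
```

Under these assumptions, a query for zh-Hant matches content tagged
zh-Hant-TW but not content tagged only zh-TW, since the region by itself
states nothing definite about the script.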

Peter Constable
Microsoft Corporation

Received on Tuesday, 14 December 2004 22:22:52 UTC