Re: Language Identifier List up for comments

From: A. Vine <andrea.vine@Sun.COM>
Date: Tue, 14 Dec 2004 11:45:29 -0800
To: "Elizabeth J. Pyatt" <ejp10@psu.edu>
Cc: www-international@w3.org, ietf-languages@alvestrand.no
Message-id: <41BF42D9.8050606@sun.com>

Elizabeth J. Pyatt wrote:

> For written language, this is not normally an issue because the 
> phonetics are not represented. Therefore a single code of "zh" is 
> adequate. 

No, no, no, PLEASE don't use "zh" alone!  "zh" alone is so meaningless 
from both the computer and the human perspective when referring to an 
actual text!  I have kept silent up till now, with a wary eye for this.

The lone "zh" has caused us so many problems. I urge you to spread the 
word: don't use it alone unless you are sooooooooo clueless about the 
text you are labeling that all you know is it's some kind of Chinese.  
And if that's the case, maybe you shouldn't be the one labeling the 
text...

At a minimum it's really helpful to know whether it's Simplified or 
Traditional, because that may affect the font chosen for rendering (take, 
for example, a situation where the machine config has a Traditional-only 
font as the default and the text is in Simplified).  But beyond rendering, 
if software is trying to pick text from a language preference list, "zh" 
really messes us up.  "zh" is much more generic than "en".  From a 
matching perspective, we tend to assume that "zh" really means 
"Simplified Chinese rendering of Mandarin as used in the PRC", but that 
is not the intention of the "zh" identifier.
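
To make the matching problem concrete, here is a rough sketch of a 
prefix-style lookup over a preference list.  This is only an 
illustration, not how any particular product does it: the function name 
pick_text and the sample tags ("zh-TW", "zh-CN") are assumptions chosen 
for the example.

```python
def pick_text(preferences, available):
    """Return the first available tag that matches the preference list.

    A preference matches an available tag when the two are equal
    (ignoring case) or when the preference is a prefix of the tag
    followed by "-".  Illustrative prefix matching only.
    """
    for pref in preferences:
        pref_l = pref.lower()
        for tag in available:
            tag_l = tag.lower()
            if tag_l == pref_l or tag_l.startswith(pref_l + "-"):
                return tag
    return None


# Hypothetical reader who wants Traditional Chinese (Taiwan), then English.
user_prefs = ["zh-TW", "en"]

# Precisely labeled text: the reader gets the intended match.
print(pick_text(user_prefs, ["zh-CN", "zh-TW", "en"]))  # zh-TW

# Text labeled with bare "zh": "zh-TW" does not match "zh", so the
# reader falls through to English even if the text was Traditional.
print(pick_text(user_prefs, ["zh", "en"]))  # en

# A bare "zh" preference matches whichever Chinese variant is listed first.
print(pick_text(["zh"], ["zh-TW", "zh-CN"]))  # zh-TW
```

In the second call the bare "zh" label carries no information the 
matcher can use, and in the third the result depends only on list 
order, which is exactly the kind of guess I would rather software not 
have to make.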

Received on Tuesday, 14 December 2004 19:40:47 UTC
