Re: Language Identifier List up for comments

From: Mark Davis <mark.davis@jtcsv.com>
Date: Thu, 16 Dec 2004 07:45:08 -0800
Message-ID: <01c601c4e386$3b33f480$edd0399d@sanjose.ibm.com>
To: "Richard Ishida" <ishida@w3.org>, "'Tex Texin'" <tex@xencraft.com>, <www-international@w3.org>
Cc: "'IETF Languages'" <ietf-languages@iana.org>, <www-international@w3.org>

> Then there's the question: what are we doing with this page?  Describing
> current usage or recommending best practices?  If the latter, perhaps zh-CN
> and zh-TW should only appear on the page if clearly marked as edge cases.

John's description was "plausible". That is a much broader criterion than
either current usage or best practices. On that account, all of the
following qualify, and should be included.

> zh-CN, zh-HK, zh-MO, zh-SG, zh-TW,
> zh-Hans, zh-Hans-CN, zh-Hans-SG,
> zh-Hant, zh-Hant-HK, zh-Hant-MO, zh-Hant-TW

Mark
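
[Editorial illustration, not part of the original message.] The tags listed above all follow a language[-script][-region] shape, so it may help to see them split into their subtags. The following is a minimal sketch under stated assumptions: the names split_tag and group_by_script are hypothetical, and the pattern is a deliberate simplification that only accepts a 2-3 letter language, an optional 4-letter script, and an optional 2-letter region. It does not handle tags such as zh-guoyu or zh-yue mentioned later in the thread.

```python
import re

# Simplified pattern (an assumption for illustration, not the full tag grammar):
# language, then an optional 4-letter script, then an optional 2-letter region.
_TAG = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{4}))?(?:-([A-Za-z]{2}))?$")


def split_tag(tag):
    """Split a tag such as 'zh-Hans-CN' into (language, script, region).

    Missing subtags are returned as None. Casing is normalised to the
    conventional form: 'zh', 'Hans', 'CN'.
    """
    match = _TAG.match(tag)
    if match is None:
        raise ValueError("tag not covered by this simplified pattern: " + tag)
    language, script, region = match.groups()
    return (
        language.lower(),
        script.title() if script else None,
        region.upper() if region else None,
    )


def group_by_script(tags):
    """Group tags by their script subtag; tags without one fall under None."""
    groups = {}
    for tag in tags:
        _, script, _ = split_tag(tag)
        groups.setdefault(script, []).append(tag)
    return groups


# The twelve tags from the message above.
TAGS = [
    "zh-CN", "zh-HK", "zh-MO", "zh-SG", "zh-TW",
    "zh-Hans", "zh-Hans-CN", "zh-Hans-SG",
    "zh-Hant", "zh-Hant-HK", "zh-Hant-MO", "zh-Hant-TW",
]

if __name__ == "__main__":
    for script, members in group_by_script(TAGS).items():
        print(script, members)
```

Grouping by script makes the point of the thread visible: the five region-only tags (zh-CN and so on) say nothing explicit about script, whereas the zh-Hans and zh-Hant families do.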

----- Original Message ----- 
From: "Richard Ishida" <ishida@w3.org>
To: "'Tex Texin'" <tex@xencraft.com>; <www-international@w3.org>
Cc: "'IETF Languages'" <ietf-languages@iana.org>; <www-international@w3.org>
Sent: Thursday, December 16, 2004 07:14
Subject: RE: Language Identifier List up for comments


>
>
> > Since there are only two tags for CN, zh-CN and zh-hans-CN,
> > would those who argue for not overdifferentiating tags,
> > recommend just the simpler zh-CN?
> > Similarly for TW, just zh-TW?
>
> What does zh-CN mean?
>
> It is most commonly used as far as I'm aware to indicate text written in the
> Simplified Chinese script.  For identification of the script I think we
> should recommend zh-Hans first these days - although we need to add caveats
> about the fact that some applications won't recognise it (eg. for automatic
> application of fonts in Unicode encoded Web pages on some browsers (see
> http://www.w3.org/International/tests/results/lang-and-cjk-font). There are
> not a huge number of applications, as far as I'm aware.)
>
> Use of zh-CN doesn't seem to make sense for identifying spoken Chinese,
> since there are many dialects in China.  I think one should recommend
> zh-guoyu, zh-yue, etc. for this purpose.
>
> Note also that Mandarin, Cantonese, Hakka, etc are spoken in many parts of
> the world.  My expectation is that the use of CN would only be appropriate
> if one wanted to explicitly make the point that one was referring to the
> language as spoken in Mainland China - ie. that there is some particular
> characteristic of the instance of text or audio recording that was
> idiosyncratic to that particular area as a whole.
>
> And now what does zh-TW mean?  Well, usually text written in Traditional
> Chinese script, although we could repeat much of what I wrote above
> about zh-CN for this too.  zh-TW taken literally means the Chinese spoken in
> Taiwan - which happens to be Mandarin.  So unless you have particular
> distinguishing features in mind, perhaps, again, you should just use
> zh-guoyu.
>
> Then there's the question: what are we doing with this page?  Describing
> current usage or recommending best practices?  If the latter, perhaps zh-CN
> and zh-TW should only appear on the page if clearly marked as edge cases.
>
>
>
> Btw, what does de-CH represent in the table?  Swiss German is different from
> de-DE, and rarely written, and then has little consistency to its
> orthography.  There are also many local variants to Swiss German across
> Switzerland, which would seem to invite a large number of additions to this
> table.  But presumably de-CH refers to the way de-DE German is written in
> Switzerland or spoken by newsreaders there (and there are a small number of
> significant differences here from de-DE.)?  If so, we ought to clarify that
> in the table.
>
> I think this kind of process could be applied to many other parts of the
> second table, which worries me.  I can't help thinking that it might be
> better to talk through some examples of when to use en and when to use en-GB
> or en-US, talk through the choices for particular problem areas like Chinese
> and Swiss German, and so on, rather than to just list these combinations,
> most of which you could determine pretty easily anyway if you gave what you
> were doing a small amount of thought and had access to a list of country
> codes.
>
> What might be more useful is to say, here is the simplest form to identify
> this language (eg. 'en'), and in the next column are a bunch of potential
> country or other codes you may want to consider using in conjunction with
> this.  Rather than, "This table lists the languages" and " require a
> language subtag and country subtag".
>
> RI
>
>
Received on Thursday, 16 December 2004 15:45:19 GMT
