Re: Language Identifier List up for comments from Tex Texin on 2004-12-14 (www-international@w3.org from October to December 2004)

From: Tex Texin <tex@xencraft.com>
Date: Tue, 14 Dec 2004 11:02:30 -0800
To: Richard Ishida <ishida@w3.org>
CC: www-international@w3.org, ietf-languages@alvestrand.no
Message-ID: <41BF38C6.E8B7C5BC@xencraft.com>

I agree, and that's why we need to provide more guidance than we have done to
date.

Richard Ishida wrote:
> 
> Comments:
> 
> [1] For Chinese: What about zh-Hans and zh-Hant?  What about the IANA stuff
> like zh-hakka, etc.?
> 
> [2] What if I just want to say "This is Turkish - but I don't know which
> dialect"?  The list makes it seem like I *need* to choose one of the country
> variants.
> 
> [3] Is there a big enough difference between en-GB and, say, en-FK that I
> should need to distinguish between the two?
> 
> [4] I'm not clear about the value of the list.  A list like this suggests to
> me that things can be looked up here without a great deal of thought.  I'm
> not convinced that that is true.  And once one applies a little thought
> about the most appropriate label to use, it is hardly difficult to come up
> with the appropriate country code.  Perhaps there would be a minimal value
> in helping find some of the country codes you might need, but then I would
> organise the information slightly differently.
> 
> [5] I think the choice of language code also depends on the intended usage.
> That is very hard to predict, of course.  If one is simply applying a
> different font to English text embedded in an Arabic document, then I think
> labelling with subcodes is overkill.  If labelling English text for use with
> a spell checker, a distinction between en-US and en-GB is typically useful
> because spell checkers for English tend to take that distinction into
> account - whether that applies for all variants of other languages is not
> clear to me.  If dealing with a text to speech application that can
> distinguish accents such as en-GB-scouse, then a higher level of detail is
> needed than that given in the table. If dealing with Accept-Language
> declarations, then you must declare both en and en-GB/en-US in a browser,
> otherwise you won't always get the results you expected. I think the table
> over-simplifies the question.  I'll concede that the answer to the question
> is very difficult to produce, but my concern is that the table seems to be
> offering a solution, by fiat, that is not always correct, and doesn't say
> that clearly enough.
> 
> [6] typo: Lingala uses an upper case 'I'
> 
> RI
> 
> ============
> Richard Ishida
> W3C
> 
> contact info:
> http://www.w3.org/People/Ishida/
> 
> W3C Internationalization:
> http://www.w3.org/International/
> 
> Publication blog:
> http://people.w3.org/rishida/blog/
> 
> 
> 
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org] On Behalf Of Tex Texin
> > Sent: 14 December 2004 10:43
> > To: www-international@w3.org
> > Cc: www-international@w3.org; ietf-languages@alvestrand.no
> > Subject: Language Identifier List up for comments
> >
> >
> > http://www.i18nguy.com/unicode/language-identifiers.html
> >
> > I will add caveats and expand the list to be both one level
> > and two level as we go along.
> >
> > I am in a busy patch, so comment now, but I won't make many
> > updates until the weekend.
> >
> > tex
> >
> >
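
To make the Accept-Language remark in point [5] concrete, here is a minimal
sketch of the prefix-matching rule for language ranges described in RFC 2616
(section 14.4): a range matches a tag if it equals the tag, or equals a prefix
of the tag that is followed by "-". The function names and the strict server
with no fallback are assumptions made for illustration only; real servers
often add fallback rules of their own. The region subtag is written as GB,
the ISO 3166 code for the United Kingdom.

```python
def parse_accept_language(header):
    """Return the language ranges of an Accept-Language value, best q first."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(";")
        lang = parts[0].strip().lower()
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((lang, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    ordered = sorted(ranges, key=lambda pair: -pair[1])
    return [lang for lang, q in ordered if q > 0]


def range_matches(language_range, tag):
    """Prefix rule: 'en' matches 'en' and 'en-GB'; 'en-GB' does not match 'en'."""
    language_range = language_range.lower()
    tag = tag.lower()
    return (
        language_range == "*"
        or tag == language_range
        or tag.startswith(language_range + "-")
    )


def pick_language(header, available):
    """Hypothetical strict server: first available tag matched by the best range."""
    for language_range in parse_accept_language(header):
        for tag in available:
            if range_matches(language_range, tag):
                return tag
    return None


if __name__ == "__main__":
    available = ["en", "fr"]
    print(pick_language("en-GB", available))
    print(pick_language("en-GB, en;q=0.8", available))
```

Under these assumptions a browser that sends only "en-GB" gets no match
against a resource labelled plain "en", while one that sends both "en-GB" and
"en" does, which is the behaviour the comment warns about.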

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
                         
XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------

Received on Tuesday, 14 December 2004 19:02:42 UTC