Re: Language Identifier List up for comments from Tex Texin on 2004-12-17 (www-international@w3.org from October to December 2004)

From: Tex Texin <tex@xencraft.com>
Date: Fri, 17 Dec 2004 01:12:28 -0800
To: Martin Duerst <duerst@w3.org>
CC: www-international@w3.org, ietf-languages@alvestrand.no
Message-ID: <41C2A2FC.EB95A72C@xencraft.com>
1) I am not sure about moving to just the ietf-languages list. Several
messages were posted only to www-international, and some of the people who
commented are not subscribed to ietf-languages.
However, I'll honor the request and discuss only on ietf-languages. If people
want to subscribe, there is a link to the list archives on my page, and that
will take you to the subscription info.
http://www.i18nguy.com/unicode/language-identifiers.html

2) Yes, much more can be said about what the list represents and how it should
be used etc.
Let's see what we conclude and then fix the text, unless it is hampering
fleshing out the tables.
I am fine with your suggested text, and several others have proposed various
caveats and viewpoints.
I don't think there is agreement yet about how the list should be used or what
it means, so I would rather first discuss what the recommendations, if any,
should be before I invest the time in collaborative editing.

3) We can remove the secondary tags in table 1 at some future time. For now, it
helps at least me if not others to identify the likely source and recognize
cases where it is used elsewhere differently. I am not worried about confusing
people today, as the only people interacting with the page are people who are
interested and understand the context.
Anyone who has followed the thread, or who understands the word "draft" in big
letters, would know not to treat it as an official recommendation.

4) I used the non-breaking hyphens so the tags wouldn't be broken in line
wrapping. When the content settles down, I'll address the hyphens.
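
As a minimal sketch of the hyphen cleanup in point 4, assuming the only
problem character is U+2011 (NON-BREAKING HYPHEN, written as &#x2011; in the
page source), a tag can be made safe to cut and paste like this. The function
name normalize_tag is hypothetical and is not part of the page or of any
standard.

```python
# Sketch only: assumes U+2011 is the sole non-ASCII hyphen used in the tags.
NON_BREAKING_HYPHEN = "\u2011"


def normalize_tag(tag: str) -> str:
    """Replace non-breaking hyphens with the ASCII hyphen-minus (U+002D)."""
    return tag.replace(NON_BREAKING_HYPHEN, "-")


if __name__ == "__main__":
    # "ar" + U+2011 + "DZ" becomes the plain ASCII tag.
    print(normalize_tag("ar\u2011DZ"))
```

Tags that already use the ASCII hyphen pass through unchanged, so the function
can be applied to the whole table without checking each entry first.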

5) Yes, it is a significant problem that many of the languages (en, es, de,
and others) have useless region codes mixed in with regions where the language
is genuinely distinct. How shall we decide these? I don't question your
assessment for de, but I would like to know how to get answers for the
remaining languages. If I want to ask a Bengali expert, for example, which of
the listed subcodes are useful, what criteria should I give them to answer the
question?

6) For the codes that you suggest should be "retired", what should content
providers from those regions use for labels?
What does one use if the author is from Belgium or Denmark?
Do they become de, or de-AT, or de-DE, or de-CH?
I assume they shouldn't just be retired, but grouped together and assigned a
common label, just as many English speakers are put into either the en-US or
the en-GB camp despite being from one of the other English-speaking regions.

That's the question that I wanted to answer when I created the list.
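
One possible answer to the question in point 6, sketched here purely as an
assumption and not as an agreed recommendation, is to fall back to the bare
language code whenever a region is not on the recommended list. The set below
contains only the three German tags Martin suggests keeping; the name
label_for and the fallback rule itself are hypothetical.

```python
# Sketch only: the recommended set holds the three German tags from Martin's
# comment; the fallback-to-language rule is an assumption, not a decision.
RECOMMENDED = {"de-AT", "de-CH", "de-DE"}


def label_for(tag: str, recommended: set = RECOMMENDED) -> str:
    """Return the tag if it is recommended, else just its language part."""
    lang, _, region = tag.partition("-")
    lang = lang.lower()
    if not region:
        return lang
    candidate = f"{lang}-{region.upper()}"
    return candidate if candidate in recommended else lang


if __name__ == "__main__":
    for tag in ("de-BE", "de-DK", "de-at", "de"):
        print(tag, "->", label_for(tag))
```

Under this rule an author in Belgium or Denmark would label German content as
plain de. Grouping such regions under one of the recommended tags instead
would need an explicit mapping table, which is exactly the information the
list does not yet have.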

OK, from here on out we'll discuss on ietf-languages.
tex


Martin Duerst wrote:
> 
> At 19:43 04/12/14, Tex Texin wrote:
>  >
>  >http://www.i18nguy.com/unicode/language-identifiers.html
> 
> Some comments on the current version of the document:
> 
> - I think there has been enough cross-posting. I suggest we all
>    limit further posts to ietf-languages@alvestrand.no.
>    Please direct followups only to that list.
> 
> - The intro says "The guidelines for choosing between language
>    and language-country are ambiguous." and then goes on as if
>    complete clarity would eventually be reached. I think it's
>    important to say that this document is here to help (once it's
>    more complete), but that it's ultimately the tagger's decision,
>    and that in many cases, there may be choices left.
> 
> - "Proposed List of 1-level Language Identifiers": Why on earth
>    are two-level codes given when it says that one-level codes
>    are the right thing to use? Please, please, don't confuse
>    the readers with such stuff, and remove the country codes
>    from the identifiers as quickly as possible.
> 
> - Lots of codes have the wrong hyphens, e.g. ar&#x2011;DZ in source
>    code. This isn't an exercise in typographic subtlety; if people
>    can't cut-and-paste these codes, or paste garbage, that's a problem.
> 
> - The two-level list needs a lot of work. For German, for example,
>    out of "de-AT, de-BE, de-CH, de-DE, de-DK, de-LI, de-LU", I'd
>    probably only leave de-at, de-ch, and de-de.
> 
> - It would probably be good to reorganize that list, to separate
>    two-level codes that are recommended from those that aren't
>    recommended.
> 
> Regards,     Martin.

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
                         
XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
Received on Friday, 17 December 2004 09:12:33 UTC