Re: How to language tag language tags?

From: Kent Karlsson <kent.karlsson14@telia.com>
Date: Thu, 05 Jan 2012 16:21:44 +0100
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, 'WWW International' <www-international@w3.org>
CC: Stephane Bortzmeyer <bortzmeyer@nic.fr>
Message-ID: <CB2B7E98.1C8E7%kent.karlsson14@telia.com>

On 2012-01-05 12:07, "Leif Halvard Silli" wrote:

> If I wanted to create an HTML version of the language subtag registry,

You're not the first to have a similar idea. See

> then I would tag the entire registry with lang="en" (<html lang="en">),

Most of the names and all comments are (sort of) in English. Though it
might be hard to argue that (e.g.) "ǁani" (miswritten as "//Ani") is
really English.

In addition, some of the language (alternative) names ("descriptions")
are definitely not English, e.g. "finlandssvenskt teckenspråk" and
"suomenruotsalainen viittomakieli". Don't ask me why the ISO 639-3
registry included these particular non-English alternative names, when
it has not done so in general.
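
A minimal sketch of how an HTML rendering might handle such names,
assuming a hand-maintained override table: the registry does not record
which language a description is written in, so DESCRIPTION_LANG and
render_description are hypothetical names, and the "sv"/"fi" tags are
my own reading of the two names above.

```python
from html import escape

# Hypothetical, hand-maintained overrides. The registry itself does not say
# which language a Description value is written in.
DESCRIPTION_LANG = {
    "finlandssvenskt teckenspråk": "sv",
    "suomenruotsalainen viittomakieli": "fi",
}


def render_description(text, default_lang="en"):
    """Return an HTML fragment for one Description value."""
    lang = DESCRIPTION_LANG.get(text, default_lang)
    if lang == default_lang:
        # Same language as the page, so inherit it from <html lang="en">.
        return escape(text)
    # Otherwise tag just this name with its own language.
    return '<span lang="{}">{}</span>'.format(lang, escape(text))
```

Names without an override simply inherit the page language, so only the
exceptions need to be listed.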

Further, the entire registry is in a strictly controlled formal
language, even though the field names are English words. Compare
most programming languages (and HTML for that matter). Even though
most key words, and even most variable names are "sort of English",
a computer program in programming language so-and-so isn't in English.
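
To illustrate the "formal language" point, here is a rough sketch that
reads one record purely by its "Field: value" syntax, with no knowledge
of English. It uses the "aa" entry quoted further down; parse_record is
a hypothetical helper, and it deliberately ignores details of the real
registry format such as folded continuation lines.

```python
def parse_record(text):
    """Parse one 'Field: value' record into a list of (field, value) pairs.

    A list is used rather than a dict because a field such as
    Description may occur more than once in a record.
    """
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a 'Field: value' line: {!r}".format(line))
        fields.append((name.strip(), value.strip()))
    return fields


record = """\
Type: language
Subtag: aa
Description: Afar
Added: 2005-10-16
"""

fields = parse_record(record)
```

The field names happen to be English words, but the parser treats them
as opaque keys.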

> while each entry in the registry perhaps could look like this:
> <hr/>
> <p>Type: language<br/>
>    Subtag: <dfn lang="zxx">aa</dfn><br/>
>    Description: Afar<br/>
>    Added: 2005-10-16</p>
> Question: Do you agree with the choice of language tag for the <dfn>
> element around the very language tag?

I'm not sure a language subtag is a "definition term" (<dfn>) at all...
To me a "definition term" should be a term, i.e. something used in a
natural language (or may occur inline in a text in a natural language).
I would include abbreviations/acronyms, but language subtags aren't

    /Kent K

> The choice of 'zxx' ('No linguistic content') is based on
> <http://people.w3.org/rishida/utils/subtags/index.php?find=linguist&submit=Find>,
> which lists "machine-readable data files consisting of machine
> languages or character codes, programming source code, etc" as an example
> of things which could use this tag.
> Leif H Silli
Received on Thursday, 5 January 2012 15:22:32 GMT
