Re: URIs for languages

Hi Bernard, Gerard, (and now Lars),

Thanks for the pointers. It seems like we are better off pointing
directly to lexvo if we want URIs that will

1) enable us to precisely and unambiguously refer to any official
language (including, for example, Cantonese)

2) provide the name of the language in many languages (potentially
useful for search indexes and labels in applications).

However, URIs that are not PURLs have a longevity problem (see the
full explanation at http://sharedname.org ). A neutral namespace that
can be redirected when domain names change is the most effective way
to get a persistent URI, one that won't carry historical artifacts
after a 'name brand'-based domain name changes (as history has
repeatedly shown). So, ideally, an organization with long-term
governance (not bound to a single project) would maintain a namespace
such as http://sharedname.org/lang/ whose redirects could be switched
from lexvo to future-lexvo domains/URLs.
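
To make the redirect idea concrete, here is a minimal sketch in Python
of what such a resolver would compute. It is only an illustration under
assumptions: http://sharedname.org/lang/ is the example namespace
suggested above (not an existing service), the function name resolve()
and the future-lexvo.example domain are made up, and the lexvo target
pattern is taken from the URIs quoted below.

```python
# Sketch of PURL-style resolution: data keeps the neutral URI, and only
# the redirect target changes when the hosting project moves.

# Assumption: the neutral namespace proposed in this message (hypothetical).
PERSISTENT_BASE = "http://sharedname.org/lang/"

# Current redirect target, following the lexvo URI pattern quoted in the thread.
LEXVO_BASE = "http://lexvo.org/id/iso639-3/"


def resolve(persistent_uri, target_base=LEXVO_BASE):
    """Return the URL a resolver would redirect persistent_uri to."""
    if not persistent_uri.startswith(PERSISTENT_BASE):
        raise ValueError("not in the persistent namespace: " + persistent_uri)
    code = persistent_uri[len(PERSISTENT_BASE):]
    if not code:
        raise ValueError("missing language code")
    return target_base + code


# Today the persistent URI for Cantonese (ISO 639-3 code "yue") points at lexvo.
today = resolve("http://sharedname.org/lang/yue")

# If lexvo moved, only the target base would change (made-up domain);
# every published http://sharedname.org/lang/... URI stays the same.
later = resolve("http://sharedname.org/lang/yue",
                target_base="http://future-lexvo.example/id/iso639-3/")
```

A real PURL service would do this with an HTTP redirect rather than a
function call, but the point is the same: the mapping lives in one
place and the URIs in everyone's data never need to be rewritten.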

[Lars - your message came in just as I was about to press <send>. I'm
confused by your reply. What about the problems with LOC lang ids that
Gerard pointed out? Is that what you meant by "If only they could do
ISO 3166 countries as well..."?]

Best,
Scott

On Thu, Feb 16, 2012 at 8:21 PM, Gerard de Melo <gdemelo@mpi-inf.mpg.de> wrote:
> Hi Bernard,
>
>
> I think now we should forget about URIs published by pioneer projects such
> as OASIS TC, lingvoj.org and lexvo.org, and stick to URIs published by
> genuine authority Library of Congress which is as close to the primary
> source as can be. So if you want to use a URI for Ancient Greek as defined
> by ISO 639-2, please use http://id.loc.gov/vocabulary/iso639-2/grc.
>
> BTW Lars Marius, hello, what do you think? URIs at id.loc.gov are really
> what we were dreaming to achieve in 2001, right?
>
>
> Now of course I may be a bit biased here, but I do not believe that the
> id.loc.gov service solves
> all of the problems. This is from the Lexvo.org FAQ [1]:
>
> The advantage of using those URIs is that they are maintained by the Library
> of Congress. However, there are also several issues to consider. First of
> all, ISO 639-2 is orders of magnitude smaller than ISO 639-3 and for example
> lacks an adequate code for Cantonese, which is spoken by over 60 million
> speakers.
> More importantly, the LOC's URIs do not describe languages per se but rather
> describe code-mediated conceptualizations of languages. This implies, for
> instance, that the French language (<http://lexvo.org/id/iso639-3/fra>) has
> two different counterparts at the LOC,
> <http://id.loc.gov/vocabulary/iso639-2/fra> and
> <http://id.loc.gov/vocabulary/iso639-2/fre>, which each have slightly
> different properties.
> Finally, connecting your data to Lexvo.org's information is likely to be
> more useful in practical applications. It offers information about the
> languages themselves, e.g. where they are spoken, while the LOC mostly
> provides information about the codes, e.g. when the codes were created and
> updated and what kind of code they are.
> In practice, you can also use both codes simultaneously in your data.
> However, you need to be very careful to make sure that you are asserting
> that a publication is written in French rather than in some concept of
> French created on January 1, 1970 in the United States.
>
>
> Best,
> Gerard
>
> [1] http://www.lexvo.org/linkeddata/faq.html
>
> --
> Gerard de Melo [demelo@icsi.berkeley.edu]
> http://www.icsi.berkeley.edu/~demelo/

Received on Friday, 17 February 2012 09:45:58 UTC