Re: URIs for languages

From: Gerard de Melo <gdemelo@mpi-inf.mpg.de>
Date: Thu, 16 Feb 2012 11:21:22 -0800
Message-ID: <4F3D5732.7090304@mpi-inf.mpg.de>
To: Bernard Vatant <bernard.vatant@mondeca.com>
CC: "M. Scott Marshall" <mscottmarshall@gmail.com>, Barry Norton <barry.norton@ontotext.com>, public-lod@w3.org, Lars Marius Garshol <larsga@ontopia.net>
Hi Bernard,

> I think now we should forget about URIs published by pioneer projects
> such as OASIS TC, lingvoj.org and lexvo.org, and stick to URIs
> published by genuine authority Library of Congress which is as close
> to the primary source as can be.
> So if you want to use a URI for Ancient Greek as defined by ISO 639-2, 
> please use http://id.loc.gov/vocabulary/iso639-2/grc.
>
> BTW Lars Marius, hello, what do you think? URIs at id.loc.gov are
> really what we were dreaming to achieve in 2001, right?

Now of course I may be a bit biased here, but I do not believe that the
id.loc.gov service solves all of the problems. This is from the
Lexvo.org FAQ [1]:
> The advantage of using those URIs is that they are maintained by the 
> Library of Congress. However, there are also several issues to 
> consider. First of all, ISO 639-2 is orders of magnitude smaller than 
> ISO 639-3 and for example lacks an adequate code for Cantonese, which 
> is spoken by over 60 million speakers.
> More importantly, the LOC's URIs do not describe languages per se but 
> rather describe code-mediated conceptualizations of languages. This 
> implies, for instance, that the French language 
> (<http://lexvo.org/id/iso639-3/fra>) has two different counterparts at 
> the LOC, <http://id.loc.gov/vocabulary/iso639-2/fra> and 
> <http://id.loc.gov/vocabulary/iso639-2/fre>, which each have slightly 
> different properties.
> Finally, connecting your data to Lexvo.org's information is likely to 
> be more useful in practical applications. It offers information about 
> the languages themselves, e.g. where they are spoken, while the LOC 
> mostly provides information about the codes, e.g. when the codes were 
> created and updated and what kind of code they are.
> In practice, you can also use both codes simultaneously in your data. 
> However, you need to be very careful to make sure that you are 
> asserting that a publication is written in French rather than in some 
> concept of French created on January 1, 1970 in the United States.
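
To make the fra/fre point concrete, here is a minimal Python sketch that
maps either LOC URI onto the single Lexvo.org URI for the language itself.
The helper name loc_to_lexvo and the small B_TO_T table are assumptions
made for illustration; neither site provides them. The sketch also assumes
the ISO 639-2 code denotes an individual language, since collective codes
have no ISO 639-3 counterpart.

```python
# Hypothetical helper for illustration; not provided by Lexvo.org or the LOC.
LOC_PREFIX = "http://id.loc.gov/vocabulary/iso639-2/"
LEXVO_PREFIX = "http://lexvo.org/id/iso639-3/"

# ISO 639-2 bibliographic codes whose terminologic (and ISO 639-3) code
# differs. Illustrative subset only, not the complete list.
B_TO_T = {
    "fre": "fra",  # French
    "ger": "deu",  # German
    "dut": "nld",  # Dutch
}


def loc_to_lexvo(loc_uri):
    """Map a LOC ISO 639-2 code URI to a Lexvo.org ISO 639-3 language URI.

    Assumes the code denotes an individual language.
    """
    if not loc_uri.startswith(LOC_PREFIX):
        raise ValueError("not an id.loc.gov ISO 639-2 URI: " + loc_uri)
    code = loc_uri[len(LOC_PREFIX):]
    # Bibliographic codes are normalized; every other code is kept as-is.
    return LEXVO_PREFIX + B_TO_T.get(code, code)


if __name__ == "__main__":
    for code in ("fra", "fre"):
        print(loc_to_lexvo(LOC_PREFIX + code))
```

Both LOC URIs for French resolve to the same Lexvo.org URI here, so a
statement such as "this publication is written in French" can point at the
language rather than at one of its two code records.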

Best,
Gerard

[1] http://www.lexvo.org/linkeddata/faq.html

-- 
Gerard de Melo [demelo@icsi.berkeley.edu]
http://www.icsi.berkeley.edu/~demelo/
Received on Thursday, 16 February 2012 19:22:05 UTC
