
Re: URIs for languages

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Thu, 16 Feb 2012 19:14:15 +0100
Message-ID: <CAK4ZFVF1QOiQZDxvyR69bsQRQqW2S3hwTJuoDLrEFXPTdP71WA@mail.gmail.com>
To: "M. Scott Marshall" <mscottmarshall@gmail.com>
Cc: Barry Norton <barry.norton@ontotext.com>, public-lod@w3.org, Lars Marius Garshol <larsga@ontopia.net>
Hi all

As creator and curator of lingvoj.org, I think I can give some explanations
of such mysteries :)

Lingvoj.org published URIs for a fairly arbitrary set of languages; the gory
details of the story, which started in 2007, can be found on the lingvoj.org
main page.
To be in line with BCP 47 and, e.g., the values used in xml:lang attributes,
the lingvoj.org URIs are based on ISO 639-1 (two-letter) codes when available.
For Japanese the lingvoj.org URI is therefore http://www.lingvoj.org/lang/ja,
which has actually been redirected to http://lexvo.org/id/iso639-3/jpn since
2010, for reasons explained on the same page.
For Ancient Greek there is no two-letter code, hence the three-letter code
"grc" is used (it is the same in ISO 639-2 and ISO 639-3 in this case).
Lexvo.org URIs are all based on ISO 639-3 three-letter codes, which is simpler.
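
To make the two naming schemes described above concrete, here is a minimal
Python sketch. The mapping table is a small hand-written sample, and the
function names are hypothetical; neither comes from lingvoj.org or lexvo.org.
Only the URI patterns are taken from the examples in this message.

```python
# Hypothetical, hand-written sample of ISO 639-3 -> ISO 639-1 mappings.
# Ancient Greek (grc) has no two-letter code, so it is left out on purpose.
ISO639_3_TO_1 = {"jpn": "ja", "eng": "en", "spa": "es", "ara": "ar"}


def lingvoj_uri(code3):
    """Build a lingvoj.org-style URI: two-letter code when one exists,
    otherwise fall back to the three-letter code."""
    code = ISO639_3_TO_1.get(code3, code3)
    return "http://www.lingvoj.org/lang/" + code


def lexvo_uri(code3):
    """Build a lexvo.org-style URI, always from the ISO 639-3 code."""
    return "http://lexvo.org/id/iso639-3/" + code3


if __name__ == "__main__":
    for code3 in ("jpn", "grc"):
        print(lingvoj_uri(code3), lexvo_uri(code3))
```

This also shows why http://www.lingvoj.org/lang/jpn does not follow the
scheme: Japanese has a two-letter code, so the lingvoj.org URI ends in "ja".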

Now this is part of a story which started even earlier, more than ten years
ago, in the OASIS Published Subjects Technical Committee, with URIs such as
http://psi.oasis-open.org/iso/639/#grc (BTW still in use inside Mondeca
software). Lars Marius Garshol, the editor, is in cc.

I think we should now forget about URIs published by pioneer projects such
as the OASIS TC, lingvoj.org and lexvo.org, and stick to URIs published by a
genuine authority, the Library of Congress, which is as close to the primary
source as can be. So if you want to use a URI for Ancient Greek as defined
by ISO 639-2, please use http://id.loc.gov/vocabulary/iso639-2/grc.
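
For the original use case (stating in an RDF triple which language a document
is written in), a minimal Python sketch could look like this. It assumes the
id.loc.gov pattern above holds for the other ISO 639-2 codes as well, uses
dcterms:language as one possible predicate, and takes a made-up example.org
document URI; the function names are hypothetical.

```python
# Base of the Library of Congress ISO 639-2 URIs, as in the example above.
LOC_ISO639_2 = "http://id.loc.gov/vocabulary/iso639-2/"
# Dublin Core Terms "language" property, assumed here as the predicate.
DCTERMS_LANGUAGE = "http://purl.org/dc/terms/language"


def loc_language_uri(code):
    """Build an id.loc.gov URI from an ISO 639-2 three-letter code."""
    return LOC_ISO639_2 + code


def language_triple(document_uri, code):
    """Return one N-Triples line linking a document to its language."""
    return "<%s> <%s> <%s> ." % (
        document_uri, DCTERMS_LANGUAGE, loc_language_uri(code))


if __name__ == "__main__":
    # http://example.org/doc/1 is a placeholder, not a real document.
    for code in ("grc", "ara", "eng", "spa"):
        print(language_triple("http://example.org/doc/1", code))
```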

BTW Lars Marius, hello, what do you think? URIs at id.loc.gov are really
what we were dreaming of achieving in 2001, right?

Bernard



2012/2/16 M. Scott Marshall <mscottmarshall@gmail.com>

> I was planning to give the example URI for the Japanese language
> (stemming out of work at the Biohackathon 2011):
> http://lexvo.org/id/iso639-3/jpn
>
> BTW, I wasn't able to use the simpler URI scheme below for jpn as you
> had done with grc:
> http://www.lingvoj.org/lang/jpn
> ?
>
> -Scott
>
> On Thu, Feb 16, 2012 at 5:26 PM, Barry Norton <barry.norton@ontotext.com>
> wrote:
> >
> > http://www.lingvoj.org/lang/grc
> >
> > Barry
> >
> >
> >
> >
> > On 16/02/2012 16:15, Jordanous, Anna wrote:
> >
> > Hi LOD list,
> >
> > I am looking for URIs to use to represent particular languages
> > (primarily Ancient Greek, Arabic, English and Spanish). This is to
> > represent what language a document is written in, in an RDF triple. I
> > thought it would be obvious how to refer to the language itself, but I
> > am struggling.
> >
> > I would like to use something like the ISO 639 standard for languages. To
> > distinguish between Ancient Greek and Modern Greek, I have to use the
> > ISO-639-2 set of language codes. http://www.loc.gov/standards/iso639-2/
> > (The codes are grc and gre respectively)
> >
> > http://downlode.org/Code/RDF/ISO-639/ is an RDF representation of ISO
> > 639 but it doesn’t include Ancient Greek as it only includes ISO-639-1
> > languages.
> >
> > As far as I see, I have the following options e.g. for Arabic
> > Use the
> > http://www.loc.gov/standards/iso639-2/php/langcodes_name.php?code_ID=22
> > http://www.loc.gov/standards/iso639-2/php/langcodes-keyword.php?SearchTerm=ara&SearchType=iso_639_2
> > http://www.loc.gov/standards/iso639-2#ara
> >
> >
> > This really must be simpler – what am I missing? Any comments welcomed.
> > Thanks for your help
> > anna
> >
> > ---
> > Anna Jordanous
> > Research Associate
> > Centre for e-Research
> > King's College London
> > Tel: +44 (0) 20 7848 1988
> >
> >
> >
> >
> >
>
>
>
> --
> M. Scott Marshall
> http://staff.science.uva.nl/~marshall
>
>


-- 
Bernard Vatant
Vocabularies & Data Engineering
Tel :  + 33 (0)9 71 48 84 59
Skype : bernard.vatant
Linked Open Vocabularies <http://labs.mondeca.com/dataset/lov>

--------------------------------------------------------
Mondeca
3 cité Nollez 75018 Paris, France
www.mondeca.com
Follow us on Twitter : @mondecanews <http://twitter.com/#%21/mondecanews>
Received on Thursday, 16 February 2012 18:15:02 UTC
