Re: [open-linguistics] Re: ISO 639 URIs from Robert Forkel on 2020-07-08 (public-ontolex@w3.org from July 2020)

From: Robert Forkel <xrotwang@googlemail.com>
Date: Wed, 8 Jul 2020 13:33:33 +0200
To: Christian Chiarcos <christian.chiarcos@web.de>
Cc: Gilles Sérasset <Gilles.Serasset@univ-grenoble-alpes.fr>, open-linguistics <open-linguistics@googlegroups.com>, Linked Data for Language Technology Community Group <public-ld4lt@w3.org>, "public-ontolex@w3.org" <public-ontolex@w3.org>
Message-ID: <CAJhx5RfGt_RasiPum-4RZArP-HMgmzvXtghkVUmp85vGQa8kpw@mail.gmail.com>

Regarding the license terms for the ISO 639-3 code tables: This weird
"the product, system, or device does not provide a means to
redistribute the code set." clause is basically what kept me from
including the ISO code tables in
https://github.com/glottolog/glottolog - although our curation
software downloads and uses these to validate the Glottolog data. If
it were not for this, Glottolog might be a place with some sort of
institutional support that could provide resolvable URLs for all ISO
codes. We are working towards having complete coverage of ISO 639-3 -
even if this might mean "not assessed yet" or "bookkeeping" status for
the associated Glottolog languoid.

On Wed, Jul 8, 2020 at 1:24 PM Christian Chiarcos
<christian.chiarcos@web.de> wrote:
>
> On .07.2020, at 11:46, Gilles Sérasset
> <Gilles.Serasset@univ-grenoble-alpes.fr> wrote:
>
> > Hi Christian, hi all,
> >
> > Wouldn’t it be nice if the lexvo.org domain was managed by a group of
> > persons from the LLOD area to provide linked data on the languages that
> > would be an aggregation of all the datasets you mentioned, along with
> > all “sameAs” relations?
>
> Definitely. It might find support in this community (certainly mine), and
> as you describe it, it would not even be a big effort to create. But the
> question is how to make it sustainable and keep it alive (maintained
> and funded) in the long run.
>
> > This solution will involve a dedicated team of maintainers (in the long
> > run) and a rather small infrastructure to provide the data (which could
> > be simply served from static files + content negotiation).
>
> I think it would also require some kind of organizational commitment to
> keep it alive on a technical level. This would be one of the strengths of
> IANA or (maybe) SIL. There may be other alternatives to these, though.
>
> > It assumes that the generation of URIs and accompanying data can be made
> > entirely automatically (which may not be the case if there are name
> > clashes among these).
>
> ISO 639 codes should not clash
> (https://www.loc.gov/standards/iso639-2/iso639jac.html).
>
> > It also assumes that the different dataset licences allow for it (which
> > I am unsure about regarding SIL…).
>
> The terms of use (https://iso639-3.sil.org/code_tables/download_tables)
> permit commercial and non-commercial use with attribution and without
> modification, but require that "the product, system, or device does not
> provide a means to redistribute the code set."
>
> I am not sure what this means. Clearly lexvo and the datahub ISO tables
> provide a means to reconstruct the full code set, but apparently that
> hasn't been an issue in the last 10 years, partly because these are not
> verbatim copies.
>
> > I also think that such an alternate dataset may be necessary for other
> > people who need more information attached to the languages they deal
> > with (e.g. date annotations for historical languages, geographical
> > (space/time) annotation for all languages, etc.).
>
> Absolutely. Glottolog has been a great step in this direction for minority
> languages, but for historical languages, nothing really exists.
> But maybe let's separate the discussion of extending ISO 639 data (which
> is necessary on many dimensions) from the question of how to create
> sustainable identifiers. I could imagine existing organizations taking
> care of just providing an RDF view on ISO 639-3 data, but everything
> beyond that probably requires external funding (and of course, this is
> something we can work towards, too).
>
> Best,
> Christian

Received on Wednesday, 8 July 2020 11:33:57 UTC