- From: Gannon Dick <gannon_dick@yahoo.com>
- Date: Thu, 17 Apr 2014 16:05:25 -0700 (PDT)
- To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>, Linking Open Data <public-lod@w3.org>, Bernard Vatant <bernard.vatant@mondeca.com>
Perhaps you want to take my suggestion for handling codes ...
http://lists.w3.org/Archives/Public/public-lod/2014Apr/0105.html
The codes are a shorthand for links and labels. By using a lookup table with 1461 (possibly duplicate) entries, you can create a map (of synthetic bi-annual versions) that keeps the labels in sync with the permalinks and updates automatically. The problem is that if you pare down the valid code list on a per-application basis, the abilities of the applications diverge.
For example, ET (upper case) is the Country Code for Ethiopia (http://id.loc.gov/vocabulary/countries/et) and et (lower case) is the ISO-639-1 code for Estonian (http://id.loc.gov/vocabulary/iso639-1/et). There are no semantics involved, just a lot of confusion when everybody knows only their favorite 10 codes of each.
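One way to picture the ET/et collision is as a lookup keyed on the code set as well as the code. The following is only a minimal sketch: the base URIs are taken from the two links above, while the names BASE and code_to_uri and the "country"/"language" labels are hypothetical and not part of any existing library.

```python
# Base URIs copied from the two example links in this message.
# The scheme labels "country" and "language" are assumptions for illustration.
BASE = {
    "country": "http://id.loc.gov/vocabulary/countries/",
    "language": "http://id.loc.gov/vocabulary/iso639-1/",
}


def code_to_uri(scheme: str, code: str) -> str:
    """Build a link from a (scheme, code) pair.

    The scheme disambiguates the code; letter case alone does not.
    """
    if scheme not in BASE:
        raise KeyError(f"unknown code set: {scheme!r}")
    return BASE[scheme] + code.lower()


print(code_to_uri("country", "ET"))   # Ethiopia
print(code_to_uri("language", "et"))  # Estonian
```

The same two letters resolve to different links only because the code set is stated explicitly, which is the forethought the applications would need to share.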
There is an initial problem of using a single code set to begin with, but going forward, it would be worth the effort to fix inconsistencies. RDF can be quite a mess without some forethought.
--Gannon
--------------------------------------------
On Wed, 4/16/14, Bernard Vatant <bernard.vatant@mondeca.com> wrote:
Subject: Incorrect lang tags Re: Princeton WordNet RDF
To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>, "Linking Open Data" <public-lod@w3.org>
Date: Wednesday, April 16, 2014, 4:56 PM
John
Looking at the data in more detail, it appears that the lang tags systematically use ISO 639-2 codes (three-letter codes), even when an ISO 639-1 code exists and should be used, as per BCP 47.
See e.g., http://www.w3.org/RDF/Validator/rdfval?URI=http%3A%2F%2Fwordnet-rdf.princeton.edu%2Fwn31%2F109637345-n.rdf
The W3C validator is right, except when it is not up to date with the latest ISO 639 values, as in:
Error: {W116} ISO-639 does not define language: 'zsm'.[Line = 53, Column = 50]
Nope, there is such a code in ISO 639-3 :)
See http://www.lingvoj.org/languages/tag-zsm.html
and source http://www-01.sil.org/iso639-3/documentation.asp?id=zsm
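The fix being asked for can be sketched as a small normalization step: replace a three-letter primary subtag with its two-letter ISO 639-1 equivalent when one exists, and leave codes such as zsm alone. This is a rough illustration only. The table below is a tiny hand-picked subset, and shorten_lang_tag is a hypothetical name; a real fix would build the table from the ISO 639 or IANA registry data.

```python
# Illustrative subset only (assumption): a full table would come from
# the ISO 639 code lists or the IANA Language Subtag Registry.
ALPHA3_TO_ALPHA2 = {
    "eng": "en",
    "fra": "fr",
    "deu": "de",
    "est": "et",
}


def shorten_lang_tag(tag: str) -> str:
    """Swap a three-letter primary subtag for its two-letter form if known.

    Anything after the first hyphen is passed through unchanged, and
    codes without a two-letter form (e.g. 'zsm') are kept as they are.
    """
    primary, sep, rest = tag.partition("-")
    primary = primary.lower()
    return ALPHA3_TO_ALPHA2.get(primary, primary) + sep + rest


for tag in ("eng", "est", "zsm", "fra-CA"):
    print(tag, "->", shorten_lang_tag(tag))
```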
Hope you can fix this easily!
Bernard
2014-04-16 15:30 GMT+02:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:
Princeton University, in collaboration with the Cognitive Interaction Technology Excellence Center of Bielefeld University, is proud to announce the first RDF version of WordNet 3.1, now available at:
http://wordnet-rdf.princeton.edu/
This version, based on the current development of the WordNet project, intends to be a nucleus for the Linguistic Linked Open Data cloud and the global WordNet projects. The data are accessible in five formats (HTML+RDFa, RDF/XML, Turtle, N-Triples and JSON-LD) as well as by querying a SPARQL endpoint.
The model is itself based on the lemon model and follows the guidelines of the W3C OntoLex Community Group.
We have incorporated direct links to the previous W3C WordNets, UBY, Lexvo.org, VerbNet as well as translations collected by the Open Multilingual WordNet Project. Furthermore, we include links within the resource for previous versions of WordNets to further enable linking. We are interested in incorporating any resources that are linked to WordNet and would greatly appreciate suggestions.
Regards,
John P. McCrae, Christiane Fellbaum & Philipp Cimiano
--
Bernard Vatant
Vocabularies & Data Engineering
Tel: +33 (0)9 71 48 84 59
Skype: bernard.vatant
http://google.com/+BernardVatant
--------------------------------------------------------
Mondeca
35 boulevard de Strasbourg 75010 Paris
www.mondeca.com
Follow us on Twitter: @mondecanews
----------------------------------------------------------
Received on Thursday, 17 April 2014 23:05:55 UTC