Resolving ISSUE-26 (range of dcterms:language) from Richard Cyganiak on 2012-10-28 (public-gld-wg@w3.org from October 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Sun, 28 Oct 2012 17:26:45 +0100
To: public-gld-wg@w3.org
Message-Id: <E0D427F9-6ED7-436B-B91E-936E8EFD8414@cyganiak.de>

After further off-line discussion with Makx, Dave and Phil, I retract my earlier proposal to use xsd:language-datatyped literals as values for dcterms:language. Here is a new proposal:

[[
PROPOSAL: In DCAT-conformant data, values of dcterms:language MUST be members of some subclass, and SHOULD be ISO-639 URIs as defined by the Library of Congress in http://id.loc.gov/vocabulary/iso639-1.html and http://id.loc.gov/vocabulary/iso639-2.html . The iso639-1 codes should be preferred, and iso639-2 codes used only when no iso639-1 code is available for a language. This resolves ISSUE-26.
]]

The reasons are:

1. The value space of xsd:language is defined as a subset of the lexical space. This means that xsd:language-typed literals denote strings, not languages.

2. The Library of Congress is one of the registration authorities for ISO-639, and this gives them excellent credentials as maintainers of a URI scheme for ISO-639 codes.
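To illustrate the first reason, here is a minimal Python sketch. It assumes the lexical pattern for xsd:language as I understand it from XML Schema Part 2, and the helper name is_xsd_language is made up for this example. Because the value space is a subset of the lexical space, two literals are the same value only if their strings are the same, so "en" and "EN" would be two different values even though they name the same language.

```python
import re

# Assumed lexical pattern for xsd:language (per XML Schema Part 2).
XSD_LANGUAGE = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def is_xsd_language(lexical):
    """Return True if the string is in the (assumed) lexical space."""
    return XSD_LANGUAGE.fullmatch(lexical) is not None


a, b = "en", "EN"

# Both strings are valid xsd:language lexical forms ...
both_valid = is_xsd_language(a) and is_xsd_language(b)

# ... but the value of such a literal is the string itself, so the
# two literals are distinct values, not one shared "English" resource.
same_value = (a == b)
```

A URI, by contrast, can be taken to denote the language itself, which is what the proposal relies on.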

Here are some example statements, for English and Cheyenne:

    <xxx> dcterms:language <http://id.loc.gov/vocabulary/iso639-1/en>.

    <xxx> dcterms:language <http://id.loc.gov/vocabulary/iso639-2/chy>.
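The preference rule in the proposal (iso639-1 first, iso639-2 only as a fallback) could be sketched roughly as follows. This is only an illustration: the function names and the idea of passing both codes as arguments are assumptions of mine, and the URI pattern is simply inferred from the two examples above.

```python
# Base URIs inferred from the example statements above.
ISO639_1_BASE = "http://id.loc.gov/vocabulary/iso639-1/"
ISO639_2_BASE = "http://id.loc.gov/vocabulary/iso639-2/"


def language_uri(iso639_1=None, iso639_2=None):
    """Pick the URI for a language, preferring the iso639-1 code.

    Hypothetical helper: the caller supplies whichever codes exist
    for the language; no code lookup is done here.
    """
    if iso639_1:
        return ISO639_1_BASE + iso639_1.lower()
    if iso639_2:
        return ISO639_2_BASE + iso639_2.lower()
    raise ValueError("no ISO-639 code given")


def language_statement(subject, iso639_1=None, iso639_2=None):
    """Format one dcterms:language statement like the examples above."""
    uri = language_uri(iso639_1, iso639_2)
    return "<%s> dcterms:language <%s>." % (subject, uri)


# English has an iso639-1 code, so that one is used.
english = language_statement("xxx", iso639_1="en", iso639_2="eng")

# Cheyenne has no iso639-1 code, so the iso639-2 code is used.
cheyenne = language_statement("xxx", iso639_2="chy")
```

With these inputs, the two strings match the example statements for English and Cheyenne given above.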

Best,
Richard

Received on Sunday, 28 October 2012 16:32:13 UTC