Re: Classification of open datasets... from Peter Krantz on 2013-03-04 (public-egov-ig@w3.org from March 2013)

From: Peter Krantz <peter@peterkrantz.se>
Date: Mon, 4 Mar 2013 21:14:03 +0100
To: Phil Archer <phila@w3.org>
Cc: euopendata@lists.okfn.org, public-egov-ig <public-egov-ig@w3.org>
Message-ID: <CAGtW=MuNs8MPiT1r12QMNkYMQWmiNNs0OohOdAmhv9yeoWmZrQ@mail.gmail.com>

2013/3/4 Phil Archer <phila@w3.org>:
>
> You've kicked off a lot of discussion - did you get a satisfactory answer?

Yes, thank you, a lot of good feedback and a list of potential
candidates for dataset topic annotations. It seems that a mix of a big,
broad taxonomy and smaller national ones has been used in some cases.

When looking at use cases it is also important to make it easy to
classify datasets. Having a huge taxonomy to choose from may make it
difficult to use. Limiting the choice to top levels may make it
easier. If all we want to do is to facilitate discovery of similar
datasets, that may be good enough.
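
A rough sketch of the "limit the choice to top levels" idea, in Python.
The taxonomy here is made up for illustration (the concept names and the
BROADER mapping are assumptions, not taken from any real vocabulary). It
only shows that a deep concept can be rolled up to its top-level
ancestor, so discovery can still group similar datasets:

```python
# Hypothetical taxonomy: each concept maps to its broader (parent) concept.
# Top-level concepts have no entry. All names here are illustrative only.
BROADER = {
    "bus-timetables": "public-transport",
    "public-transport": "transport",
    "air-quality": "environment",
}


def top_level(concept, broader=BROADER):
    """Follow broader links until a concept with no parent is reached."""
    seen = set()
    while concept in broader:
        if concept in seen:  # guard against accidental cycles
            raise ValueError(f"cycle in taxonomy at {concept!r}")
        seen.add(concept)
        concept = broader[concept]
    return concept


def similar_datasets(datasets):
    """Group dataset names by the top-level concept of their topic."""
    groups = {}
    for name, topic in datasets.items():
        groups.setdefault(top_level(topic), []).append(name)
    return groups
```

With this, a dataset tagged "bus-timetables" and one tagged
"public-transport" both end up under "transport".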

On a side note, there seems to be a lack of simple tools to create your
own material in SKOS. I have only found a handful that seem usable
for non-tech people (e.g. iQvoc).


>
> The NACE codes - that describe company activity - are based on the UN's ISIC
> codes and it all gets turned into a country-specific set known as SIC codes
> here in the UK. What on Earth is a data publisher to do?

Yes, NACE looks pretty simple to use and probably covers a lot of
topics. The relation could be <dataset> ---created from the
activity--- <nace code>.


> I don't think there is a single answer. Creating a global "everyone should
> use this central list of enumerated terms" list is not the way forward.
> *However* it does seem entirely reasonable to me for a data consumer or
> service operator to say "this is the data I understand, please use
> controlled vocab lists A, B or C if you want me to understand you." And, in
> a similar vein maybe, something like: "you're free to use any of
> skos:prefLabel, rdfs:label and dcterms:title but in *my* application I treat
> them all the same."
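
As a minimal sketch of the "treat them all the same" approach, assuming
the data is already available as plain (subject, predicate, object)
tuples: the function name display_label and the preference order are my
own assumptions, and a real application would more likely use an RDF
library than bare tuples.

```python
# Label properties treated as interchangeable, in an assumed order of preference.
LABEL_PROPERTIES = (
    "http://www.w3.org/2004/02/skos/core#prefLabel",
    "http://www.w3.org/2000/01/rdf-schema#label",
    "http://purl.org/dc/terms/title",
)


def display_label(subject, triples, properties=LABEL_PROPERTIES):
    """Return the first label found for subject, or None if there is none.

    triples is an iterable of (subject, predicate, object) string tuples.
    """
    found = {}
    for s, p, o in triples:
        if s == subject and p in properties:
            found.setdefault(p, o)  # keep the first value seen per property
    for prop in properties:
        if prop in found:
            return found[prop]
    return None
```

A publisher can then use whichever of the three properties they prefer,
and the consumer still gets one label per resource.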


An idea could be to use a subset of your national Wikipedia data for
topics. Wikipedia is managed, and Wikipedia articles in one language
are linked to their counterparts in other languages. They also have
categories, which facilitate (some) hierarchy. Using Wikipedia
(DBpedia) links, it would be fairly easy to follow relations to other
datasets.
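
To illustrate what "following relations" could look like, here is a
small Python sketch. It assumes each dataset has already been annotated
with one or more topic URIs (for example DBpedia resource URIs). The
function name related_by_topic and the input shape are assumptions made
for the example, and no lookup against DBpedia itself is performed.

```python
from collections import defaultdict


def related_by_topic(dataset_topics):
    """Map each dataset to the other datasets sharing at least one topic URI.

    dataset_topics maps a dataset name to an iterable of topic URIs.
    """
    # Invert the mapping: topic URI -> set of datasets annotated with it.
    by_topic = defaultdict(set)
    for dataset, topics in dataset_topics.items():
        for topic in topics:
            by_topic[topic].add(dataset)

    # Every dataset is related to the others listed under the same topic.
    related = {dataset: set() for dataset in dataset_topics}
    for members in by_topic.values():
        for dataset in members:
            related[dataset] |= members - {dataset}
    return related
```

Since the topic URIs are shared across publishers, two portals that
never coordinated would still end up linked through the same topic.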

Regards,

Peter Krantz
http://www.peterkrantz.com
@peterkz_swe

Received on Monday, 4 March 2013 20:14:34 UTC