Re: Quality Criteria for SKOS vocabularies

From: Alistair Miles <alimanfoo@googlemail.com>
Date: Tue, 12 Apr 2011 11:18:53 +0100
To: Christian Mader <christian.mader@univie.ac.at>
Cc: public-esw-thes@w3.org
Message-ID: <20110412101853.GA3466@aliman-desktop>

Hi Christian,

On Mon, Apr 11, 2011 at 08:46:57AM +0200, Christian Mader wrote:
> Hi,
> 
> In the course of my PhD project at the University of Vienna I'm
> going to address the question how to programmatically support
> collaborative creation of "good-quality" SKOS vocabularies. I have
> found 14 criteria that, in my opinion, could be used to assess the
> quality of said vocabularies. It would be really helpful for me to
> get some community input on these criteria, so I published them
> here:
> 
> https://github.com/cmader/qSKOS/wiki/Quality-Criteria-for-SKOS-Vocabularies
> 
> Please feel free to post your comments and suggestions regarding
> that matter, every kind of input will be warmly appreciated.

This is a great idea, and really useful. A few thoughts on your draft...

VQC1 & VQC2 -- both great ideas, and could be very valuable when trying to get 
an initial sense of how structured a concept scheme is.

VQC3 -- yes, very useful in practice.

VQC6 -- this is a fascinating idea, and could probably form the basis for a 
substantial piece of work all by itself. If you could develop some ranking 
metrics, I think that would be extremely useful.

VQC7: Multiple Relations -- this seems ambiguous. Are you looking for cycles 
(which is already covered by VQC3) or are you looking for the presence of 
redundant triples? If the latter, then you could also ask what "quality" means, 
because for some people it might be more useful to have a vocabulary published 
*including* additional triples derived by various inference processes, whereas 
others might prefer to have the lean graph (i.e., no redundant triples), which 
could take different forms depending on what you choose to assert (e.g., some 
might assert skos:broader only, some might assert skos:narrower only). You might 
also want to know finer-grained details, e.g., does computing the closure of 
rdfs:subPropertyOf and rdfs:subClassOf add any new triples (this is highly 
relevant if the dataset uses class and/or property extensions to SKOS)? 
Or, does computing entailments that follow from semantic conditions S18-26 add 
any new triples? This is of practical value, e.g., when trying to query a 
dataset using SPARQL, because some queries will depend on the presence of these 
inferred triples (e.g., can I query using skos:broaderTransitive?), and the type 
of query you use will depend on what triples are asserted.
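
To make the "does inference add any new triples?" check concrete, here is a 
minimal sketch in plain Python. It is only an illustration under stated 
assumptions: triples are (subject, predicate, object) string tuples, the ex: 
names are made up, and only a small subset of the SKOS entailments for 
skos:broader and skos:narrower (inverse, sub-property and transitivity rules) 
is implemented. A real check would more likely run over an RDF store with a 
reasoner.

```python
# Sketch only: covers a subset of SKOS entailments for broader/narrower.
BROADER = "skos:broader"
NARROWER = "skos:narrower"
BROADER_TRANS = "skos:broaderTransitive"
NARROWER_TRANS = "skos:narrowerTransitive"


def infer(asserted):
    """Return the asserted triples plus the entailed ones (fixed point)."""
    graph = set(asserted)
    changed = True
    while changed:
        new = set()
        for s, p, o in graph:
            if p == BROADER:
                new.add((o, NARROWER, s))        # inverse property
                new.add((s, BROADER_TRANS, o))   # sub-property
            elif p == NARROWER:
                new.add((o, BROADER, s))
                new.add((s, NARROWER_TRANS, o))
            elif p == BROADER_TRANS:
                new.add((o, NARROWER_TRANS, s))
            elif p == NARROWER_TRANS:
                new.add((o, BROADER_TRANS, s))
        # Transitivity of the two *Transitive properties.
        for p in (BROADER_TRANS, NARROWER_TRANS):
            pairs = [(s, o) for s, q, o in graph if q == p]
            for a, b in pairs:
                for c, d in pairs:
                    if b == c:
                        new.add((a, p, d))
        changed = not (new <= graph)
        graph |= new
    return graph


def missing_entailments(asserted):
    """Triples that inference would add; empty if the graph is materialized."""
    return infer(asserted) - set(asserted)


# Hypothetical lean graph that asserts skos:broader only.
lean = {
    ("ex:cat", BROADER, "ex:mammal"),
    ("ex:mammal", BROADER, "ex:animal"),
}
missing = missing_entailments(lean)
```

For this lean graph, `missing` contains ("ex:cat", "skos:broaderTransitive", 
"ex:animal"), so a SPARQL query on skos:broaderTransitive would find nothing 
unless the entailments are materialized first.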

VQC8 -- I think I would split this into separate criteria, one for SKOS 
consistency (i.e., the data are consistent with the semantic conditions stated 
in the SKOS Reference), and one for SKOS vocabulary (i.e., the data do not use 
any invented terms within the SKOS namespace).
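
The "SKOS vocabulary" half of that split could be sketched as follows. This is 
a hedged illustration, not a definitive implementation: it assumes triples are 
tuples of full URI strings, the function name is hypothetical, and it would 
also flag a literal that happens to start with the SKOS namespace URI.

```python
# Sketch only: flags terms in the SKOS namespace that SKOS does not define.
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Local names of the classes and properties in the SKOS vocabulary.
SKOS_TERMS = {
    "Collection", "Concept", "ConceptScheme", "OrderedCollection",
    "altLabel", "broadMatch", "broader", "broaderTransitive", "changeNote",
    "closeMatch", "definition", "editorialNote", "exactMatch", "example",
    "hasTopConcept", "hiddenLabel", "historyNote", "inScheme",
    "mappingRelation", "member", "memberList", "narrowMatch", "narrower",
    "narrowerTransitive", "notation", "note", "prefLabel", "related",
    "relatedMatch", "scopeNote", "semanticRelation", "topConceptOf",
}


def unknown_skos_terms(triples):
    """Return every term in the SKOS namespace that is not a SKOS term."""
    found = set()
    for triple in triples:
        for term in triple:
            if term.startswith(SKOS_NS):
                if term[len(SKOS_NS):] not in SKOS_TERMS:
                    found.add(term)
    return found


# Hypothetical data: the second triple uses an invented property.
data = [
    ("http://example.org/cat", SKOS_NS + "broader", "http://example.org/mammal"),
    ("http://example.org/cat", SKOS_NS + "broaderGeneric", "http://example.org/animal"),
]
invented = unknown_skos_terms(data)
```

Here `invented` holds only the skos:broaderGeneric URI, which is not part of 
the SKOS vocabulary.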

That's all that springs to mind for now, hope that's helpful.

Cheers,

Alistair

> 
> Best,
> Christian
> 
> -- 
> Research Group Multimedia Information Systems
> Department of Distributed and Multimedia Systems
> Faculty of Computer Science
> University of Vienna
> 
> Postal Address: Liebiggasse 4/3-4, 1010 Vienna, Austria
> Phone: +43 1 4277 39623, Fax: +43 1 4277 39649
> E-Mail: christian.mader@univie.ac.at
> 
> 

-- 
Alistair Miles
Head of Epidemiological Informatics
Centre for Genomics and Global Health <http://cggh.org>
The Wellcome Trust Centre for Human Genetics
Roosevelt Drive
Oxford
OX3 7BN
United Kingdom
Web: http://purl.org/net/aliman
Email: alimanfoo@gmail.com
Tel: +44 (0)1865 287669
Received on Tuesday, 12 April 2011 10:19:25 GMT
