
Re: Quality Criteria for SKOS vocabularies

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Tue, 12 Apr 2011 23:27:57 +0200
Message-ID: <4DA4C3DD.9030804@few.vu.nl>
To: public-esw-thes@w3.org
Hi Christian, Simon,

Interesting stuff!

On the internal graph-related measures, you may want to add things like the depth and branchiness (average, max) of the broader/narrower hierarchy.
Lexical coverage, though basic, is also really interesting to measure: average number of altLabels and hiddenLabels, languages covered, etc.
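
To make these two kinds of measures a bit more concrete, here is a rough sketch in Python. It is only an illustration under assumptions of mine: the vocabulary is assumed to be already loaded into plain Python structures (a list of (concept, broader concept) pairs and a dict of labels per concept), the function names hierarchy_stats and label_coverage are hypothetical, and "depth" is taken to be the shortest distance from a concept without broader concepts. Other definitions are just as defensible.

```python
from collections import defaultdict, deque
from statistics import mean


def hierarchy_stats(broader_pairs):
    """Depth and branching of a broader/narrower hierarchy.

    broader_pairs: iterable of (concept, broader_concept) tuples
    (an assumed in-memory form of the skos:broader statements).
    """
    children = defaultdict(set)
    parents = defaultdict(set)
    concepts = set()
    for narrower, broader in broader_pairs:
        children[broader].add(narrower)
        parents[narrower].add(broader)
        concepts.update((narrower, broader))

    # Assumption: a root is any concept that has no broader concept.
    roots = [c for c in concepts if not parents.get(c)]

    # Breadth-first search gives the shortest distance from any root.
    # Concepts that sit only on a cycle are never reached and are skipped.
    depth = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)

    # Branching is counted only for concepts with at least one narrower concept.
    fanout = [len(kids) for kids in children.values()]
    return {
        "avg_depth": mean(depth.values()) if depth else 0,
        "max_depth": max(depth.values(), default=0),
        "avg_branching": mean(fanout) if fanout else 0,
        "max_branching": max(fanout, default=0),
    }


def label_coverage(labels):
    """Average altLabels/hiddenLabels per concept and languages covered.

    labels: dict mapping concept -> list of (kind, lang, text) tuples,
    where kind is "pref", "alt" or "hidden" (an assumed encoding).
    """
    n = len(labels)
    kinds = [kind for entries in labels.values() for kind, _, _ in entries]
    languages = sorted({lang for entries in labels.values() for _, lang, _ in entries})
    return {
        "avg_alt_labels": kinds.count("alt") / n if n else 0,
        "avg_hidden_labels": kinds.count("hidden") / n if n else 0,
        "languages": languages,
    }


# Toy data, invented purely for illustration.
pairs = [
    ("cat", "mammal"), ("dog", "mammal"),
    ("mammal", "animal"), ("bird", "animal"),
    ("siamese", "cat"),
]
toy_labels = {
    "cat": [("pref", "en", "cat"), ("alt", "en", "domestic cat"), ("alt", "de", "Hauskatze")],
    "dog": [("pref", "en", "dog"), ("hidden", "en", "dogg")],
}
stats = hierarchy_stats(pairs)
coverage = label_coverage(toy_labels)
```

For a real vocabulary the pairs and labels would of course have to be extracted from the RDF first, and poly-hierarchies or cycles may call for a different notion of depth than the shortest-path one used here.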

And of course many more abstract quality criteria come to mind: completeness of the conceptual coverage of the domain, correctness of relations wrt. more-or-less strict ontological criteria (à la Guarino for formal ontologies; or see the work of Simon Spero on LCSH for some good examples in the "softer" KOS domain [3]). But these are much more difficult to assess without manual evaluation!

@Simon: one good tool for checking [1,2] is http://validator.linkeddata.org/



[3]  http://dcpapers.dublincore.org/ojs/pubs/article/viewArticle/937

> Hi Christian,
> If different versions of a vocabulary are maintained then they could be used to measure changes to it. Quality criteria could be:
> - Up-to-dateness (when was the last change?)
> - Number of structural changes (more of these could decrease the quality because you can't rely on the vocabulary being stable)
> - Number of documentary changes (these would seem to indicate that the documentation gets fixed and completed so more are better)
> The last two should be seen relative to the amount of time between the changes.
> Maybe you could also take into account (if such data is available) how many editors collaborated on the vocabulary.
> For the Linked Data aspect maybe the W3C group notes [1] and [2] are relevant. I think I've seen a website somewhere that tests for these recipes but I can't remember where now. I've seen too many ontologies out there that are served with the wrong content type or aren't reachable by resolving their URIs and I think that's a very important aspect for SKOS vocabularies as well.
> A very basic requirement in a similar vein would be if the file is parsable at all, so if it has any syntax errors in the format being used. VQC8 is similar but I guess if the file isn't machine readable you can't even get as far as calculating that. :-)
> Do you think it would make sense to attach a weight to each criterion so that you can calculate an overall quality index? Or do you think this depends too much on the requirements of the vocabulary user?
> And do you want to list any criteria that aren't easily computable? Like how well researched and referenced the concepts are and how well they are grounded in reality or scientific research in the area.
> Regards,
> Simon
> [1] http://www.w3.org/TR/swbp-vocab-pub/
> [2] http://www.w3.org/TR/cooluris/
> Christian Mader wrote:
>> Hi,
>> In the course of my PhD project at the University of Vienna I'm going to address the question how to programmatically support collaborative creation of "good-quality" SKOS vocabularies. I have found 14 criteria that, in my opinion, could be used to assess the quality of said vocabularies. It would be really helpful for me to get some community input on these criteria, so I published them here:
>> https://github.com/cmader/qSKOS/wiki/Quality-Criteria-for-SKOS-Vocabularies
>> Please feel free to post your comments and suggestions regarding that matter, every kind of input will be warmly appreciated.
>> Best,
>> Christian
Received on Tuesday, 12 April 2011 21:26:31 UTC
