
Re: Proposal to assess the quality of Linked Data sources

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Fri, 25 Feb 2011 23:08:44 +0000
To: Kingsley Idehen <kidehen@openlinksw.com>
CC: Bernard Vatant <bernard.vatant@mondeca.com>, Annika Flemming <annika.flemming@gmx.de>, "<public-lod@w3.org>" <public-lod@w3.org>, Bob Ferris <zazi@elbklang.net>
Message-ID: <EMEW3|a9095ad2852566b8cc1880fa71988b2an1ON9F02hg|ecs.soton.ac.uk|306C7E04-9E71-4C3A-BC84-1EC754565115@ecs.soton.ac.uk>

On 25 Feb 2011, at 23:00, Kingsley Idehen wrote:

>> Hi Annika
>> 
>> - "A vocabulary is said to be established, if it is one of the 100 most popular vocabularies stated on prefix.cc" - uhm, as the results from Richard's evaluation have shown, this is quite arguable
>> It's a practical way to determine it (which I can use for the implementation of the formalism). Another way would be to compare many documents from many data sources and find out which vocabularies are most popular.
>> 
>> I'm particularly interested in this aspect of vocabulary selection. Regarding popularity, I fully agree with Bob about prefix.cc, where all sorts of biases can be introduced. I think popularity is better measured by the use of vocabularies in CKAN datasets, as indicated by "format-*" tags. See http://ckan.net/tag/?page=F and for example http://ckan.net/tag/format-bibo or http://ckan.net/tag/format-foaf.
> 
> Why not the actual link coefficient from an LOD Cloud cache instance? That at least shows what's being used :-)
There is no LOD Cloud cache instance as far as I can tell.
So any attempt to infer data from something that claimed to be one would be misleading.
Cheers
Hugh
> 
> Kingsley
>> 
>> Another approach I'm currently working on is the one you can find at http://labs.mondeca.com/dataset/lov. The description of interlinked vocabularies (using the VOAF vocabulary) provides an indication of popularity at the vocabulary level itself. From this dataset (still far from exhaustive, of course) you can see which vocabularies are reused, extended, or used for annotation by other ones. I think the density of links to and from a vocabulary gives a good indicator of its "establishment", in combination with the number of datasets actually using it.
>> 
>> Best
>> 
>> Bernard
>> 
>> 
>> -- 
>> Bernard Vatant
>> Senior Consultant
>> Vocabulary & Data Engineering
>> Tel:       +33 (0) 971 488 459
>> Mail:     bernard.vatant@mondeca.com
>> ----------------------------------------------------
>> Mondeca
>> 3, cité Nollez 75018 Paris France
>> Web:    http://www.mondeca.com
>> Blog:    http://mondeca.wordpress.com
>> ----------------------------------------------------
> 
> 
> -- 
> 
> Regards,
> 
> Kingsley Idehen	      
> President & CEO 
> OpenLink Software     
> Web: 
> http://www.openlinksw.com
> 
> Weblog: 
> http://www.openlinksw.com/blog/~kidehen
> 
> Twitter/Identi.ca: kidehen 
> 
> 
> 
> 
> 

-- 
Hugh Glaser,  
              Intelligence, Agents, Multimedia
              School of Electronics and Computer Science,
              University of Southampton,
              Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 78 9422 3822, Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Received on Friday, 25 February 2011 23:09:55 UTC
