Re: Visualizing LOD Linkage

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 02 Aug 2008 13:22:11 -0400
Message-ID: <489497C3.6020506@openlinksw.com>
To: Yves Raimond <yves.raimond@gmail.com>
CC: public-lod@w3.org

Yves Raimond wrote:
> On Sat, Aug 2, 2008 at 5:17 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>   
>> Yves Raimond wrote:
>>     
>>> Hello!
>>>
>>>
>>>       
>>>> I would like to suggest that publishers of new linked data spaces that
>>>> plug into the growing LOD include the following:
>>>>
>>>> 1. cross-link information
>>>>
>>>>         
>>> I would also suggest we find a better measure for interlinkage than a
>>> raw number of triples linking one dataset to another.
>>> For example, http://dbtune.org/musicbrainz/ creates its own identifier
>>> for languages (http://dbtune.org/musicbrainz/directory/language),
>>> which are owl:sameAs'ed to the corresponding languages in Lingvoj when
>>> applicable, whereas linkedmdb directly links to the Lingvoj
>>> identifiers. In the latter case, the raw number of interlinks will be
>>> higher, but could be reduced a lot by creating identifiers for
>>> languages and using owl:sameAs.
>>>
>>> The same applies for geographic locations, for example. Some datasets
>>> use foaf:based_near to link to Geonames, some others create their own
>>> identifiers, and then link to the corresponding Geonames locations
>>> through owl:sameAs. For the same dataset, these two methodologies will
>>> lead to completely different numbers.
>>>
>>> To boost the statistics of a dataset, we could simply link each person
>>> or group in them to http://dbpedia.org/class/yago/Entity100001740
>>> through rdf:type :-D
>>>
>>>       
>> Amen!
>>
>> And it also means we start to expose the fact that LOD is not an "instance
>> level only" linked data space (a sad misconception).
>>
>>     
>>> So I think we should agree on what we count as "interlinks" before
>>> publishing such statistics, so that we can actually use these values?
>>>
>>>       
>> We should basically express linkages across instance and schema/data
>> dictionary vectors. This also helps those looking to build LOD applications.
>>
>> Of course there is more to come re. the injection of "data dictionary /
>> schema" linkage aspects of LOD, but no harm in getting our thoughts in order
>> re. "best practices" for the growing cloud :-)
>>     
>>> My recommendation would be to always go for the lowest value - the one
>>> you'd obtain by creating your own identifiers and using owl:sameAs
>>> (which would be equivalent to the number of distinct external URIs
>>> mentioned in your dataset).
>>>
>>> What do you think?
>>>
>>>       
>> Good idea, so share your page as a nice example :-)
>>
>>     
>
> I just gave it a shot on Jamendo, counting the results of a SELECT
> DISTINCT query, and this is indeed a bit depressing.
> http://dbtune.org/jamendo/
> For example, the Geonames interlinking drops from 3244 to 289 :-)
>   
Smarts vs Size, which do you choose?  I find this elating :-)
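
To make the gap between the two counts concrete, here is a minimal sketch in
Python. It is not the query Yves ran: interlink_counts is a hypothetical
helper, and the artist URIs and Geonames targets are made-up illustrations.
It reports the raw number of link triples to an external host next to the
number of distinct external URIs behind them, which is roughly what a
SELECT DISTINCT over the link targets would give.

```python
from urllib.parse import urlparse


def interlink_counts(triples, local_host, target_host):
    """Return (raw, distinct) link counts from a dataset to one external host.

    raw:      triples with a local subject and an object URI on target_host
    distinct: distinct target_host URIs among those objects
    """
    raw = 0
    targets = set()
    for subject, _predicate, obj in triples:
        # Only count links that start in the local dataset.
        if urlparse(subject).netloc != local_host:
            continue
        # Literals and URIs on other hosts are ignored.
        if urlparse(obj).netloc == target_host:
            raw += 1
            targets.add(obj)
    return raw, len(targets)


# Hypothetical toy data: three artists, two of them linked to the same place.
FOAF_BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"
triples = [
    ("http://dbtune.org/jamendo/artist/1", FOAF_BASED_NEAR,
     "http://sws.geonames.org/2988507/"),
    ("http://dbtune.org/jamendo/artist/2", FOAF_BASED_NEAR,
     "http://sws.geonames.org/2988507/"),
    ("http://dbtune.org/jamendo/artist/3", FOAF_BASED_NEAR,
     "http://sws.geonames.org/2643743/"),
]

raw, distinct = interlink_counts(triples, "dbtune.org", "sws.geonames.org")
print(raw, distinct)  # prints: 3 2
```

The distinct figure does not change if the dataset mints its own place
identifiers and links them via owl:sameAs, which is why it is the more
comparable of the two numbers.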

Kingsley
> Some similar statistics from Musicbrainz at
> http://dbtune.org/musicbrainz/ , which I'll publish when I get some
> time to figure out how to tweak d2r templates :-)
>
> Distinct DBpedia albums - 22426
> Distinct DBpedia artists - 39877
> Distinct MySpace artists (on http://dbtune.org/myspace/) - 14668
> Distinct DBpedia countries - 245
> Distinct Lingvoj languages - 185
>
> Cheers!
> y
>
>
>   
>> Kingsley
>>     
>>> Cheers!
>>> y
>>>
>>>
>>>
>>>       
>>>> 2. cross-link visual derived from the LOD cloud diagram.
>>>>
>>>> The Linked Movies Database has nice examples of both [1].
>>>>
>>>> Links:
>>>>
>>>> 1. http://www.linkedmdb.org:8080/Main/Interlinking
>>>>
>>>> --
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>>>
>>>       
>
>   


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Saturday, 2 August 2008 17:22:48 UTC
