- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Sat, 02 Aug 2008 13:22:11 -0400
- To: Yves Raimond <yves.raimond@gmail.com>
- CC: public-lod@w3.org
Yves Raimond wrote:
> On Sat, Aug 2, 2008 at 5:17 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
>> Yves Raimond wrote:
>>
>>> Hello!
>>>
>>>> I would like to suggest that publishers of new linked data spaces that
>>>> plug into the growing LOD include the following:
>>>>
>>>> 1. cross-link information
>>>>
>>> I would also suggest we find a better measure for interlinkage than a
>>> raw number of triples linking one dataset to another.
>>> For example, http://dbtune.org/musicbrainz/ creates its own identifier
>>> for languages (http://dbtune.org/musicbrainz/directory/language),
>>> which are owl:sameAs'ed to the corresponding languages in Lingvoj when
>>> applicable, whereas linkedmdb directly links to the Lingvoj
>>> identifiers. In the latter case, the raw number of interlinks will be
>>> higher, but could be reduced a lot by creating identifiers for
>>> language and using sameAs.
>>>
>>> The same applies for geographic locations, for example. Some datasets
>>> use foaf:based_near to link to Geonames, some others create their own
>>> identifiers, and then link to the corresponding Geonames locations
>>> through owl:sameAs. For the same dataset, these two methodologies will
>>> lead to completely different numbers.
>>>
>>> To boost the statistics of a dataset, we could simply link each person
>>> or group in them to http://dbpedia.org/class/yago/Entity100001740
>>> through rdf:type :-D
>>>
>> Amen!
>>
>> And it also means we start to expose the fact that LOD is not an "instance
>> level only" linked data space (a sad misconception).
>>
>>> So I think we should agree on what we count as "interlinks" before
>>> publishing such statistics, so that we can actually use these values?
>>>
>> We should basically express linkages across instance and schema/data
>> dictionary vectors. This also helps those looking to build LOD applications.
>>
>> Of course there is more to come re. the injection of "data dictionary /
>> schema" linkage aspects of LOD, but no harm in getting our thoughts in order
>> re. "best practices" for the growing cloud :-)
>>
>>> My recommendation would be to always go for the lowest value - the one
>>> you'd obtain by creating your own identifiers and using owl:sameAs
>>> (which would be equivalent to the number of distinct external URIs
>>> mentioned in your dataset).
>>>
>>> What do you think?
>>>
>> Good idea, so share your page as a nice example :-)
>>
>
> I just gave it a shot on Jamendo, counting the results of a SELECT
> DISTINCT query, and this is indeed a bit depressing.
> http://dbtune.org/jamendo/
> For example, the Geonames interlinking drops from 3244 to 289 :-)
>

Smarts vs Size, which do you choose? I find this elating :-)

Kingsley

> Some similar statistics from Musicbrainz at
> http://dbtune.org/musicbrainz/ , which I'll publish when I get some
> time to figure out how to tweak d2r templates :-)
>
> Distinct DBpedia albums - 22426
> Distinct DBpedia artists - 39877
> Distinct MySpace artists (on http://dbtune.org/myspace/) - 14668
> Distinct DBpedia countries - 245
> Distinct Lingvoj languages - 185
>
> Cheers!
> y
>
>> Kingsley
>>
>>> Cheers!
>>> y
>>>
>>>> 2. cross-link visual derived from the LOD cloud diagram.
>>>>
>>>> The Linked Movies Database has nice examples of both [1].
>>>>
>>>> Links:
>>>>
>>>> 1. http://www.linkedmdb.org:8080/Main/Interlinking
>>>>
>>>> --
>>>>
>>>> Regards,
>>>>
>>>> Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> President & CEO OpenLink Software Web: http://www.openlinksw.com
>>>>
>>
>> --
>>
>> Regards,
>>
>> Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
>> President & CEO OpenLink Software Web: http://www.openlinksw.com
>>
>

--

Regards,

Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO OpenLink Software Web: http://www.openlinksw.com
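
To make the two counting conventions discussed in this thread concrete, here is a minimal sketch in Python. It is not the query Yves ran on Jamendo. The triples, the `example.org` dataset host, and the helper names `host` and `interlink_counts` are all hypothetical and only illustrate the difference between the raw number of linking triples and the number of distinct external URIs (what a SELECT DISTINCT would count).

```python
from urllib.parse import urlparse

# Hypothetical sample data: three local artists linked to Geonames via
# foaf:based_near, two of them pointing at the same Geonames URI.
BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"
TRIPLES = [
    ("http://example.org/artist/1", BASED_NEAR, "http://sws.geonames.org/2988507/"),
    ("http://example.org/artist/2", BASED_NEAR, "http://sws.geonames.org/2988507/"),
    ("http://example.org/artist/3", BASED_NEAR, "http://sws.geonames.org/2643743/"),
]


def host(uri):
    """Return the host part of a URI (empty string for non-URI values)."""
    return urlparse(uri).netloc


def interlink_counts(triples, local_host, target_host):
    """Count links from local_host subjects to target_host objects.

    Returns (raw, distinct): raw is the number of linking triples,
    distinct is the number of distinct external URIs they mention.
    """
    raw = 0
    distinct_targets = set()
    for subject, _predicate, obj in triples:
        if host(subject) == local_host and host(obj) == target_host:
            raw += 1
            distinct_targets.add(obj)
    return raw, len(distinct_targets)


raw, distinct = interlink_counts(TRIPLES, "example.org", "sws.geonames.org")
print(f"raw interlinks: {raw}, distinct external URIs: {distinct}")
```

On this made-up data the raw count is 3 and the distinct count is 2. The distinct figure is the "lowest value" convention recommended above, since it is what you would get by minting one local identifier per external resource and linking it with owl:sameAs.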
Received on Saturday, 2 August 2008 17:22:48 UTC