- From: Yves Raimond <yves.raimond@gmail.com>
- Date: Sat, 2 Aug 2008 18:18:34 +0100
- To: "Kingsley Idehen" <kidehen@openlinksw.com>
- Cc: public-lod@w3.org
On Sat, Aug 2, 2008 at 5:17 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
> Yves Raimond wrote:
>>
>> Hello!
>>
>>>
>>> I would like to suggest that publishers of new linked data spaces that
>>> plug into the growing LOD include the following:
>>>
>>> 1. cross-link information
>>>
>>
>> I would also suggest we find a better measure for interlinkage than a
>> raw number of triples linking one dataset to another.
>> For example, http://dbtune.org/musicbrainz/ creates its own identifier
>> for languages (http://dbtune.org/musicbrainz/directory/language),
>> which are owl:sameAs'ed to the corresponding languages in Lingvoj when
>> applicable, whereas linkedmdb directly links to the Lingvoj
>> identifiers. In the latter case, the raw number of interlinks will be
>> higher, but could be reduced a lot by creating identifiers for
>> language and use sameAs.
>>
>> The same applies for geographic locations, for example. Some datasets
>> use foaf:based_near to link to Geonames, some others create their own
>> identifiers, and then link to the corresponding Geonames locations
>> through owl:sameAs. For the same dataset, this two methodologies will
>> lead to completely different numbers.
>>
>> To boost the statistics of a dataset, we could simply link each person
>> or group in them to http://dbpedia.org/class/yago/Entity100001740
>> through rdf:type :-D
>>
>
> Amen!
>
> And it also means we start to expose the fact that LOD is not an "instance
> level only" linked data space (a sad misconception).
>
>> So I think we should agree on what we count as "interlinks" before
>> publishing such statistics, so that we can actually use these values?
>>
>
> We should basically express linkages across instance and schema/data
> dictionary vectors. This also helps those looking to build LOD applications.
>
> Of course there is more to come re. the injection of "data dictionary /
> schema" linkage aspects of LOD, but no harm in getting our thoughts in order
> re. "best practices" for the growing cloud :-)
>
>>
>> My recommendation would be to always go for the lowest value - the one
>> you'd obtain by creating your own identifiers and using owl:sameAs
>> (which would be equivalent to the number of distinct external URIs
>> mentioned in your dataset).
>>
>> What do you think?
>>
>
> Good Idea, so share you page as a nice example :-)
>

I just gave it a shot on Jamendo, counting the results of a SELECT
DISTINCT query, and this is indeed a bit depressing.

http://dbtune.org/jamendo/

For example, the Geonames interlinking drops from 3244 to 289 :-)

Some similar statistics from Musicbrainz at
http://dbtune.org/musicbrainz/ , which I'll publish when I get some
time to figure out how to tweak d2r templates :-)

Distinct DBpedia albums - 22426
Distinct DBpedia artists - 39877
Distinct MySpace artists (on http://dbtune.org/myspace/) - 14668
Distinct DBpedia countries - 245
Distinct Lingvoj languages - 185

Cheers!
y

>
> Kingsley
>>
>> Cheers!
>> y
>>
>>>
>>> 2. cross-link visual derived from the LOD cloud diagram.
>>>
>>> The Linked Movies Database has nice examples of both [1].
>>>
>>> Links:
>>>
>>> 1. http://www.linkedmdb.org:8080/Main/Interlinking
>>>
>>> --
>>>
>>> Regards,
>>>
>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>>>
Received on Saturday, 2 August 2008 17:19:11 UTC
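To make the difference between the two measures discussed in this message concrete, here is a minimal sketch in plain Python. It is not the query used for the Jamendo or Musicbrainz figures above: the triples, the LOCAL_PREFIX namespace and the function names are all made up for illustration, and triples are modelled as simple (subject, predicate, object) string tuples.

```python
# Hypothetical namespace of the dataset being measured (an assumption).
LOCAL_PREFIX = "http://example.org/mydataset/"


def is_external(term, local_prefix):
    """True if the term looks like an HTTP URI outside the local namespace."""
    return term.startswith(("http://", "https://")) and not term.startswith(local_prefix)


def raw_interlinks(triples, local_prefix):
    """Raw measure: every triple whose object is an external URI counts once."""
    return sum(1 for _, _, o in triples if is_external(o, local_prefix))


def distinct_interlinks(triples, local_prefix):
    """Lowest-value measure: number of distinct external URIs used as objects."""
    return len({o for _, _, o in triples if is_external(o, local_prefix)})


# Illustrative data only: three artists point at the same external
# location, a fourth points at a different one.
triples = [
    (LOCAL_PREFIX + "artist/1", "foaf:based_near", "http://sws.geonames.org/2988507/"),
    (LOCAL_PREFIX + "artist/2", "foaf:based_near", "http://sws.geonames.org/2988507/"),
    (LOCAL_PREFIX + "artist/3", "foaf:based_near", "http://sws.geonames.org/2988507/"),
    (LOCAL_PREFIX + "artist/4", "foaf:based_near", "http://sws.geonames.org/3017382/"),
    (LOCAL_PREFIX + "artist/1", "foaf:made", LOCAL_PREFIX + "record/1"),  # local link
    (LOCAL_PREFIX + "artist/1", "foaf:name", "Some Artist"),              # literal
]

raw = raw_interlinks(triples, LOCAL_PREFIX)
distinct = distinct_interlinks(triples, LOCAL_PREFIX)
print(raw, distinct)  # 4 2
```

The distinct count is the figure a dataset would report if it minted its own identifiers and linked each of them once through owl:sameAs, which is why it is never larger than the raw count.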