Re: Dataset vocabularies vs. interchange vocabularies (was: Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase)

On Wednesday 26 November 2008, John Graybeal wrote:
> Do you think the argument is mostly settled, or would you agree that
>   duplicating a massive set of URIs for 'local technical
> simplification' is a bad practice? (In which case, is the question
> just a matter of scale?)

I'm a bit late to the discussion, but I feel that this is a question
that should be dealt with on a case-by-case basis. If you state that
two things are owl:sameAs (or make some slightly weaker statement), it
is important that the two things are in fact the same. When publishing
larger data sets, it is hard to say with sufficient certainty that this
is the case. Thus, I feel that the best practice there is to create new
URIs for each thing. Stating that things are the same should be left to
a separate process.

One should be aware of the extra complexity that this causes: to reach
the data through somebody else's URI, you need an extra triple pattern
in your SPARQL query to follow the link, which can also reduce query
engine performance.
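
To make the cost concrete, here is a minimal sketch in plain Python
(not a real SPARQL engine) of a toy triple-pattern matcher. All the
example.org and example.com URIs, the population property and its value
are made up for illustration; only the owl:sameAs URI is real. The data
uses locally minted URIs, and the sameAs links are kept in a separate
set, as if maintained by a separate process.

```python
# Toy triple-pattern matching; a sketch, not a SPARQL implementation.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical URIs and values, invented for this example.
LOCAL = "http://example.org/local/Oslo"
EXTERNAL = "http://example.com/other/Oslo"
POPULATION = "http://example.org/vocab/population"

# The publisher's own data, using locally minted URIs.
data = {(LOCAL, POPULATION, "1000")}

# Identity links, produced by a separate process.
links = {(LOCAL, SAME_AS, EXTERNAL)}


def match(triples, pattern, binding):
    """Yield bindings extended by one triple pattern.

    Terms starting with '?' are variables; everything else must match
    the triple exactly.
    """
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def query(triples, patterns):
    """Join the patterns one after another, like a basic graph pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b for old in bindings for b in match(triples, pattern, old)]
    return bindings


graph = data | links

# Asking with the external URI directly finds nothing ...
direct = query(graph, [(EXTERNAL, POPULATION, "?pop")])

# ... so the query needs one more pattern (one more join) to follow the link.
via_link = query(graph, [
    ("?local", SAME_AS, EXTERNAL),
    ("?local", POPULATION, "?pop"),
])
```

Here direct comes back empty, while via_link finds the value at the
price of the additional pattern. Had the publisher reused the external
URI, the single-pattern query would have been enough.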

If you are building applications based on linked data rather than
publishing large data sets, I feel it is better to reuse URIs than to
create your own, if you plan to publish your URIs at some point.

In some cases, you may not use a lot of concepts, and in every case a
human is involved, so you know what is meant by the concept
identified by a given URI. In other cases, you use somebody else's URI
and say that "whatever they mean by this concept, I mean too". This
covers most of the cases that I think most users of linked data will
meet.


Kjetil
-- 
Kjetil Kjernsmo
Programmer / Astrophysicist / Ski-orienteer / Orienteer / Mountaineer
kjetil@kjernsmo.net
Homepage: http://www.kjetil.kjernsmo.net/     OpenPGP KeyID: 6A6A0BBC

Received on Friday, 28 November 2008 12:22:29 UTC