- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Thu, 31 Jul 2014 09:23:56 -0700
- To: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- CC: Bernard Vatant <bernard.vatant@mondeca.com>, "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
I would say instead that messy data is better validated in a setting where consequences have been made explicit, and thus that DBpedia is an argument in the other direction.

Consider, for example, Lambeau Field in DBpedia. In reality Lambeau Field is a sports facility. It is probably the most famous NFL football stadium and is where the Green Bay Packers play. However, the infobox for the Green Bay Packers was set up to look a bit fancier than it would normally be, and one of the effects is that Lambeau Field is extracted as the city of the Green Bay Packers.

So what to do? If we do not include ontological consequences, we have

<http://dbpedia.org/resource/Lambeau_Field> <http://dbpedia.org/ontology/location> <http://dbpedia.org/resource/Green_Bay,_Wisconsin> .
<http://dbpedia.org/resource/Lambeau_Field> <http://dbpedia.org/ontology/tenant> <http://dbpedia.org/resource/Green_Bay_Packers> .
<http://dbpedia.org/resource/Lambeau_Field> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Stadium> .
<http://dbpedia.org/resource/Green_Bay_Packers> <http://dbpedia.org/ontology/city> <http://dbpedia.org/resource/Lambeau_Field> .

Everything may look OK, and if we don't have any validation criteria for the city of a sports team (which I consider a very likely situation), we won't get any indication that there is something wrong.

Consider, however, what happens if we do include ontological consequences. Then we also have

<http://dbpedia.org/resource/Lambeau_Field> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/SportFacility> .
<http://dbpedia.org/resource/Lambeau_Field> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Place> .
<http://dbpedia.org/resource/Lambeau_Field> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/City> .
<http://dbpedia.org/resource/Lambeau_Field> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Settlement> .
<http://dbpedia.org/resource/Lambeau_Field> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/PopulatedPlace> .

If we have validation criteria for cities, then it is very likely that Lambeau Field fails one or more of them. An obvious one to fail would be that SportFacility and PopulatedPlace are disjoint.

peter

On 07/30/2014 11:08 PM, Dimitris Kontokostas wrote:
> On Wed, Jul 30, 2014 at 8:18 PM, Peter F. Patel-Schneider
> <pfpschneider@gmail.com> wrote:
>
>     Indeed. (Well, except that just using FOAF vocabulary might not be enough
>     to bring in FOAF axioms. Explicit importing - oops, that's not in RDF yet
>     - is probably a better trigger here.)
>
>     I think that RDF validation should be done against the closure of an RDF
>     graph. I proposed this earlier in
>     http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0189.html
>     as an option, but I strongly believe that validating against the RDFS
>     closure should be the norm.
>
> Maybe I am biased towards my experience with DBpedia and messy data, but I
> would vote against this being the norm.
> Take http://dbpedia.org/resource/Harry_Froboess for example and look at the
> dbo:spouse property (dbr:Switzerland, dbr:Berlin).
> This is of course an error in DBpedia, but applying RDFS inference would hide
> it and make Switzerland & Berlin Persons.
>
> Dimitris
>
>     peter
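
As a rough illustration of the Lambeau Field argument, the sketch below applies range and subclass inference to the four extracted triples and then runs a disjointness check before and after. The axiom tables (SUBCLASS_OF, RANGE, DISJOINT) and the function names are assumptions written by hand for this example. They are simplified to match the consequences listed in the message and are not taken from the actual DBpedia ontology. The dbo: and dbr: prefixes abbreviate the DBpedia ontology and resource namespaces.

```python
# Hand-written, simplified axioms: assumptions for illustration only,
# chosen to reproduce the consequences listed in the message.
SUBCLASS_OF = {
    "dbo:Stadium": "dbo:SportFacility",
    "dbo:SportFacility": "dbo:Place",
    "dbo:City": "dbo:Settlement",
    "dbo:Settlement": "dbo:PopulatedPlace",
    "dbo:PopulatedPlace": "dbo:Place",
}
RANGE = {"dbo:city": "dbo:City"}
DISJOINT = [("dbo:SportFacility", "dbo:PopulatedPlace")]

# The four triples extracted without ontological consequences.
TRIPLES = {
    ("dbr:Lambeau_Field", "dbo:location", "dbr:Green_Bay,_Wisconsin"),
    ("dbr:Lambeau_Field", "dbo:tenant", "dbr:Green_Bay_Packers"),
    ("dbr:Lambeau_Field", "rdf:type", "dbo:Stadium"),
    ("dbr:Green_Bay_Packers", "dbo:city", "dbr:Lambeau_Field"),
}


def closure(triples):
    """Add rdf:type triples implied by RANGE and SUBCLASS_OF until nothing changes."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p in RANGE:
                # Range inference: the object of p is an instance of p's range.
                new.add((o, "rdf:type", RANGE[p]))
            if p == "rdf:type" and o in SUBCLASS_OF:
                # Subclass inference: an instance of a class is an instance of its superclass.
                new.add((s, "rdf:type", SUBCLASS_OF[o]))
        if new <= inferred:
            return inferred
        inferred |= new


def types_of(triples, node):
    """Return the set of classes asserted for node in the given triples."""
    return {o for s, p, o in triples if s == node and p == "rdf:type"}


def disjointness_violations(triples):
    """Return (node, class_a, class_b) for every node typed with two disjoint classes."""
    violations = []
    nodes = {s for s, p, _ in triples if p == "rdf:type"}
    for node in sorted(nodes):
        types = types_of(triples, node)
        for a, b in DISJOINT:
            if a in types and b in types:
                violations.append((node, a, b))
    return violations


if __name__ == "__main__":
    print("raw graph:    ", disjointness_violations(TRIPLES))
    print("after closure:", disjointness_violations(closure(TRIPLES)))
```

Under these assumed axioms the raw graph passes the check, while the closed graph reports Lambeau Field as both a SportFacility and a PopulatedPlace, which is the error the message expects validation to surface.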
Received on Thursday, 31 July 2014 16:24:28 UTC