- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Thu, 31 Jul 2014 22:01:03 +0300
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
- Cc: Bernard Vatant <bernard.vatant@mondeca.com>, "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
- Message-ID: <CA+u4+a2dfSPG5XQc2RaL7uZ5rXopB8unx67dJ30ZUunX8gsKFg@mail.gmail.com>
On Thu, Jul 31, 2014 at 7:23 PM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:

> I would say instead that messy data is better validated in a setting where
> consequences have been made explicit and thus that DBpedia is an argument
> in the other direction.
>
> Consider, for example, Lambeau Field in DBpedia. In reality Lambeau Field
> is a sports facility. It is probably the most famous NFL football stadium and
> is where the Green Bay Packers play. However, the infobox for the Green
> Bay Packers was set up to look a bit fancier than it would normally be, and
> one of the effects is that Lambeau Field is extracted as the city of the
> Green Bay Packers.
>
> So what to do? If we do not include ontological consequences, we have
>
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://dbpedia.org/ontology/location>
> <http://dbpedia.org/resource/Green_Bay,_Wisconsin> .
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://dbpedia.org/ontology/tenant>
> <http://dbpedia.org/resource/Green_Bay_Packers> .
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/Stadium> .
> <http://dbpedia.org/resource/Green_Bay_Packers>
> <http://dbpedia.org/ontology/city>
> <http://dbpedia.org/resource/Lambeau_Field> .
>
> Everything may look OK, and, if we don't have any validation criteria for
> the city of a sports team (which I consider a very likely situation) we
> won't get any indication that there is something wrong.
>
> Consider, however, if we do include ontological consequences. Then we have
> as well
>
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/SportFacility> .
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/Place> .
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/City> .
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/Settlement> .
> <http://dbpedia.org/resource/Lambeau_Field>
> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> <http://dbpedia.org/ontology/PopulatedPlace> .
>
> If we have validation criteria for cities then it is very likely that
> Lambeau Field fails one or more of them. An obvious one to fail would be
> that SportFacility and PopulatedPlace are disjoint.
>

I am in no way against reasoning; I am just against it being the norm.
I am sure we can find many examples where each approach fits better,
but things can also work very well without reasoning.

For instance, dbo:city has dbo:City as its range and dbr:Lambeau_Field is a
dbo:Settlement, so just checking the range of dbo:city would be sufficient.
The exact same check would be sufficient for Harry_Froboess, whereas
reasoning would either hide the error or leave it to other constraints to
re-reveal it.

IMHO there is no best way to deal with the domain/range issue.
My approach is to report an error when the type is different from the
expected one and a warning when it is missing.

Dimitris

>
> peter
>
>
> On 07/30/2014 11:08 PM, Dimitris Kontokostas wrote:
>
>> On Wed, Jul 30, 2014 at 8:18 PM, Peter F. Patel-Schneider
>> <pfpschneider@gmail.com <mailto:pfpschneider@gmail.com>> wrote:
>>
>>     Indeed. (Well, except that just using FOAF vocabulary might not be
>>     enough to bring in FOAF axioms. Explicit importing - oops, that's
>>     not in RDF yet - is probably a better trigger here.)
>>
>>     I think that RDF validation should be done against the closure of an
>>     RDF graph. I proposed this earlier in
>>     http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0189.html
>>     as an option, but I strongly believe that validating against the RDFS
>>     closure should be the norm.
>>
>>
>> Maybe I am biased towards my experience with DBpedia and messy data but I
>> would vote against this being the norm.
>> take http://dbpedia.org/resource/Harry_Froboess for example and look at
>> the dbo:spouse property (dbr:Switzerland, dbr:Berlin)
>> This is of course an error in DBpedia but applying rdfs inference would
>> hide it and make Switzerland & Berlin Persons.
>>
>> Dimitris
>>
>>
>>     peter
>>

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
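
A minimal sketch of the range check described in the reply above (error when the object's declared type differs from the expected range, warning when no type is declared), with no inference applied. The function name `check_ranges`, the prefixed-name strings, and the in-memory triple set are illustrative assumptions, not part of any tool mentioned in the thread. The data is the four triples quoted from Peter's message, and the only range declaration is the dbo:city / dbo:City one named in the reply.

```python
# Hypothetical, minimal range check over an in-memory set of triples.
# Prefixed names are plain strings here; no RDF library is assumed.
RDF_TYPE = "rdf:type"

# The triples quoted in the thread (without ontological consequences).
TRIPLES = {
    ("dbr:Lambeau_Field", "dbo:location", "dbr:Green_Bay,_Wisconsin"),
    ("dbr:Lambeau_Field", "dbo:tenant", "dbr:Green_Bay_Packers"),
    ("dbr:Lambeau_Field", RDF_TYPE, "dbo:Stadium"),
    ("dbr:Green_Bay_Packers", "dbo:city", "dbr:Lambeau_Field"),
}

# Range declarations; dbo:city -> dbo:City is the one named in the message.
RANGES = {"dbo:city": "dbo:City"}


def check_ranges(triples, ranges):
    """Return (level, subject, predicate, object) findings, using explicit types only."""
    # Collect the explicitly declared types of every resource.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    findings = []
    for s, p, o in sorted(triples):
        expected = ranges.get(p)
        if expected is None:
            continue  # no range declared for this property
        declared = types.get(o, set())
        if not declared:
            # Type is missing: report a warning.
            findings.append(("warning", s, p, o))
        elif expected not in declared:
            # Type is present but different than expected: report an error.
            findings.append(("error", s, p, o))
    return findings
```

Calling `check_ranges(TRIPLES, RANGES)` on this data flags the dbo:city triple as an error, because dbr:Lambeau_Field is declared only as a dbo:Stadium. Because the sketch looks at explicit types only, an object typed with a subclass of the expected range would also be flagged; handling that is a separate design decision.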
Received on Thursday, 31 July 2014 19:02:02 UTC