Re: Wondering about an example of closed world validation

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Thu, 31 Jul 2014 22:01:03 +0300
Message-ID: <CA+u4+a2dfSPG5XQc2RaL7uZ5rXopB8unx67dJ30ZUunX8gsKFg@mail.gmail.com>
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: Bernard Vatant <bernard.vatant@mondeca.com>, "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
On Thu, Jul 31, 2014 at 7:23 PM, Peter F. Patel-Schneider <
pfpschneider@gmail.com> wrote:

> I would say instead that messy data is better validated in a setting where
> consequences have been made explicit and thus that DBpedia is an argument
> in the other direction.
>
> Consider, for example, Lambeau Field in DBpedia.  In reality Lambeau Field
> is a sports facility.  It is probably the most famous NFL football stadium
> and is where the Green Bay Packers play.  However, the infobox for the Green
> Bay Packers was set up to look a bit fancier than it would normally be, and
> one of the effects is that Lambeau Field is extracted as the city of the
> Green Bay Packers.
>
> So what to do?  If we do not include ontological consequences, we have
>
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://dbpedia.org/ontology/location>
>     <http://dbpedia.org/resource/Green_Bay,_Wisconsin> .
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://dbpedia.org/ontology/tenant>
>     <http://dbpedia.org/resource/Green_Bay_Packers> .
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://dbpedia.org/ontology/Stadium> .
> <http://dbpedia.org/resource/Green_Bay_Packers>
>   <http://dbpedia.org/ontology/city>
>     <http://dbpedia.org/resource/Lambeau_Field> .
>
> Everything may look OK, and, if we don't have any validation criteria for
> the city of a sports team (which I consider a very likely situation) we
> won't get any indication that there is something wrong.
>
> Consider, however, if we do include ontological consequences.  Then we have
> as well
>
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://dbpedia.org/ontology/SportFacility> .
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://dbpedia.org/ontology/Place> .
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://dbpedia.org/ontology/City> .
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://dbpedia.org/ontology/Settlement> .
> <http://dbpedia.org/resource/Lambeau_Field>
>   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://dbpedia.org/ontology/PopulatedPlace> .
>
> If we have validation criteria for cities then it is very likely that
> Lambeau Field fails one or more of them.  An obvious one to fail would be
> that SportFacility and PopulatedPlace are disjoint.
>

I am in no way against reasoning; I am just against it being the norm. I am
sure we can find many examples where each approach fits better.
But things can also work very well without reasoning. For instance,
dbo:city has dbo:City as range and dbr:Lambeau_Field is a dbo:Settlement,
so just checking the range of dbo:city would be sufficient.
The exact same check would be sufficient for Harry_Froboess. With reasoning,
the error would either stay hidden or have to be re-revealed by other
constraints.
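
To make the Harry_Froboess point concrete, here is a minimal sketch in plain
Python (no RDF library). It applies only the RDFS range rule (rdfs3) and runs
the same range check before and after. The rdfs:range of dbo:spouse and the
types given to dbr:Switzerland and dbr:Berlin are assumptions made for
illustration, and the function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

# Assumed data: the dbo:spouse range and the two object types are
# illustrative and not copied from a DBpedia dump.
graph = {
    ("dbo:spouse", RDFS_RANGE, "dbo:Person"),
    ("dbr:Harry_Froboess", "dbo:spouse", "dbr:Switzerland"),
    ("dbr:Harry_Froboess", "dbo:spouse", "dbr:Berlin"),
    ("dbr:Switzerland", RDF_TYPE, "dbo:Country"),
    ("dbr:Berlin", RDF_TYPE, "dbo:City"),
}


def apply_range_rule(triples):
    """Return the triples plus the types entailed by the RDFS range rule."""
    ranges = {s: o for s, p, o in triples if p == RDFS_RANGE}
    entailed = {(o, RDF_TYPE, ranges[p]) for s, p, o in triples if p in ranges}
    return set(triples) | entailed


def range_mismatches(triples):
    """Return statements whose object is not typed with the declared range."""
    ranges = {s: o for s, p, o in triples if p == RDFS_RANGE}
    return sorted(
        (s, p, o)
        for s, p, o in triples
        if p in ranges and (o, RDF_TYPE, ranges[p]) not in triples
    )


# On the raw graph both spouse statements are flagged.
raw_mismatches = range_mismatches(graph)

# After inference Switzerland and Berlin are Persons, so nothing is flagged.
closed_mismatches = range_mismatches(apply_range_rule(graph))
```

On the closed graph the error can only resurface through some other
constraint, for example a disjointness check between dbo:Person and dbo:Place,
which this sketch does not implement.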

IMHO there is no best way to deal with the domain / range issue. My
approach is to report an error when the type is different than expected and
a warning when the type is missing.
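
A rough sketch of that error / warning split, in plain Python and without any
inference. The function name check_range is hypothetical, dbr:Lambeau_Field is
typed only as dbo:Stadium as in the triples quoted above, and dbr:Some_Team
and dbr:Untyped_Place are made-up resources added to show the warning case.

```python
RDF_TYPE = "rdf:type"

# dbr:Some_Team and dbr:Untyped_Place are hypothetical; the rest follows
# the Lambeau Field triples quoted earlier in this thread.
data = [
    ("dbr:Green_Bay_Packers", "dbo:city", "dbr:Lambeau_Field"),
    ("dbr:Lambeau_Field", RDF_TYPE, "dbo:Stadium"),
    ("dbr:Some_Team", "dbo:city", "dbr:Untyped_Place"),
]


def check_range(triples, prop, expected):
    """Return sorted (severity, object) pairs for the objects of prop.

    error:   the object has explicit types, but not the expected one
    warning: the object has no explicit type at all
    """
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    findings = []
    for s, p, o in triples:
        if p != prop:
            continue
        declared = types.get(o, set())
        if not declared:
            findings.append(("warning", o))
        elif expected not in declared:
            findings.append(("error", o))
    return sorted(findings)


findings = check_range(data, "dbo:city", "dbo:City")
```

Here findings holds one error for dbr:Lambeau_Field and one warning for
dbr:Untyped_Place. Only explicit rdf:type statements are consulted, so
subclass relations are deliberately ignored.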

Dimitris


>
>
> peter
>
>
> On 07/30/2014 11:08 PM, Dimitris Kontokostas wrote:
>
>>
>>
>>
>> On Wed, Jul 30, 2014 at 8:18 PM, Peter F. Patel-Schneider
>> <pfpschneider@gmail.com> wrote:
>>
>>     Indeed.  (Well, except that just using FOAF vocabulary might not be
>>     enough to bring in FOAF axioms.  Explicit importing - oops, that's not
>>     in RDF yet - is probably a better trigger here.)
>>
>>     I think that RDF validation should be done against the closure of an
>>     RDF graph.  I proposed this earlier in
>>     <http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0189.html>
>>     as an option, but I strongly believe that validating against the RDFS
>>     closure should be the norm.
>>
>>
>> Maybe I am biased towards my experience with DBpedia and messy data but I
>> would vote against this being the norm.
>> Take http://dbpedia.org/resource/Harry_Froboess for example and look at
>> the dbo:spouse property (dbr:Switzerland, dbr:Berlin).
>> This is of course an error in DBpedia, but applying RDFS inference would
>> hide it and make Switzerland & Berlin Persons.
>>
>> Dimitris
>>
>>
>>     peter
>>
>>
>>
>>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
Received on Thursday, 31 July 2014 19:02:02 UTC