
Re: Wondering about an example of closed world validation

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Thu, 31 Jul 2014 09:23:56 -0700
Message-ID: <53DA6D9C.1060703@gmail.com>
To: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
CC: Bernard Vatant <bernard.vatant@mondeca.com>, "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
I would say instead that messy data is better validated in a setting where
consequences have been made explicit and thus that DBpedia is an argument in
the other direction.

Consider, for example, Lambeau Field in DBpedia.  In reality Lambeau Field
is a sports facility.  It is probably the most famous NFL football stadium and
is where the Green Bay Packers play.  However, the infobox for the Green
Bay Packers was set up to look a bit fancier than it would normally be, and
one of the effects is that Lambeau Field is extracted as the city of the
Green Bay Packers.

So what to do?  If we do not include ontological consequences, we have

<http://dbpedia.org/resource/Lambeau_Field>
   <http://dbpedia.org/ontology/location>
     <http://dbpedia.org/resource/Green_Bay,_Wisconsin> .
<http://dbpedia.org/resource/Lambeau_Field>
   <http://dbpedia.org/ontology/tenant>
     <http://dbpedia.org/resource/Green_Bay_Packers> .
<http://dbpedia.org/resource/Lambeau_Field>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/Stadium> .
<http://dbpedia.org/resource/Green_Bay_Packers>
   <http://dbpedia.org/ontology/city>
     <http://dbpedia.org/resource/Lambeau_Field> .

Everything may look OK, and, if we don't have any validation criteria for
the city of a sports team (which I consider a very likely situation) we
won't get any indication that there is something wrong.

Consider, however, if we do include ontological consequences.  Then we have
as well

<http://dbpedia.org/resource/Lambeau_Field>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/SportFacility> .
<http://dbpedia.org/resource/Lambeau_Field>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/Place> .
<http://dbpedia.org/resource/Lambeau_Field>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/City> .
<http://dbpedia.org/resource/Lambeau_Field>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/Settlement> .
<http://dbpedia.org/resource/Lambeau_Field>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
     <http://dbpedia.org/ontology/PopulatedPlace> .
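
As a rough illustration of how these extra types arise, here is a minimal
Python sketch of a forward-chaining closure that uses only the RDFS range and
subclass rules.  The axioms in `ontology` are an assumed, simplified stand-in
for the DBpedia ontology (in particular the range of dbo:city and the subclass
chains), the prefixed names are shorthand for the full IRIs above, and the
function names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
RANGE = "rdfs:range"

# The four asserted triples from the example above.
data = {
    ("dbr:Lambeau_Field", "dbo:location", "dbr:Green_Bay,_Wisconsin"),
    ("dbr:Lambeau_Field", "dbo:tenant", "dbr:Green_Bay_Packers"),
    ("dbr:Lambeau_Field", RDF_TYPE, "dbo:Stadium"),
    ("dbr:Green_Bay_Packers", "dbo:city", "dbr:Lambeau_Field"),
}

# ASSUMPTION: simplified axioms, not copied from the real DBpedia ontology.
ontology = {
    ("dbo:city", RANGE, "dbo:City"),
    ("dbo:Stadium", SUBCLASS, "dbo:SportFacility"),
    ("dbo:SportFacility", SUBCLASS, "dbo:Place"),
    ("dbo:City", SUBCLASS, "dbo:Settlement"),
    ("dbo:Settlement", SUBCLASS, "dbo:PopulatedPlace"),
    ("dbo:PopulatedPlace", SUBCLASS, "dbo:Place"),
}


def rdfs_closure(triples):
    """Apply the rdfs:range and rdfs:subClassOf typing rules to a fixpoint."""
    closed = set(triples)
    while True:
        ranges = {(s, o) for s, p, o in closed if p == RANGE}
        subclasses = {(s, o) for s, p, o in closed if p == SUBCLASS}
        new = set()
        for s, p, o in closed:
            # Range rule: the object of a property is typed with its range.
            for prop, cls in ranges:
                if p == prop:
                    new.add((o, RDF_TYPE, cls))
            # Subclass rule: an instance of a class is an instance of its superclass.
            if p == RDF_TYPE:
                for sub, sup in subclasses:
                    if o == sub:
                        new.add((s, RDF_TYPE, sup))
        if new <= closed:
            return closed
        closed |= new


def types_of(triples, node):
    """Return every class the node is typed with in the given triples."""
    return {o for s, p, o in triples if s == node and p == RDF_TYPE}


closed = rdfs_closure(data | ontology)
print(sorted(types_of(data, "dbr:Lambeau_Field")))
print(sorted(types_of(closed, "dbr:Lambeau_Field")))
```

Under these assumed axioms the asserted data gives Lambeau Field only the
Stadium type, while the closure also gives it the five types listed above.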

If we have validation criteria for cities, then it is very likely that
Lambeau Field fails one or more of them.  An obvious one to fail would be
that SportFacility and PopulatedPlace are disjoint.
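
A disjointness check of this kind could be sketched in Python as follows.
This is only an illustration: the two type sets are written out by hand from
the triples in this message, the single disjointness constraint is the one
named above, and the function name is hypothetical.

```python
# Types of Lambeau Field in the asserted data only.
asserted_types = {"dbo:Stadium"}

# Types of Lambeau Field once ontological consequences are included.
closure_types = {
    "dbo:Stadium",
    "dbo:SportFacility",
    "dbo:Place",
    "dbo:City",
    "dbo:Settlement",
    "dbo:PopulatedPlace",
}

# The validation criterion: these two classes may not share an instance.
DISJOINT = [("dbo:SportFacility", "dbo:PopulatedPlace")]


def disjointness_violations(types, disjoint_pairs):
    """Return each disjoint pair for which the node belongs to both classes."""
    return [(a, b) for a, b in disjoint_pairs if a in types and b in types]


print(disjointness_violations(asserted_types, DISJOINT))
print(disjointness_violations(closure_types, DISJOINT))
```

The check reports nothing on the asserted types alone, and reports the
SportFacility/PopulatedPlace pair once the inferred types are included.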


peter

On 07/30/2014 11:08 PM, Dimitris Kontokostas wrote:
>
>
>
> On Wed, Jul 30, 2014 at 8:18 PM, Peter F. Patel-Schneider
> <pfpschneider@gmail.com> wrote:
>
>     Indeed.  (Well, except that just using FOAF vocabulary might not be enough
>     to bring in FOAF axioms.  Explicit importing - oops, that's not in RDF yet
>     - is probably a better trigger here.)
>
>     I think that RDF validation should be done against the closure of an RDF
>     graph.  I proposed this earlier in
>     http://lists.w3.org/Archives/Public/public-rdf-shapes/2014Jul/0189.html
>     as an option, but I strongly believe that validating against the RDFS
>     closure should be the norm.
>
>
> Maybe I am biased towards my experience with DBpedia and messy data but I
> would vote against this being the norm.
> Take http://dbpedia.org/resource/Harry_Froboess for example and look at the
> dbo:spouse property (dbr:Switzerland, dbr:Berlin).
> This is of course an error in DBpedia, but applying RDFS inference would hide
> it and make Switzerland & Berlin Persons.
>
> Dimitris
>
>
>     peter
>
>
>
Received on Thursday, 31 July 2014 16:24:28 UTC
