
Re: AW: [Dbpedia-discussion] Fwd: Your message to Dbpedia-discussion awaits moderator approval

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 11 Aug 2009 22:23:35 -0400
Message-ID: <4A8227A7.8000809@openlinksw.com>
To: Hugh Glaser <hg@ecs.soton.ac.uk>
CC: Pat Hayes <phayes@ihmc.us>, Chris Bizer <chris@bizer.de>, Kavitha Srinivas <ksrinivs@gmail.com>, Tim Finin <finin@cs.umbc.edu>, Anja Jentzsch <anja@anjeve.de>, "public-lod@w3.org" <public-lod@w3.org>, "dbpedia-discussion@lists.sourceforge.net" <dbpedia-discussion@lists.sourceforge.net>
Hugh Glaser wrote:
> Hi Kingsley.
>
> On 12/08/2009 00:28, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>
>   
>> Hugh Glaser wrote:
>>     
>>> On 11/08/2009 15:47, "Pat Hayes" <phayes@ihmc.us> wrote:
>>>
>>>  
>>>       
>>>> On Aug 11, 2009, at 5:45 AM, Chris Bizer wrote:
>>>>
>>>>    
>>>>         
>>>>> Hi Kingsley, Pat and all,
>>>>>
>>>>>      
>>>>>           
>>> <snip/>
>>>  
>>>       
>>>>> Everything on the Web is a claim by somebody. There are no facts,
>>>>> there is
>>>>> no truth, there are only opinions.
>>>>>      
>>>>>           
>>>> Same is true of the Web and of life in general, but still there are
>>>> laws about slander, etc.; and outrageous falsehoods are rebutted or
>>>> corrected (eg look at how Wikipedia is managed); or else their source
>>>> is widely treated as nonsensical, which I hardly think DBpedia wishes
>>>> to be. And also, I think we do have some greater responsibility to
>>>> give our poor dumb inference engines a helping hand, since they have
>>>> no common sense to help them sort out the wheat from the chaff, unlike
>>>> our enlightened human selves.
>>>>
>>>>    
>>>>         
>>>>> Semantic Web applications must take this into account and therefore
>>>>> always
>>>>> assess data quality and trustworthiness before they do something
>>>>> with the
>>>>> data.
>>>>>      
>>>>>           
>>> I think that this discussion really emphasises how bad it is to put this
>>> co-ref data in the same store as the other data.
>>>  
>>>       
>> Yes, they should be in distinct Named Graphs.
>>     
> I thought you would mention Named Graphs :-)
>   
>> This is the point I was making a while back (in relation to Alan's
>> comments about the same thing).
>>     
> Yes, but this is the point I was making a while back about Named Graphs as a
> solution - when I resolve a URI (follow-my-nose) in the recommended fashion,
> I see no Named Graphs - they are only exposed in SPARQL stores.
> If I resolve http://dbpedia.org/resource/London to get
> http://dbpedia.org/data/London.rdf I see a bunch of RDF - go on, try it. No
> sight of Named Graphs.
>   
Correct, but the publisher of the Linked Data is putting HTTP URIs in
front of the content of a Quad Store. These URIs are associated with
SPARQL queries (in the case of DBpedia). With regard to the great
example from yesterday, I deliberately put out two different views to
demonstrate that you can partition data and not break the graph
traversal desired by the follow-your-nose data exploration and discovery
pattern. But note, and this is very important, the follow-your-nose
pattern doesn't eradicate the fact that cul-de-sacs and T-junctions
will also be part of the Web of Linked Data.

> Are you saying that the only way to access Linked Data is via SPARQL?
>   
>>> Finding data in dbpedia that is mistaken/wrong/debatable undermines the
>>> whole project - the contract dbpedia offers is to reflect the wikipedia
>>> content that it offers.
>>>  
>>>       
>> Er, its prime contract is a Name Corpus. In due course there will be
>> lots of meshes from Linked Data contributors in other domains, e.g. BBC,
>> Reuters, New York Times etc.
>>     
> I really don't think so.
>   
In my world view "contract" doesn't imply sole use or potential :-)
> Its prime contract is that I can resolve a URI for a NIR and get back things
> like Description, Location, etc..
>   
I've written enough about HTTP URIs and their virtues [1].

Hopefully, we will forget the horrible term NIR, really. It is just about
data items, their identifiers, and associated metadata.

> If it gives me dodgy other stuff that I can't distinguish, I will have to
> stop using it, which would be a disaster.
>   
>> The goal of DBpedia was to set the ball rolling and in that regard it
>> has overachieved (albeit from my very biased viewpoint).
>>     
> Oh yes! - but let's not let it get spoilt.
>   
I really believe you are overreacting here. Ironically, you seem to have 
missed the trivial manner in which this data set was erased without any 
effect on DBpedia URIs whatsoever. Even at the time the data was loaded, 
you wouldn't have been able to de-reference this data from DBpedia URIs 
(back to the Named Graph issue above and follow-your-nose) since the 
SPARQL that generates the metadata for DBpedia's HTTP URIs is explicitly 
scoped to Graph IRI: <http://dbpedia.org> .
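
A minimal sketch of that scoping, written in Python rather than SPARQL and assuming a toy in-memory quad store. The linkset graph IRI, the sample triples, and the `describe` helper are hypothetical; only the <http://dbpedia.org> graph IRI comes from the message above.

```python
# Toy quad store: each entry is (graph, subject, predicate, object).
# All data below is illustrative, not real DBpedia content.
DBPEDIA_GRAPH = "http://dbpedia.org"
LINKSET_GRAPH = "http://example.org/linkset"  # hypothetical graph IRI

QUADS = [
    (DBPEDIA_GRAPH, "dbpedia:London", "rdfs:label", "London"),
    (DBPEDIA_GRAPH, "dbpedia:London", "dbpprop:country", "dbpedia:United_Kingdom"),
    (LINKSET_GRAPH, "dbpedia:London", "owl:sameAs", "ex:SomeOtherLondon"),
]


def describe(subject, graph=DBPEDIA_GRAPH):
    """Return (predicate, object) pairs for `subject` from one graph only."""
    return [(p, o) for g, s, p, o in QUADS if g == graph and s == subject]


# Dereferencing scoped to the DBpedia graph never sees the linkset triples.
scoped = describe("dbpedia:London")
# The linkset is still reachable, but only when its graph is asked for.
linkset = describe("dbpedia:London", graph=LINKSET_GRAPH)
```

The point of the sketch is only that the graph name acts as a filter at description time, so data loaded into another graph stays invisible to a dereference that is scoped this way.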

Remember, this linkset was basically a set of axioms that could have
been used solely for backward-chained reasoning via SPARQL pragmas. Said
SPARQL could even be used as the basis for a different set of HTTP URIs
that point to the DBpedia ones (without explicit inverse triples in the
DBpedia graph, and the link property doesn't have to be one that's
symmetrical).
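
As a loose analogue of that idea (not the SPARQL pragma mechanism itself), the sketch below resolves links at query time from a separately held linkset. It assumes a one-directional link property and no inverse triples in the main data; every identifier and the `describe_via_links` helper are hypothetical.

```python
# Hypothetical linkset kept apart from the main data: alias -> target.
# The link is stored in one direction only; no inverse triple exists.
LINKS = {"ex:Greater_London_Alias": "dbpedia:London"}

# Illustrative main-graph data, keyed by subject.
MAIN = {"dbpedia:London": [("rdfs:label", "London")]}


def describe_via_links(uri, max_hops=5):
    """Follow links at query time until a described subject is reached."""
    seen = set()
    while uri in LINKS and uri not in seen and len(seen) < max_hops:
        seen.add(uri)
        uri = LINKS[uri]
    return MAIN.get(uri, [])


# The alias URI yields the target's description without touching MAIN.
alias_view = describe_via_links("ex:Greater_London_Alias")
direct_view = describe_via_links("dbpedia:London")
```

Because the links are consulted only when a query asks for them, removing the linkset changes what the alias URIs return and leaves the main descriptions untouched.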
>> Perfection is not an option on the Web or in the real world. We exist in
>> a continuum that is inherently buggy, by design (otherwise it would be
>> very boring).
>>     
> When we engineer things we accept all that - but what we then do is engineer
> systems so that they are robust to the imperfections.
>   
Sure re. robustness, but ironically you don't quite see the robustness
and dexterity this whole episode has unveiled re. community discourse
and rapid resolution etc.
We would have had a little problem if the data had been loaded into the
DBpedia Named Graph. Basically, the inconvenience would have come down
to the duration of the SPARUL-based deletion, especially as I wouldn't
have been able to simply drop a Named Graph from the Quad Store.
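
To illustrate the difference in effort, here is a small sketch assuming a toy quad store modelled as a dictionary of graphs. The graph IRI for the linkset, the sample triples, and both helper functions are hypothetical; a real Quad Store would do this through its own update language.

```python
from collections import defaultdict

# Toy quad store: graph IRI -> set of (subject, predicate, object).
store = defaultdict(set)
store["http://dbpedia.org"].add(("dbpedia:London", "rdfs:label", "London"))
store["http://example.org/linkset"].update({  # hypothetical linkset graph
    ("dbpedia:London", "owl:sameAs", "ex:A"),
    ("dbpedia:Paris", "owl:sameAs", "ex:B"),
})


def drop_graph(store, graph):
    """Remove a whole graph in one step; return how many triples it held."""
    return len(store.pop(graph, set()))


def delete_matching(store, graph, unwanted):
    """Remove only the listed triples, which means scanning the graph."""
    before = len(store[graph])
    store[graph] = {t for t in store[graph] if t not in unwanted}
    return before - len(store[graph])


# With the linkset in its own graph, cleanup is a single drop.
dropped = drop_graph(store, "http://example.org/linkset")
```

Had the same triples been mixed into the main graph, `delete_matching` would be the only option, and it needs an exact list of what to remove as well as a pass over everything else stored there.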

>>> And it isn't really sensible/possible to distinguish the extra sameas from
>>> the "real" sameas.
>>> Eg http://dbpedia.org/resource/London and
>>> http://dbpedia.org/resource/Leondeon
>>>       
> Sorry, I was wrong about these two being sameAs - they are dbpprop:redirect,
> although I don't think that it changes the story.
> Actually, in fact dbpprop:redirect may be a sub-property of owl:sameAs for
> all I know.
> (I think the URIs for http://dbpedia.org/property/ and
> http://dbpedia.org/ontology/ need fixing :-) )
> I had inferred they were sameAs, since they sameAs yago or fbase stuff,
> which then get sameAs elsewhere.
>
>   
>>> And on the other hand, freebase is now in danger of being undermined by this
>>> as well.
>>>
>>> As time goes by, the more I think this is going wrong.
>>>  
>>>       
>> I think the complete opposite.
>>
>> We just need the traditional media players to comprehend that: Data is
>> like Wine and Code is like Fish. Once understood, they will realize that
>> the Web has simply introduced a "medium of value exchange" inflection
>> i.e., the HTTP URI as opposed to URL (which replicates paper). Note,
>> every media company is a high-quality Linked Data Space curator in
>> disguise; they just need to understand what the Web really offers :-)
>>     
> By "this", I meant putting the co-reffing (sameAs) links in the RDF that is
> returned with the data about the NIR when a URI is resolved.
>   
These links should be partitioned in a manner that delivers some degree 
of provenance. There is no other practical solution (that I know of) for 
a World Wide Web of opinions (or claims) exposed by a Web of Linked Data 
Spaces.  I am a firm believer in the dexterity of the human brain and 
the use of computers to crunch and filter data, en route to  making us 
more productive :-)

Once we have a new linkset, I'll make some demos to substantiate my 
position re. Named Graphs, Linked Data, and the Follow-your-nose pattern.

Links:

1. http://bit.ly/eBLv1  - Post about URNs, URLs, and the Linked Data 
Meme's HTTP URI


Kingsley


>>     
>>> Best
>>> Hugh
>>> <truncate/>
>>>
>>>
>>>  
>>>       
> Cheers
> Hugh
>   


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Wednesday, 12 August 2009 02:24:20 UTC
