
Re: AW: [Dbpedia-discussion] Fwd: Your message to Dbpedia-discussion awaits moderator approval

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 11 Aug 2009 09:47:36 -0500
Cc: "'Kingsley Idehen'" <kidehen@openlinksw.com>, "'Kavitha Srinivas'" <ksrinivs@gmail.com>, "'Tim Finin'" <finin@cs.umbc.edu>, "'Anja Jentzsch'" <anja@anjeve.de>, <public-lod@w3.org>, <dbpedia-discussion@lists.sourceforge.net>
Message-Id: <DE09DF19-C812-4F66-9D6A-D2C09F2E4046@ihmc.us>
To: "Chris Bizer" <chris@bizer.de>

On Aug 11, 2009, at 5:45 AM, Chris Bizer wrote:

>
> Hi Kingsley, Pat and all,
>
>> Chris/Anja: I believe this data set was touched on your end, right?
>
> Yes, Anja will fix the file and will send an updated version.

Thanks.

>
> Pat Hayes wrote:
>>> This website should be taken down immediately, before it does serious
>>> harm. It is irresponsible to publish such off-the-wall equivalentClass
>>> assertions.
>
> Pat: Your comment seems to imply that you see the Semantic Web as
> something consistent that can be broken by individual information
> providers publishing false information. If this is the case, the
> Semantic Web will never fly!

Agreed, but surely we can expect something better than this. We will  
of course need to have ways (not yet elucidated) of locating the  
sources of inconsistencies and correcting or avoiding them. In the  
meantime, many of us are worrying about how to achieve mutual  
consistency between rival high-level ontologies.

>
> Everything on the Web is a claim by somebody. There are no facts, there
> is no truth, there are only opinions.

The same is true of the Web and of life in general, but still there are
laws about slander, etc.; and outrageous falsehoods are rebutted or
corrected (e.g., look at how Wikipedia is managed); or else their source
is widely treated as nonsensical, which I hardly think DBpedia wishes  
to be. And also, I think we do have some greater responsibility to  
give our poor dumb inference engines a helping hand, since they have  
no common sense to help them sort out the wheat from the chaff, unlike  
our enlightened human selves.

>
> Semantic Web applications must take this into account and therefore
> always assess data quality and trustworthiness before they do something
> with the data.

In a perfect world, yes; but in practice this isn't possible. There are
no criteria yet available for making such judgements, or even for
locating the true source of a discovered inconsistency. About the only
way to do it is to judge the veracity of the source; and if one cannot
trust DBpedia not to state blatant falsehoods, who can you trust?
And I would draw a distinction between what one might call fact-level
disagreements (about the population of India, say) and high- or
mid-level problems, which are much harder to deal with. Introducing
gratuitous, wildly false claims into the upper middle levels of a
class hierarchy is liable to produce a very large number of
inconsistencies down the line, which will be very hard to identify and
very hard to correct. They may appear as apparent errors in instance
data, for example.
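
As a minimal sketch of how a single false class-level assertion can show up as an apparent error in instance data: the class names, the instance, and both helper functions below are hypothetical illustrations, not taken from the DBpedia or Freebase files discussed in this thread, and the closure is a toy stand-in for what an OWL reasoner would do.

```python
def close_types(instance_types, equivalent_pairs):
    """Propagate types through equivalentClass pairs until nothing changes."""
    # Equivalence is symmetric, so each pair licenses inference both ways.
    edges = set(equivalent_pairs) | {(b, a) for a, b in equivalent_pairs}
    inferred = {inst: set(types) for inst, types in instance_types.items()}
    changed = True
    while changed:
        changed = False
        for types in inferred.values():
            for a, b in edges:
                if a in types and b not in types:
                    types.add(b)
                    changed = True
    return inferred


def find_clashes(inferred, disjoint_pairs):
    """Report instances that end up in two classes declared disjoint."""
    clashes = []
    for inst, types in sorted(inferred.items()):
        for a, b in disjoint_pairs:
            if a in types and b in types:
                clashes.append((inst, a, b))
    return clashes


# Hypothetical instance data that is correct as asserted.
asserted = {"ex:TheBeatles": ["ex:Band", "ex:Organisation"]}
# Hypothetical schema: one sound disjointness axiom, one bogus equivalence.
disjoint = [("ex:Person", "ex:Organisation")]
bogus_equivalences = [("ex:Band", "ex:Person")]

clashes_before = find_clashes(asserted, disjoint)
clashes_after = find_clashes(close_types(asserted, bogus_equivalences), disjoint)
print(clashes_before)
print(clashes_after)
```

The clash is reported against the instance, although the instance data was fine; the fault lies in the equivalence asserted higher up.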

> If you build applications that break once somebody publishes false
> information, you are obviously doomed.

Of course, but there are degrees of falsehood. To assert that hundreds  
of dissimilar, mid-level ontological categories are all identical is  
the most egregious kind of falsehood. In fact it's not really a  
falsehood: it was simply a mistake. Nobody actually thought these  
classes were equal in extent, not for a second. They just didn't know,  
or perhaps didn't care, what 'equivalentClass' means. Hence my rather  
strongly worded protest. The subtext was: please understand, and pay  
attention to, what the relations in your assertions mean. They are not  
just vague links in a vaguely defined associative network.
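
A minimal sketch of the gap between a statistical overlap and class equivalence, assuming classes are compared by their known members. The member sets, the function names, and the way the 0.80 threshold is applied are illustrative assumptions, not the procedure actually used to produce the file under discussion.

```python
def conditional_probability(a, b):
    """P(b | a): the fraction of a's members that are also members of b."""
    return len(a & b) / len(a) if a else 0.0


def overlap_report(a, b, threshold=0.80):
    """Compare a one-directional threshold test with set-level relations."""
    p_b_given_a = conditional_probability(a, b)
    p_a_given_b = conditional_probability(b, a)
    return {
        "P(b|a)": p_b_given_a,
        "P(a|b)": p_a_given_b,
        # What a one-directional threshold test would accept.
        "passes_threshold": p_b_given_a >= threshold,
        # Equal extent needs both directions to be 1.0 on this sample.
        "same_extent_in_sample": a == b,
        # Every sampled member of a is in b: at most a subclass hint.
        "a_subset_of_b": a <= b,
    }


# Hypothetical sample: every artist is a person, most people are not artists.
artists = {"artist1", "artist2", "artist3", "artist4"}
people = artists | {f"person{i}" for i in range(16)}

report = overlap_report(artists, people)
print(report)
```

Here the threshold test passes although the two sets differ widely in extent; the sample supports at most a subclass relation from the first class to the second, and even that is only evidence from the members seen.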

But in any case, thanks to the workers for the rapid repair response.

Pat


>
> As I thought this would be generally understood, I'm very surprised by
> your comment.
>
> Cheers,
>
> Chris
>
>
>> -----Original Message-----
>> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
>> behalf of Kingsley Idehen
>> Sent: Monday, 10 August 2009 23:29
>> To: Kavitha Srinivas
>> Cc: Tim Finin; Anja Jentzsch; public-lod@w3.org;
>> dbpedia-discussion@lists.sourceforge.net; Chris Bizer
>> Subject: Re: [Dbpedia-discussion] Fwd: Your message to
>> Dbpedia-discussion awaits moderator approval
>>
>> Kavitha Srinivas wrote:
>>> I will fix the URIs. I believe the equivalentClass assertions were
>>> added in by someone at OpenLink (I just sent the raw file with the
>>> conditional probabilities for each pair of types that were above the
>>> .80 threshold).  So can whoever uploaded the file fix the property to
>>> what Tim suggested?
>> Hmm, I didn't touch the file, and neither did anyone else at OpenLink.
>> I just downloaded what was uploaded at
>> http://wiki.dbpedia.org/Downloads33 and, based on my own personal best
>> practices, put the data in a separate Named Graph :-)
>>
>> Chris/Anja: I believe this data set was touched on your end, right?
>> Please make the fixes in line with the findings from the  
>> conversation on
>> this thread. Once corrected, I or someone else will reload.
>>
>> Kingsley
>>
>>> Thanks!
>>> Kavitha
>>> On Aug 10, 2009, at 5:03 PM, Kingsley Idehen wrote:
>>>
>>>> Kavitha Srinivas wrote:
>>>>> Agree completely -- which is why I sent a base file which had the
>>>>> conditional probabilities, the mapping, and the values to be able
>>>>> to compute marginals.
>>>>> About the URIs, I should have added in my email that because
>>>>> freebase types are not URIs, and have types such as /people/person,
>>>>> we added a base URI: http://freebase.com to the types.  Sorry I
>>>>> missed mentioning that...
>>>>> Kavitha
>>>> Kavitha,
>>>>
>>>> If you apply the proper URIs, and then apply fixes to the mappings
>>>> (from prior suggestions) we are set.  You can send me another dump
>>>> and I will go one step further and put some sample SPARQL queries
>>>> together which demonstrate how we can have many world views on the
>>>> Web of Linked Data without anyone getting hurt in the process :-)
>>>>
>>>> Kingsley
>>>>>
>>>>> On Aug 10, 2009, at 4:42 PM, Tim Finin wrote:
>>>>>
>>>>>> Kavitha Srinivas wrote:
>>>>>>> I understand what you are saying -- but some of this reflects the
>>>>>>> way types are associated with freebase instances.  The types are
>>>>>>> more like 'tags' in the sense that there is no hierarchy, but each
>>>>>>> instance is annotated with multiple types.  So an artist would in
>>>>>>> fact be annotated with person reliably (and probably less
>>>>>>> consistently with /music/artist).  Similar issues arise with
>>>>>>> Uyghurs, murdered children, etc.  The issue is differences in
>>>>>>> modeling granularity as well.  Perhaps a better thing to look at
>>>>>>> is types where the YAGO types map to WordNet (this is usually at a
>>>>>>> coarser level of granularity).
>>>>>>
>>>>>> One way to approach this problem is to use a framework to mix
>>>>>> logical constraints with probabilistic ones.  My colleague Yun Peng
>>>>>> has been exploring integrating data backed by OWL ontologies with
>>>>>> Bayesian information, with applications for ontology mapping.  See
>>>>>> [1] for recent papers on this as well as a recent PhD thesis [2]
>>>>>> that I think also may be relevant.
>>>>>>
>>>>>> [1] http://ebiquity.umbc.edu/papers/select/search/html/613a353a7b693a303b643a37383b693a313b643a303b693a323b733a303a22223b693a333b733a303a22223b693a343b643a303b7d/
>>>>>>
>>>>>> [2] http://ebiquity.umbc.edu/paper/html/id/427/Constraint-Generation-and-Reasoning-in-OWL
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> President & CEO
>>>> OpenLink Software     Web: http://www.openlinksw.com

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 11 August 2009 14:48:40 UTC
