Re: How do you deprecate URIs? Re: OWL-DL and linked data

Richard Cyganiak wrote:
> On 9 Jul 2008, at 00:11, Bijan Parsia wrote:
> [big snip]
>> Complaining that the Big Nasty People Who Know What They're Talking 
>> About are raining on your sameAs parade isn't constructive.
> 
> Ah Bijan. How about *you* grow up, flameboy?

(Please soften your language, both of you. Consider picking up the phone 
instead.)

> You keep asserting that There Are Technical Problems With Using sameAs. 
> It would help your argument if you told us what those technical problems 
> actually *are*. I heard you say that using owl:sameAs could bite us in 
> the butt. Could you be more specific?

The core idea is quite simple, and relates to the notion of when two
(RDF/OWL) documents are describing (typically amongst other things) one
and the same entity. In cases where it is true to say they describe the
same entity, the term 'owl:sameAs' is one handy way to express that
situation. In cases where the two documents describe different entities,
it is not true to say that owl:sameAs holds between those entities. This
is all irrespective of which document (if any) the owl:sameAs claims are
made in, and purely cast in terms of whether the claim is true. And the
main thing to remember about OWL here is that if the owl:sameAs claim is
true, and we believe both of the docs, all information about that entity
written in both documents gets pooled.
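
To make the pooling point concrete, here is a rough sketch in plain
Python (not a real OWL reasoner, and not how any particular toolkit does
it). The URIs, the two tiny "documents" and the pool() helper are all
made up for illustration; triples are just (subject, predicate, object)
tuples, and the only inference modelled is substitution across
owl:sameAs.

```python
# Hypothetical data for illustration only; none of these URIs are real.
SAME_AS = "owl:sameAs"

doc_a = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", SAME_AS, "other:a1"),
}
doc_b = {
    ("other:a1", "foaf:mbox", "mailto:alice@example.org"),
}


def pool(*docs):
    """Merge the documents, then repeat every statement for each sameAs alias."""
    triples = set().union(*docs)
    parent = {}  # union-find forest over terms linked by owl:sameAs

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    # Treat owl:sameAs as an equivalence relation (symmetric, transitive).
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Group every term that took part in a sameAs claim by its root.
    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)

    def aliases(term):
        # Terms never mentioned in a sameAs claim only alias themselves.
        return groups[find(term)] if term in parent else {term}

    pooled = set()
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2 in aliases(s):
            for o2 in aliases(o):
                pooled.add((s2, p, o2))
    return pooled


pooled = pool(doc_a, doc_b)
```

If the sameAs claim is true, ending up with the mailbox attached to
ex:alice is exactly what we want. If it is false, the same mechanism
quietly hands one entity's properties to another, which is why the
truth of the claim matters more than where it was written.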

> Many people in this forum, including me, do not have a background in 
> formal logics. Without that background, it is hard to distinguish proper 
> uses of owl:sameAs from improper uses of owl:sameAs. 

This is true, regarding the list. There are people from a great variety 
of backgrounds around here. And on a good day, that is one of our strengths.

> A side note: The reason why I advocate the use of owl:sameAs is not that 
> it's the *right* solution. But it's *the only solution that was 
> available*. The alternative would have been to argue for a year or two 
> instead of linking up our datasets. Not compelling. That being said, I'm 
> very interested in hearing your take on when I should use owl:sameAs and 
> when not.

One metric here might simply be: what % of owl:sameAs claims in the LOD
scene are false? However, falsehood isn't in itself always a bad
thing.
Sometimes publishing false information online has value - for example, 
historical data. Life is a lot easier though if at least the identity 
reasoning we do is based on reliable information. For this reason, 
publishing false identity claims can be a lot more destructive than 
publishing other kinds of falsehood. The LiveJournal RDF/FOAF dataset,
for example, might be full of tens of thousands of fake birthdate
properties. We kinda expect that. And we should also expect to see a
rise in spam blogs making false identity claims about their owners.
Dealing with
the latter is a bigger pain though. For datasets that come from 
relatively trusted sources, it is a big win if we can believe the 
identity-related claims they make.

If the best data / tools you have suggest that two docs/datasets are 
describing the selfsame entity, using owl:sameAs seems fine, even if you 
have a secret hunch you're only perhaps 95% confident of the data 
quality or tool reliability. If the best information you have instead is 
telling you "these two documents seem to be talking about more or less 
the same notion", then owl:sameAs probably isn't for you: it doesn't 
communicate what you know. Which of these situations you're in might be 
something of a judgement call, but it should be a judgement call 
grounded in clarity about what a use of owl:sameAs is claiming.

I doubt we can get very far with this in the absence of examples. Would 
anyone like to collect up a dozen various owl:sameAs claims published
explicitly on the Web that might be considered questionable? (For now
let's set aside cases where owl:sameAs is implied by other constructs.)

cheers,

Dan

--
http://danbri.org/

Received on Wednesday, 9 July 2008 10:21:02 UTC