Re: Resource identity again (was: Re: MARC Codes for Forms of Musical Composition) from Bernard Vatant on 2010-07-09 (public-lld@w3.org from July 2010)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Fri, 9 Jul 2010 19:11:51 +0200
To: public-lld <public-lld@w3.org>
Message-ID: <AANLkTinTwLqS0Sswb2qI7oCTggPgSkxw5dp355jRfTRB@mail.gmail.com>
Antoine, all

[writing this by bits and pieces today, and since messages keep flowing on
this thread, this is maybe already obsolete, sorry ...]

Really happy to see the importance of this identity issue is more and more
at the center of conversation. For the record, I've made in the past several
proposals to tackle what I've been considering for years to be THE issue,
the most radical proposal being certainly expressed in this post
http://blog.hubjects.com/2006/04/identifying-things-blank-nodes-again.html
Browsing the blog you can find variations on this central idea over the years.

Basically, most proposals to tackle this issue have boiled down to using
direct assertions. To express that http://ex1.org/foo and
http://ex2.org/bar denote (and thanks to Antoine for using the accurate
verb here) more or less exactly the same thing, different vocabularies
introduce predicates to make declarations such as:

http://ex1.org/foo   p   http://ex2.org/bar

Where p stands for owl:sameAs, rdfs:seeAlso, skos:*Match, umbel:isLike, any
future foaf:whatever ... all those predicates conveying some kind of
co-reference. In fact, even if it's not respected, only owl:sameAs has
defined semantics; the other ones can be interpreted at will by
applications, through any follow-your-nose heuristics. Defining formal
semantics for any of those will not prevent hacking, since as danbri just
reminded us, people don't read the specs anyway. So you can define as many
similarity properties as you like; they are bound to be used and abused the
same way owl:sameAs has been. And if you consider that owl:sameAs semantics
are as straightforward as can be, go figure how more subtle definitions will
be hacked.

But there are other ways to explore this issue, not to mention the radical
"blank hub" way suggested in the above blog post, which I know will never be
adopted anyway.

Identity (or same-ness or similarity) can be tackled using operational rules
rather than declarative assertions, and in particular *substitutability
rules*. Two denotations (read: URIs here) have somehow the same referent if
to a certain extent they can be substituted for each other in some, many,
most or all assertions using them.

An owl:sameAs declaration amounts to *absolute substitutability*: the URIs
can be substituted in any assertion.
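
To make absolute substitutability concrete, here is a minimal Python sketch. It assumes triples are modelled as plain (subject, predicate, object) tuples of strings. The function name, the sample triples and the ex3.org URI are made up for illustration, and this is in no way a full OWL reasoner: it only copies each triple with one URI swapped for the other, in one direction.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# All data below is hypothetical and only serves the illustration.

def substitute_everywhere(triples, old, new):
    """Return the input triples plus, for each triple, a copy in which
    every occurrence of `old` is replaced by `new`, whatever the
    position or predicate. Call it a second time with the arguments
    swapped to get the other direction."""
    result = set(triples)
    for triple in triples:
        result.add(tuple(new if term == old else term for term in triple))
    return result


sample_triples = {
    ("http://ex1.org/foo", "rdfs:label", "Foo"),
    ("http://ex3.org/baz", "ex:cites", "http://ex1.org/foo"),
}

merged = substitute_everywhere(
    sample_triples, "http://ex1.org/foo", "http://ex2.org/bar"
)
for triple in sorted(merged):
    print(triple)
```

Nothing in the function looks at the predicate or at the type of the subject: every statement about one URI becomes a statement about the other, which is exactly the blunt merge that relative rules are meant to avoid.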

When substitutability is relative, a *substitutability rule* could assert
the conditions under which substitutability is valid.
For example, one could say that ex:author is substitutable for dc:creator if
the subject of the triple is a Book.

Put formally

For all x,y
(x a ex:Book AND x dc:creator  y) => (x  ex:author  y)

This is different from, and in fact independent of, a declaration such as
ex:author    rdfs:subPropertyOf     dc:creator   because it does not say
anything about the use of those properties outside the Book class.
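
The Book rule above could be applied to a set of triples roughly as follows. This is only a sketch under stated assumptions: triples are plain tuples of strings, prefixed names are kept as literal strings, and the sample resources (ex:book1, ex:photo1, ex:alice, ex:bob, ex:Photograph) as well as the function name are hypothetical.

```python
RDF_TYPE = "rdf:type"  # stands for the "a" shorthand used in the rule


def apply_book_author_rule(triples):
    """Apply: (x a ex:Book AND x dc:creator y) => (x ex:author y).

    Returns the input triples plus the inferred ex:author triples."""
    # Subjects explicitly typed as ex:Book.
    books = {s for (s, p, o) in triples if p == RDF_TYPE and o == "ex:Book"}
    # The rule fires only for dc:creator triples whose subject is a Book.
    inferred = {
        (s, "ex:author", o)
        for (s, p, o) in triples
        if p == "dc:creator" and s in books
    }
    return set(triples) | inferred


book_triples = {
    ("ex:book1", RDF_TYPE, "ex:Book"),
    ("ex:book1", "dc:creator", "ex:alice"),
    ("ex:photo1", RDF_TYPE, "ex:Photograph"),
    ("ex:photo1", "dc:creator", "ex:bob"),
}

book_result = apply_book_author_rule(book_triples)
for triple in sorted(book_result):
    print(triple)
```

The photograph keeps its dc:creator and gets no ex:author: the rule stays silent outside the Book class, which is the difference with a global property-level declaration.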

Let's take an example discussed at length a few months ago on the DBpedia
forum.

ex1:MichelleObama    a    foaf:Person
ex2:MichelleObama    a    skos:Concept

In which context are those URIs substitutable? Certainly not for assertions
using predicates specific to persons (e.g., foaf:mbox) or specific to
concepts (e.g., skos:related) or predicates which would bear different
values for the two resources (like dcterms:date, for example). But they are
substitutable for labeling predicates, such as:

?x    rdfs:label  'Michelle LaVaughn Robinson'  which holds for both.

So there again one could write a substitutability rule:

For all n
(ex1:MichelleObama  rdfs:label  n)  => (ex2:MichelleObama  rdfs:label  n)
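
A substitutability rule scoped to a list of predicates, as in the labeling case above, might be sketched like this. Same assumptions as before: triples are plain tuples of strings, and the function name and the mailbox value are invented for the example.

```python
def substitute_for_predicates(triples, source, target, predicates):
    """For every triple (source, p, o) with p in `predicates`, add
    (target, p, o). Triples using other predicates are not copied."""
    inferred = {
        (target, p, o)
        for (s, p, o) in triples
        if s == source and p in predicates
    }
    return set(triples) | inferred


person_triples = {
    ("ex1:MichelleObama", "rdf:type", "foaf:Person"),
    ("ex2:MichelleObama", "rdf:type", "skos:Concept"),
    ("ex1:MichelleObama", "rdfs:label", "Michelle LaVaughn Robinson"),
    # Hypothetical mailbox, present only to show what is NOT propagated.
    ("ex1:MichelleObama", "foaf:mbox", "mailto:someone@example.org"),
}

label_result = substitute_for_predicates(
    person_triples,
    source="ex1:MichelleObama",
    target="ex2:MichelleObama",
    predicates={"rdfs:label"},
)
for triple in sorted(label_result):
    print(triple)
```

The concept picks up the label of the person but neither the foaf:mbox nor the foaf:Person type, so the two resources stay distinct for everything outside the listed predicates.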

Such rules have several advantages over declarative assertions:

- They do not need extra vocabulary to be defined and (mis)understood
- They have an unambiguous interpretation
- They are flexible *ad libitum* to cover the whole spectrum of
similarity-sameness flavours.

Have a great week-end

Bernard


2010/7/9 Antoine Isaac <aisaac@few.vu.nl>

> Hi Jeff,  others,
>
>
>  One reason umbel:isLike isn’t broadly used might be because people
>> assume owl:sameAs is named intuitively.
>>
>
>
> Yes, quite probably.
> In fact isLike's name makes me think of a derivation link, or even just a
> general similarity between objects. And even if their documentation (which is
> quite well done for that property) says it's not a general similarity
> property, there's still instruction that seems to limit the use of it [1]:
> [It is appropriate to use this property when there is strong belief the two
> resources refer to the same individual with the same identity, but that
> association can not be asserted at the present time with certitude.]
>
>
> The traditional issue with owl:sameAs comes from the situations where we
> know that the resources denote the same thing "in real world", but we don't
> want to bluntly merge the statements about them. Would umbel:isLike solve
> the issue? Reading the documentation, it seems to me that it can only do it
> partially--which does not render it useless in absolute, of course.
>
> In fact isLike's name now seems to me really appropriate to its semantics
> :-) Comparing with the SKOS situation, umbel:isLike would be analogous to
> skos:closeMatch without the skos:exactMatch cases (exactMatch is a
> sub-property of closeMatch, which amounts to closeMatch capturing both exact
> concept similarity and approximate one).
>
> I'm ccing Frédérick Giasson, so that he gets the opportunity to clarify the
> point, or to orient us to the suitable umbel doc with the answer!
>
> A final word: there was some discussion about this at the RDF Next Step
> workshop [2], maybe some light will come from the outcome of its efforts...
>
> Cheers,
>
> Antoine
>
> [1] http://www.umbel.org/technical_documentation.html#vocabulary
> [2]
> http://www.w3.org/2001/sw/wiki/index.php?title=RDF_Core_Work_Items&oldid=1990#Co-reference_vocabulary_as_alternative_to_owl:sameAs
>
> PS: Ross I understand your reluctance to use SKOS mapping properties, btw.
>
>
>
>  *From:* rxs@talisplatform.com [mailto:rxs@talisplatform.com] *On Behalf
>> Of *Ross Singer
>> *Sent:* Thursday, July 08, 2010 1:25 PM
>> *To:* Karen Coyle
>> *Cc:* William Waites; Houghton,Andrew; public-lld; Young,Jeff (OR)
>> *Subject:* Re: [open-bibliography] MARC Codes for Forms of Musical
>> Composition
>>
>> On Thu, Jul 8, 2010 at 12:53 PM, Karen Coyle <kcoyle@kcoyle.net
>> <mailto:kcoyle@kcoyle.net>> wrote:
>>
>>    rdfs:seeAlso seems to be similar to the http "link" -- there's some
>>    relationship, but you don't know what it is. Wouldn't some of this
>>    be solved by having richer relationships?
>>
>> Definitely. The issue more lies in finding a relationship that not only
>> says what you want, but is also common enough that other people (or,
>> really, agents, but there's still a person there somewhere) recognize
>> it. For example, umbel:isLike (like Jeff mentioned, like
>> http://purl.org/NET/marccodes/muscomp/dv#genre uses) hits a fairly sweet
>> spot, I think, as far as saying you think you're talking about the same
>> thing, but nobody really uses umbel, really.
>>
>> And therein lies the rub. If skos:exactMatch/closeMatch didn't infer
>> skos:Concepts on either end or foaf had some equivalency property, no
>> problem. But some relatively obscure vocabulary (with a very difficult
>> to grok general purpose) is going to be a much tougher sell.
>>
>> -Ross.
>>
>>
>
>


-- 
Bernard Vatant
Senior Consultant
Vocabulary & Data Engineering
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    http://www.mondeca.com
Blog:    http://mondeca.wordpress.com
----------------------------------------------------
Received on Friday, 9 July 2010 17:12:23 UTC