
Re: PROV-ISSUE-35: Section 4: How one would know that two BOBs are characterizations of the same entity? [Conceptual Model]

From: Paolo Missier <Paolo.Missier@ncl.ac.uk>
Date: Mon, 25 Jul 2011 13:27:29 +0100
Message-ID: <4E2D6131.60300@ncl.ac.uk>
To: public-prov-wg@w3.org
Khalid, Jim

the issue that lurks behind this discussion is, once again, that of identity in the space of characterized entities (C-entities). 
The draft doc avoids talking about identity and instead mentions /identifiers/ which belong in the model. These identifiers have 
more of a technical than a semantic meaning, i.e., they exist so one can refer to, and link across, different Bobs in the model.

With this in mind, let me try to summarize where we are:

- Khalid suggests introducing sameEntity as an equivalence relation in the C-entities space, and then admitting axiomatic assertions 
of the form
(1)  sameEntity(b1,b2)
where b1, b2 are (identifiers of) two Bobs in the model.

- Jim suggests that it should be possible to assert, also axiomatically:
(2)  "Bob b1 refers to entity A",   "Bob b2 refers to entity A"

The main difference is that assertions of form (2) require us to mention A, which lives in C-entity space, and so far we have not made any 
provision for doing so. (1) has no such requirement.

If you use the Royal Society example http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html#IVP-of for reference, 
this means:

- using (2) I need to be able to say "Royal Society" somewhere in the language
- using (1) I don't, but then I never really know what the BOBs refer to.

To me it boils down to whether we ever need to mention "Royal Society" or we are happy to say "b1, b2 refer to the same C-entity 
which-shall-not-be-named".

Notes:
- if we have (2), then (1) follows.
- (1) is sufficient to reason about IVP-of relations, e.g. using entity resolution algorithms (which, as Jim points out, are outside 
the scope of the PIL language).
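
To make the two notes concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not anything taken from the draft: the function names, the dictionary encoding of the type (2) assertions, and the extra Bob b3 and entity "B" are all made up for the example. It shows that type (1) pairs can be derived mechanically from type (2) assertions, and that treating sameEntity as an equivalence relation amounts to grouping Bobs into classes.

```python
from collections import defaultdict
from itertools import combinations


def same_entity_from_refers_to(refers_to):
    """Derive type (1) pairs from type (2) assertions.

    refers_to maps a Bob identifier to the name of the entity it refers to,
    e.g. {"b1": "A", "b2": "A"} (a hypothetical encoding).
    """
    by_entity = defaultdict(list)
    for bob, entity in refers_to.items():
        by_entity[entity].append(bob)
    pairs = set()
    for bobs in by_entity.values():
        # Every two Bobs that name the same entity yield one sameEntity pair.
        for x, y in combinations(sorted(bobs), 2):
            pairs.add((x, y))
    return pairs


def equivalence_classes(bobs, same_entity_pairs):
    """Group Bobs into classes by closing the asserted sameEntity pairs
    under reflexivity, symmetry and transitivity (simple union-find).

    Assumes every Bob mentioned in a pair also appears in bobs.
    """
    parent = {b: b for b in bobs}

    def find(b):
        # Follow parent links to the class representative, halving the path.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for x, y in same_entity_pairs:
        parent[find(x)] = find(y)

    classes = defaultdict(set)
    for b in bobs:
        classes[find(b)].add(b)
    return sorted(sorted(c) for c in classes.values())


# Hypothetical assertions of type (2): b1 and b2 refer to A, b3 refers to B.
refers_to = {"b1": "A", "b2": "A", "b3": "B"}
pairs = same_entity_from_refers_to(refers_to)           # type (1) pairs
classes = equivalence_classes(refers_to.keys(), pairs)  # Bobs grouped by entity
print(pairs, classes)
```

Note that the classes computed from (1) alone never carry the names "A" or "B": that is exactly the "which-shall-not-be-named" situation described above.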

-Paolo


On 7/21/11 9:11 PM, Jim McCusker wrote:
> In the simple case, if a BOB refers to Entity A (for instance, as a
> URI), and another BOB also refers to Entity A, then the BOBs refer to
> the same Entity.
>
> The complex case, where we try to resolve the entities by examining
> the BOBs closely, I think is outside of the PIL, and can be determined
> by applications using whatever algorithms they think are important.
>
> Jim
>
> On Thu, Jul 21, 2011 at 3:39 PM, Khalid Belhajjame
> <Khalid.Belhajjame@cs.man.ac.uk>  wrote:
>> On 21/07/2011 20:20, Luc Moreau wrote:
>>
>> Hi Khalid,
>> Can you expand on this? What would it help us to achieve?
>>
>> At F2F1, some mentioned "turtle all way down" to refer to the idea that we
>> are not trying
>> to make a distinction between an entity and its state (as we used to say
>> then).
>> This would translate into the fact that we only have characterized entities
>> ...
>>    and are not trying to distinguish an entity from a characterized entity.
>>
>> Can you explain what benefits you see in distinguishing entity from
>> characterized entity?
>>
>> So, does it mean in the example, you would say that e1 is same entity as e2?
>> Potentially, this could be captured by (the very rough) definition of
>> version.
>>
>> Yes, possibly, I actually first thought that "isRevisionOf" can be used, but
>> I think it poses a stronger condition than what is needed by "sameEntity".
>>
>> Regarding your question about the benefits. I think, having "sameEntity()"
>> can be used in the definition of IVPof:
>> Specifically, in page 10, it is stated that:
>>
>> "An assertion "B is an IVP of A" holds over the temporal intersection of A
>> and B, only if:
>>
>> if a mapping can be established from an attribute X of B to an attribute Y
>> of A, then the values of A and B must be consistent with that mapping
>> B has some attribute that A does not have"
>>
>> I think, if "sameEntity" exists then it can be used as a third condition, to
>> make sure that A and B refer to the same entity, otherwise one cannot be an
>> IVPof the other.
>>
>> Also, given a BOB bi, a user may be interested in tracing the history of
>> all the BOBs that were used to derive bi and that refer to the same entity.
>> In other words, the query here is: give me the history of the entity that bi
>> refers to.
>>
>> khalid
>>
>>
>>
>> Luc
>>
>> On 21/07/2011 20:06, Provenance Working Group Issue Tracker wrote:
>>
>> PROV-ISSUE-35: Section 4: How one would know that two BOBs are
>> characterizations of the same entity? [Conceptual Model]
>>
>> http://www.w3.org/2011/prov/track/issues/35
>>
>> Raised by: Khalid Belhajjame
>> On product: Conceptual Model
>>
>>
>> Do we need a means to specify that two BOBs are characterizations of the same
>> entity?
>>
>> In the initial draft, I think that the editors intentionally avoided
>> defining the term "entity" as part of the vocabulary. I don't suggest
>> defining that term, but having a means by which one would know that two Bobs
>> are characterizations, possibly different, of the same entity, e.g., using
>> an assertion like "sameEntity(bob1, bob2)".
>>
>> I think this will be useful, amongst other things, in the definition of
>> IVPof.
>>
>> Khalid
>>
>>
>>
>>
>>
>>
>
>


-- 
-----------  ~oo~  --------------
Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org
School of Computing Science, Newcastle University,  UK
http://www.cs.ncl.ac.uk/people/Paolo.Missier
Received on Monday, 25 July 2011 12:27:59 GMT
