Re: PROV-ISSUE-186: Section 5.2.1 (PROV-DM as on Nov 28) [prov-dm]

Hi Satya,

This discussion about identifiers is now within the remit of PROV-ISSUE-183.
Once PROV-ISSUE-183 is resolved,
I think this one will be resolved too.

Further responses interleaved.

On 12/07/2011 01:50 AM, Provenance Working Group Issue Tracker wrote:
> PROV-ISSUE-186: Section 5.2.1 (PROV-DM as on Nov 28) [prov-dm]
>
> http://www.w3.org/2011/prov/track/issues/186
>
> Raised by: Satya Sahoo
> On product: prov-dm
>
> Hi,
> The following are my comments on Section 5.2.1 of the PROV-DM as on Nov 28:
>
> Section 5.2.1:
> 1. "entity record is a representation of an entity."
>
> Comment: So, we make provenance assertions about the entity or the entity record? How is a provenance assertion about the entity differentiated from an entity record?
>    

The entity is the thing and its situation in the world; the entity
record is what we hold in a provenance record.
We are making assertions about entities, not about entity records.
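
To make the distinction concrete, here is a minimal Python sketch. The class name EntityRecord, the to_asn helper, and the ex:size attribute are hypothetical illustrations and are not taken from PROV-DM; only the identifier e0 and the three attributes come from the example quoted below. It shows that an assertion such as "size is 10KB" describes the entity (the file), while the record is merely the data structure that carries the assertion and has its own, unrelated, length.

```python
from dataclasses import dataclass


@dataclass
class EntityRecord:
    """Hypothetical stand-in for what a provenance record holds about an entity."""
    id: str            # identifier of the entity being described
    attributes: dict   # attribute-value pairs asserted about that entity

    def to_asn(self) -> str:
        # Serialize in the entity(id, [ attr="value", ... ]) style used in this thread.
        pairs = ", ".join(f'{k}="{v}"' for k, v in self.attributes.items())
        return f"entity({self.id}, [ {pairs} ])"


record = EntityRecord(
    "e0",
    {"prov:type": "File", "ex:path": "/shared/crime.txt", "ex:creator": "Alice"},
)

# An assertion about the entity: the file on disk has size 10KB.
# ex:size is an assumed attribute name, used here for illustration only.
with_size = EntityRecord(record.id, {**record.attributes, "ex:size": "10KB"})

# A property of the record itself: the length of its serialization.
# This is a different quantity from the size asserted about the entity.
record_length = len(record.to_asn())

print(with_size.to_asn())
print(record_length)
```

The attribute ex:size travels inside the record but is read as a statement about e0, the file; nothing in the sketch attributes a size to the record.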
> For example, is there a difference between:
> a) entity(e0, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice" ])
> and
> b) e0 has size 10KB on disk - this assertion clearly does not mean that the entity record "entity(e0, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice" ])" has size 10KB! The entity record, with about 80 characters, may have size 1KB on disk.
> e0 is a representation of the entity (located at /shared/crime.txt and created by Alice). In any knowledge representation approach and in information systems, we always work with representations of real-world things and refer to these representations by an identifier. Clearly, an entity and its record are two distinct information resources. How is fusing an entity and its record into a single identifier relevant for modeling the provenance of entities?
>
> 2. "id: an identifier id identifying an entity; the identifier of the entity record is defined to be the same as the identifier of the entity; "
>    

> Comment: If the id of the entity and the entity record are the same, then how can two distinct sets of assertions about the same entity exist?
> If we use the wasComplementOf approach, will we create a new identifier every time we want to make an assertion about the same entity?
> E.g. Harvard University was established in the 17th century.
>       Harvard University was established in the year 1636.
> will require two distinct identifiers for Harvard University?
> Using wasComplementOf does not solve the problem, since if there are 100,000 assertions about Harvard University we will end up creating 100,000 identifiers and will have to link them together using 100,000 wasComplementOf properties. This is clearly an overly complicated modeling approach. More importantly, this goes against the Web architecture approach of re-using identifiers instead of minting new ones (in this case clearly avoidable):
> From the AWWW [1]:
> a. Good practice: Avoiding URI aliases - "A URI owner SHOULD NOT associate arbitrarily different URIs with the same resource."
>
> [1] http://www.w3.org/TR/2004/REC-webarch-20041215/#uri-aliases
>    

I don't think we do, do we?
> 3. "If an asserter wishes to characterize an entity with the same attribute-value pairs over several intervals, then they are required to assert multiple entity records, each with its own identifier (so as to allow potential dependencies between the various entity records to be expressed)."
>
> Comment: If the entity has to be characterized with different attribute-value pairs over the same intervals, do they create distinct identifiers?
>    

An example illustrating this case is covered in section 8 of the document.

Luc
> Thanks.
>
> Best,
> Satya
>
>
>
>    

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Wednesday, 7 December 2011 11:09:17 UTC