
RE: ISSUE-385: hasProvenanceIn: finding a solution

From: Miles, Simon <simon.miles@kcl.ac.uk>
Date: Wed, 6 Jun 2012 16:29:53 +0100
To: W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <830EEE5C741ED54EAB28EBACFFC77984EE856F5686@KCL-MAIL04.kclad.ds.kcl.ac.uk>
Hello Luc, all,

Sorry, just catching up on all the mails on this topic.

I have a stronger idea of what is being aimed for. In particular, I think the key point I didn't properly account for in replying to the original hasProvenanceIn proposal is that we want this all to work without requiring any reasoning.

I'm glad we seem agreed that there is a need for alternateOf or specializationOf along with any contextualizationOf/hasProvenanceIn assertion.

I have three thoughts about the current contextualizationOf specification.

1. At least the first example for contextualization has many similarities to the specialization example used in the primer: a coarse-grained entity is referred to earlier (in one context), then a distinction between two finer-grained versions of the entity is required later (in another context), and we use specialization to express how the different granularities relate. I think the difference between the relations might be clarified if we said what kind of thing the "Something" and "another" are in the contextualization definition.

2. By referring to a bundle as a context, there is the implication that the statements in a single bundle present a single, consistent context. The current DM does not obviously support this, and it could be restrictive.

To try to answer my own points, could we say something like the following?

Contextualization is a relation between an entity and a bundle, asserting that the entity is described in a somehow consistent context in that bundle. It further refers to a perspective on that entity used within that context, i.e. another entity that this entity is an alternate or specialization of. Not every bundle of PROV descriptions necessarily presents a consistent context.

3. Why are we assuming that there is only one alternate/generalised entity in the bundle referred to? It happens to be true in the examples, but what if ex:run1 also contained:
  alternateOf(ex:Bob, ex:Obo)
Why would the contextualizationOf statement in tool:analysis01 only refer to ex:Bob and not ex:Obo?
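
To make point 3 concrete, here is a minimal sketch in Python. It assumes a bundle can be modelled as a plain list of (relation, a, b) tuples and that alternateOf is read in both directions; the function name perspectives_in_bundle is made up for illustration and is not part of PROV.

```python
def perspectives_in_bundle(bundle, entity):
    """Collect `entity` plus every entity one alternateOf hop away from it.

    `bundle` is a hypothetical stand-in for a PROV bundle: a list of
    (relation, a, b) tuples. alternateOf is treated as symmetric here,
    which is an assumption of this sketch.
    """
    found = {entity}
    for relation, a, b in bundle:
        if relation != "alternateOf":
            continue
        if a == entity:
            found.add(b)
        elif b == entity:
            found.add(a)
    return found


# ex:run1 with the extra statement from point 3.
run1 = [("alternateOf", "ex:Bob", "ex:Obo")]
result = perspectives_in_bundle(run1, "ex:Bob")
```

Under these assumptions the bundle offers two candidate perspectives, ex:Bob and ex:Obo, and nothing in the bundle itself says which one a contextualizationOf statement should name.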

I notice a mistake in the first example:
  specialization(tool:ratedBob2, [perf:rating="bad"])

thanks,
Simon

Dr Simon Miles
Senior Lecturer, Department of Informatics
King's College London, WC2R 2LS, UK
+44 (0)20 7848 1166

Provenance: The bridge between experiments and data:
http://eprints.dcs.kcl.ac.uk/1372/

________________________________
From: Luc Moreau [l.moreau@ecs.soton.ac.uk]
Sent: 05 June 2012 21:03
To: Timothy Lebo
Cc: W3C provenance WG
Subject: Re: ISSUE-385: hasProvenanceIn: finding a solution

Hi Tim,

I tried to write this up as a separate relation contextualizationOf, see section 1.3 in [1].
I believe this relation is compatible with your rdf encoding. The only difference, here,
is that we make this an identifiable thing.

       [
           a prov:Entity, prov:ContextualizedEntity;
           prov:identifier       ex:Bob;
           prov:inContext     ex:run2;
       ];

What do you think?
Luc

[1] http://dvcs.w3.org/hg/prov/raw-file/default/model/working-copy/wd6-contextualization.html

On 04/06/2012 23:25, Timothy Lebo wrote:
Luc,

(bottom)

On Jun 4, 2012, at 5:31 PM, Luc Moreau wrote:

Hi Tim,

Some comments/questions below.

On 04/06/2012 13:46, Timothy Lebo wrote:
Luc,

On Jun 4, 2012, at 5:16 AM, Luc Moreau wrote:

Hi all,

During this diamond jubilee weekend, I had the opportunity to think about Tim's and Simon's long emails.

I agree with them that we have concepts of alternate and specialisation, and we want to reuse them.

I also came to the conclusion that behind the hasProvenanceIn relation, what I really wanted was a form of alternate. But not what Tim or Simon are suggesting.

The PROV data model has a shortcoming: the inability to identify something in some context. That's what I am trying to address here.






The interpretation of
       alternate(tool:Bob2, ex:Bob,ex:run2)
is that tool:Bob2 is the entity that shares aspects of ex:Bob as described by ex:run2. Conceptually, this could be done by substituting ex:Bob for tool:Bob2 in ex:run2.
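
The substitution reading can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a bundle is modelled as a set of (subject, predicate, object) triples, the contents given for ex:run2 are made up (the thread does not list them), and the substitution is read as replacing ex:Bob with tool:Bob2 inside a copy of ex:run2, which is one interpretation of the sentence above.

```python
def substitute(bundle, old, new):
    """Return a copy of `bundle` with `old` replaced by `new` in subject and object position.

    `bundle` is a hypothetical stand-in for a PROV bundle: a set of
    (subject, predicate, object) triples. The input set is not modified.
    """
    def swap(term):
        return new if term == old else term

    return {(swap(s), p, swap(o)) for (s, p, o) in bundle}


# Invented contents for ex:run2, for illustration only.
run2 = {
    ("ex:Bob", "prov:wasAttributedTo", "ex:agent1"),
    ("ex:report", "prov:wasDerivedFrom", "ex:Bob"),
}

# The view of ex:run2 in which tool:Bob2 stands where ex:Bob stood.
view = substitute(run2, old="ex:Bob", new="tool:Bob2")
```

The original ex:run2 is left untouched; only the derived view mentions tool:Bob2.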

I appreciate that what I am describing here is not too distant from http://www.w3.org/TR/2011/WD-prov-dm-20111215/#record-complement-of, which had optional account, and was not received with enthusiasm, to say the least.

Coincidentally, Paul shared this paper
http://ceur-ws.org/Vol-614/owled2010_submission_29.pdf which introduces rules of the kind
X counts as Y in context C
which bears some resemblance to what I am trying to argue for.

So, my proposal is:
- drop hasProvenanceIn
- drop isTopicIn
- allow for the ternary form of alternate

Tim and Simon's approach of using two binary relations does not offer the same level of expressivity.
It also has a technological bias: it requires a querying/reasoning facility. Therefore,
their suggestion is not suitable for a data model supposed to be technology neutral.


A stab at:

bundle tool:analysis01
     alternate(tool:Bob2, ex:Bob,ex:run2)
endBundle

in PROV-O:

tool:analysis01 {
    tool:Bob2
       prov:alternateOf [  ## The use here of bnode is, for once, actually appropriate :-)
           a prov:Entity, prov:ContextualizedEntity;
           prov:identifier       ex:Bob;   ## The identifier that is used "over there"   Can't use dcterms:identifier b/c that is a rdfs:Literal.
           prov:inContext     ex:run2;   ## "over there"       Could prov:atLocation be reused?
       ];
}


Thanks for this, Tim.

First some questions:
- why a bnode here?

bnodes are read "the thing that" and _can_ serve as an existential.

- Can you explain the  dcterms:identifier comment?

1) The value is the identifier used in the other bundle.
2) The rdfs:range of dcterms:identifier is a literal "http://foo.com", but it is more useful if it is an rdfs:Resource <http://foo.com>. With the latter, we know that we can "try to go there" to dereference the URI.


Now, assuming that this rdf encoding expresses what was originally suggested, some further questions:
- have we indeed got a ternary alternateOf relation in prov-dm, as I suggested?

Perhaps. The original binary that we now know and love, and a second ternary that "wraps" a URI and a Bundle (that mentions the URI).
The only new things would be:

1) The two new predicates prov:identifier and prov:inContext (perhaps that should just be called prov:inBundle -- I was swayed too far towards DCTerms when I chose that this morning).
2) The new rule to unwrap your ternary DM into this RDF structure.


- or have we got some form of ternary relation isContextualizationOf(e2,e1,bundle)?

Or, just a binary isContextualized(e1,bundle)?

And we just stack on an existing alternateOf(e2,e1)...
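
For comparison, a rough Python sketch of the two options: a single ternary statement versus isContextualized(e1,bundle) stacked with alternateOf(e2,e1). The tuple layouts, the function names, and the extra ex:run1 statement are assumptions made up for this illustration, not anything defined in PROV-DM.

```python
def lookup_ternary(statements, e2):
    """Read (e1, bundle) pairs directly off ternary statements (e2, e1, bundle)."""
    return {(e1, bundle) for (x, e1, bundle) in statements if x == e2}


def lookup_binary(alternate_of, is_contextualized, e2):
    """Join alternateOf(e2, e1) with isContextualized(e1, bundle) on e1."""
    return {
        (e1, bundle)
        for (x, e1) in alternate_of
        if x == e2
        for (y, bundle) in is_contextualized
        if y == e1
    }


# Ternary encoding: one statement names the entity, its alternate, and the bundle.
ternary = [("tool:Bob2", "ex:Bob", "ex:run2")]

# Binary encoding: two separate relations. The ex:run1 entry is invented
# to show what happens when ex:Bob is contextualized in more than one bundle.
alternate_of = [("tool:Bob2", "ex:Bob")]
is_contextualized = [("ex:Bob", "ex:run1"), ("ex:Bob", "ex:run2")]

from_ternary = lookup_ternary(ternary, "tool:Bob2")
from_binary = lookup_binary(alternate_of, is_contextualized, "tool:Bob2")
```

In this toy data the ternary lookup yields only (ex:Bob, ex:run2), while the binary form needs a join and returns both bundles, so it cannot say on its own which bundle tool:Bob2 refers to.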


BTW, not really sure where we're going with this.
It feels like we're close to wrapping this up, but I'm worried that we're in some odd local minimum.

Regards,
Tim



Thanks,
Luc
Received on Wednesday, 6 June 2012 15:31:20 UTC
