Re: Contextualization ---> Optional bundle in Specialization

From: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
Date: Thu, 28 Jun 2012 13:28:18 +0100
Message-ID: <4FEC4DE2.1070608@zoo.ox.ac.uk>
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
CC: Provenance Working Group WG <public-prov-wg@w3.org>
On 28/06/2012 09:48, Luc Moreau wrote:
> Hi Graham,
> If provenance had been written as below, we wouldn't need
> contextualization for this rating example. ex:Bob would be the
> general entity, and tool:Bob-2011-11-16 and tool:Bob-2011-11-17 its
> specializations, for each activity involvement on different days.
> bundle ex:run1
> activity(ex:a1, 2011-11-16T16:00:00,2011-11-16T17:00:00) //duration: 1hour
> specializationOf(tool:Bob-2011-11-16, ex:Bob)
> wasAssociatedWith(ex:a1,*tool:Bob-2011-11-16*,[prov:role="controller"])
> endBundle
> bundle ex:run2
> activity(ex:a2, 2011-11-17T10:00:00,2011-11-17T17:00:00) //duration: 7hours
> specializationOf(tool:Bob-2011-11-17, ex:Bob)
> wasAssociatedWith(ex:a2,*tool:Bob-2011-11-17*,[prov:role="controller"])
> endBundle

I think this is fine, and as you say doesn't require contextualization.  The 
difference is that you are explicitly associating the different entities with 
something that is already in the domain of discourse, namely the activities.
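
To make that concrete, here is a rough Python sketch. It assumes the statements are flattened to plain (subject, predicate, object) triples, which is my own simplification and not PROV-N or any RDF library; the helper names are hypothetical. It compares what can be said about each specialization once its own name is abstracted away.

```python
# Hypothetical toy model: statements flattened to (subject, predicate, object) triples.
SPEC = "prov:specializationOf"
ASSOC = "prov:wasAssociatedWith"


def description(node, triples):
    """Everything asserted about `node`, with the node's own name replaced by '_'."""
    out = set()
    for s, p, o in triples:
        if s == node:
            out.add(("_", p, o))
        if o == node:
            out.add((s, p, "_"))
    return out


def distinguishable(a, b, triples):
    """True if the statements about `a` differ from those about `b`."""
    return description(a, triples) != description(b, triples)


# Specialization statements alone: both entities are just "a specialization of ex:Bob".
bare = {
    ("tool:Bob-2011-11-16", SPEC, "ex:Bob"),
    ("tool:Bob-2011-11-17", SPEC, "ex:Bob"),
}

# The same statements plus explicit links to the activities, as in the quoted example.
linked = bare | {
    ("ex:a1", ASSOC, "tool:Bob-2011-11-16"),
    ("ex:a2", ASSOC, "tool:Bob-2011-11-17"),
}

print(distinguishable("tool:Bob-2011-11-16", "tool:Bob-2011-11-17", bare))
print(distinguishable("tool:Bob-2011-11-16", "tool:Bob-2011-11-17", linked))
```

Under this toy reading, the two entities only come apart once each is tied to a distinct activity, because the activities are themselves named things in the domain of discourse.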

> It's because we allow identifiers to be reused, and we allow
> provenance to be "messy", that the following is accepted in PROV.
> bundle ex:run1
> activity(ex:a1, 2011-11-16T16:00:00,2011-11-16T17:00:00) //duration: 1hour
> wasAssociatedWith(ex:a1,*ex:Bob*,[prov:role="controller"])
> endBundle
> bundle ex:run2
> activity(ex:a2, 2011-11-17T10:00:00,2011-11-17T17:00:00) //duration: 7hours
> wasAssociatedWith(ex:a2,*ex:Bob*,[prov:role="controller"])
> endBundle

> If provenance is to be used online by applications to make decisions,
> I don't understand what the problem is with the following
> specializationOf(tool:Bob-2011-11-16, ex:Bob, ex:run1) // or
> contextualizationOf, or whatever name we want
> given that it could have been written in the first place if provenance
> had been more proper.

I think the same effect as the original example can be achieved from your 
"scruffy" case without contextualizationOf, thus:

bundle ex:tooleval
   specializationOf(tool:Bob-2011-11-16, ex:Bob)
   specializationOf(tool:Bob-2011-11-17, ex:Bob)
endBundle

This preserves the distinction between the two specializations that was present 
in the original, and still does not use contextualization.

The problem is trying to use the bundle to distinguish between the instances. 
That requires introducing a notion of context into the domain of discourse, 
which is where we start running into problems with RDF-compatible semantics.

I think we can get from your "scruffy" example to the above with something like 
a "lifting rule" (cf. Guha thesis) which would be applied *outside* the RDF 
semantics, to yield the expression which captures the intent within the confines 
of RDF semantics.  The contextualization and interpretation of "scruffy" 
provenance has always been outside the scope of the formal provenance semantics 
(articulated as constraints or OWL or other means) - that was the point: 
scruffy -> "here is what provenance data looks like";  constrained: "if 
provenance obeys these rules, you can reason with it in these ways".  The detail 
of how one gets from the scruffy to the formal is not something we've discussed 
(which I think is not unreasonable, though it would be nice to have a way to 
claim that a given bundle conforms to the constraints and formal theory, but 
that's another issue).
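
As a rough illustration of what such a lifting rule might look like, here is a Python sketch. Everything in it is an assumption made for the example: the dictionary layout for bundles, the function name `lift`, and the scheme for minting specialized identifiers (agent local name plus bundle local name under the tool: prefix). None of it is defined by PROV. The rule reads the bundle membership as input and emits only statements that mention no bundle.

```python
# Hypothetical sketch of a "lifting rule" applied outside the formal semantics.
from typing import Dict, List, Tuple

Association = Tuple[str, str]  # (activity, agent)
Statement = Tuple[str, str, str]  # (relation, first argument, second argument)


def lift(bundles: Dict[str, List[Association]]) -> List[Statement]:
    """Give each bundle's use of an agent its own specialized identifier.

    The bundle identifier is consumed here and does not appear in the output.
    """
    statements: List[Statement] = []
    for bundle_id, associations in sorted(bundles.items()):
        bundle_local = bundle_id.split(":", 1)[-1]
        for activity, agent in associations:
            agent_local = agent.split(":", 1)[-1]
            # Assumed naming scheme, chosen only for this illustration.
            specialized = f"tool:{agent_local}-{bundle_local}"
            statements.append(("specializationOf", specialized, agent))
            statements.append(("wasAssociatedWith", activity, specialized))
    return statements


# The "scruffy" input: the same agent identifier reused in both bundles.
scruffy = {
    "ex:run1": [("ex:a1", "ex:Bob")],
    "ex:run2": [("ex:a2", "ex:Bob")],
}

lifted = lift(scruffy)
for relation, first, second in lifted:
    print(f"{relation}({first}, {second})")
```

The output has the same shape as the non-scruffy example: each specialization is tied to an activity, and no statement needs a bundle argument.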

If the original non-scruffy example really addresses your use-case, then I claim 
we don't need contextualization and we can drop it and be done.

> Your comment that tool:Bob-2011-11-16 cannot be distinguished from
> tool:Bob-2011-11-17 would also
> apply to the more proper example.

I disagree: see above ("you are explicitly associating the different entities 
with something that is already in the domain of discourse").


> Regards,
> Luc
> On 06/27/2012 06:09 PM, Graham Klyne wrote:
>> On 27/06/2012 10:49, Luc Moreau wrote:
>> > All,
>> >
>> > At the face to face meeting, we have agreed to rename contextualization and
>> > mark this feature at risk. Tim, Stephan, Paul and I have worked out a
>> > solution that we now share with the working group.
>> I'm afraid I still have a problem with this.
>> Considering your bundle tool:analysis01:
>> [[
>> bundle tool:analysis01
>> agent(tool:Bob-2011-11-16, [perf:rating="good"])
>> specializationOf(tool:Bob-2011-11-16, ex:Bob, ex:run1)
>> agent(tool:Bob-2011-11-17, [perf:rating="bad"])
>> specializationOf(tool:Bob-2011-11-17, ex:Bob, ex:run2)
>> endBundle
>> ]]
>> The problem is that, if subject to RDF semantics for URI interpretation, I can
>> see no possible semantic distinction between
>> tool:Bob-2011-11-16
>> and
>> tool:Bob-2011-11-17
>> I.e. they are both specializations of ex:Bob, and that is all we can know
>> about them, as (by the nature of the semantics of URI interpretation) the
>> denotation of ex:Bob that appears in ex:run1 is the same as the denotation of
>> ex:Bob that appears in ex:run2.
>> ...
>> I do, however, have a different compromise that provides a hook for
>> introducing possible semantics later, or in private implementations, without
>> sneaking in something that could well turn out to be incompatible with, or
>> just different than, what the RDF group may do for semantics of datasets.
>> The hook is this: simply allow attributes for the specializationOf relation,
>> but don't define a specific attribute for bundle. This would allow you to do a
>> private implementation of the scheme you describe, but would not allow it to
>> be mistaken for something that has standardized semantics. As in:
>> specializationOf(tool:Bob-2011-11-17, ex:Bob,
>> [myprivateattribute:bundle=ex:run2])
>> ...
>> In case you think I'm jumping at shadows here, I'll note that RDF has been
>> here before. The original 1999 RDF specification described reification without
>> formal semantics. Reification was intended to allow for capturing this kind of
>> information - i.e. to make assertions about context of use, etc - a kind of
>> proto-provenance, if you like. But when the group came to define a formal
>> semantics for RDF, there were two possible, reasonable and semantically
>> incompatible approaches; looking at the way that reification was being used
>> "in the wild", it turned out that there was data out there that corresponded
>> to both of these (incompatible) approaches. This was in the very early days of
>> the semantic web, so the harm done was quite limited. I think a similar
>> mistake today would cause much greater harm.
>> I think the appropriate way forward is to take this tool performance analysis
>> use-case to the RDF-PROV coordination group, and ask that it be considered as
>> input when defining semantics for RDF datasets. I would expect that whatever
>> semantic structure they choose, it should be able to accommodate the use-case.
>> Then, we should be better placed to create an appropriate and compatible
>> contextualization semantics for provenance bundles. But until then, I think we
>> invite problems by trying to create a standardized data model structure
>> without standardized RDF-compatible semantics to accommodate this use-case.
>> #g
>> --
>> Tracker, this is ISSUE-385
>> On 27/06/2012 10:49, Luc Moreau wrote:
>>> All,
>>> At the face to face meeting, we have agreed to rename contextualization and
>>> mark this feature at risk. Tim, Stephan, Paul and I have worked out a
>>> solution that we now share with the working group.
>>> Given that contextualization was already defined as a kind of specialization,
>>> we now allow an optional bundle argument in the specialization relation.
>>> (Hence, no need to create a new concept!)
>>> See section 5.5.1 in the current Editor's draft
>>> http://dvcs.w3.org/hg/prov/raw-file/default/model/prov-dm.html#term-specialization
>>> Feedback welcome.
>>> Regards,
>>> Luc
>>> PS. Tracker, this is ISSUE-385
Received on Thursday, 28 June 2012 12:29:17 UTC
