Re: PROV-O telcon from Stian Soiland-Reyes on 2011-11-23 (public-prov-wg@w3.org from November 2011)

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Wed, 23 Nov 2011 10:00:34 +0000
To: Timothy Lebo <lebot@rpi.edu>
Cc: Luc Moreau <L.Moreau@ecs.soton.ac.uk>, Satya Sahoo <satya.sahoo@case.edu>, Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>, Daniel Garijo <dgarijov@gmail.com>, James Cheney <jcheney@inf.ed.ac.uk>, "Deborah L. McGuinness" <dlm@cs.rpi.edu>, Paolo Missier <Paolo.Missier@ncl.ac.uk>, Provenance Working Group WG <public-prov-wg@w3.org>
Message-ID: <CAPRnXtm24AV7BdRsYXokSo6iu98-uKjjuLM272X647DD4_uW2w@mail.gmail.com>

On Tue, Nov 22, 2011 at 15:24, Timothy Lebo <lebot@rpi.edu> wrote:

> Don't tell anybody, but prov:entity (now called prov:qualifiedEntity) is used just like rdf:object.
> So, the pattern that QualifiedInvolvements (shhhhhh; an rdf:Statement) uses would make prov:qualifiedEntity point to the earlier Entity.

Yes, but with a stronger interpretation. Making reified statements
does not normally also entail those statements - but we would ideally
want our QualifiedInvolvements to also entail the "basic" involvement
like prov:used.
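
A minimal sketch of that stronger interpretation, in Python with plain
tuples instead of a real RDF library. Each qualified involvement is
expanded into the "basic" triple it would ideally entail. The property
pairings in QUALIFIED_TO_BASIC, the prov:hadQualifiedEntity spelling and
the entail_basic helper are assumptions made for illustration; they are
not taken from any PROV-O draft.

```python
# Hypothetical mapping from a qualifying property to the basic property
# it should entail (names follow the pattern used in this thread).
QUALIFIED_TO_BASIC = {
    "prov:hadQualifiedUsage": "prov:used",
    "prov:hadQualifiedControl": "prov:wasControlledBy",
}


def entail_basic(triples):
    """Return the input triples plus the basic involvements they entail."""
    triples = set(triples)
    # Index: qualified-involvement node -> the entity it points at.
    entity_of = {s: o for (s, p, o) in triples
                 if p == "prov:hadQualifiedEntity"}
    inferred = set()
    for (s, p, o) in triples:
        basic = QUALIFIED_TO_BASIC.get(p)
        if basic is not None and o in entity_of:
            inferred.add((s, basic, entity_of[o]))
    return triples | inferred


graph = {
    (":activity1", "prov:hadQualifiedUsage", "_:u1"),
    ("_:u1", "prov:hadQualifiedEntity", ":entity1"),
    ("_:u1", "prov:hadRole", ":input"),
}
closed = entail_basic(graph)
```

A plain rdf:Statement would leave the graph unchanged; here the closure
additionally contains (":activity1", "prov:used", ":entity1").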

> I'm still chewing on Paul's example, which was left open [1]. I'm trying to reconcile the continuum of elaboration while letting them meet in the middle and avoiding proliferation of constructs.

Associating time with derivation is a bit unclear - it would kind of
imply a (potential sub-) process execution with usage and generation
and would typically be a time:Interval, not time:Instant. (unless the
entities are used at the same time as the derived entity was
generated).

As an entity can be derived from several entities, and each of these
"derivation events" can have a different time associated with it, it
becomes a bit tricky to understand what it means.

If (imagined ASN):

wasDerivedFrom(A, B, start=t1, end=t3)
wasDerivedFrom(A, C, start=t2, end=t4)

Then what does it tell us? That A "was made derived" from B between t1
and t3, and from C between t2 and t4? But we've agreed that A can only
exist once that particular characterisation is complete and immutable.
So if B contributes attrib2 to A, and C attrib3, then we can't have
the A entity until both attributes have been provided.

The above perhaps implies:

wasGeneratedBy(A, x, t3)
wasGeneratedBy(A, y, t4)
# hence t3==t4 and x==y
used(x, B, t1)
used(y, C, t2)
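
A rough sketch of the "hence t3==t4" constraint, assuming (as in the
reading above) that the end of each derivation interval is the moment
the derived entity is generated. The tuple layout and the
generation_times helper are hypothetical, and integers stand in for
t1..t4.

```python
from collections import defaultdict


def generation_times(derivations):
    """Map each derived entity to its single generation time.

    derivations: iterable of (derived, source, start, end) tuples.
    Raises ValueError if one entity would be generated at two
    different times, i.e. if its derivation intervals end differently.
    """
    ends = defaultdict(set)
    for derived, _source, _start, end in derivations:
        ends[derived].add(end)
    result = {}
    for entity, times in ends.items():
        if len(times) != 1:
            raise ValueError(
                f"{entity} would be generated at {sorted(times)}")
        result[entity] = next(iter(times))
    return result


# Both derivations of A end at the same time: acceptable.
consistent = [("A", "B", 1, 4), ("A", "C", 2, 4)]
# The two derivations of A end at t3 and t4: rejected.
conflicting = [("A", "B", 1, 3), ("A", "C", 2, 4)]
```

Calling generation_times(conflicting) raises ValueError, which is the
situation the two wasDerivedFrom statements above describe when t3 and
t4 differ.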

But is this useful? It still reads wrong,

wasDerivedFrom(x, y, 18:15, 18:20) -->
  "x was derived from y from 18:15 to 18:20"

but that's not true. x continued to be derived from y after 18:20.

>>    Why is it required here, since it seems to just link entity and pe. (no time here, for instance)
> This affords the same ability as the others -- to allow third-party qualification about the relationship.
> Time is just one of the specified qualifiers that an asserter can use.

For most of the involvements it would probably be sensible to associate
a time:Interval (i.e. a duration) with the usage/control/participation -
which I believe the DM should also allow.

( Note: PROV-O already allows this as hadTemporalValue takes
time:TemporalEntity which is either an instant or an interval -
although I would prefer to define ProcessExecution and Entity *as* a
time:TemporalEntity - then we can talk about one PE preceding
another, etc.)

Example of how we could express control with intervals in PROV-ASN:

wasControlledBy(activity1, agent1, [prov:role="constructor",
prov:start="18:20", prov:end="18:40"])
wasControlledBy(activity1, agent1, [prov:role="monitor",
prov:start="18:50", prov:end="18:55"])

(He was both a constructor and a monitor - but not at the same time,
thus he was not monitoring himself.)
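
A small sketch of why the intervals matter here, using only the Python
standard library. The Control record and the overlaps helper are
hypothetical names; the times are the ones from the ASN example above,
and the intervals are treated as closed.

```python
from datetime import time
from typing import NamedTuple


class Control(NamedTuple):
    activity: str
    agent: str
    role: str
    start: time
    end: time


def overlaps(a, b):
    """True if the two closed intervals share at least one instant."""
    return a.start <= b.end and b.start <= a.end


constructor = Control("activity1", "agent1", "constructor",
                      time(18, 20), time(18, 40))
monitor = Control("activity1", "agent1", "monitor",
                  time(18, 50), time(18, 55))

# The agent would only be "monitoring himself" if the two roles overlapped.
self_monitoring = overlaps(constructor, monitor)
```

Without the start and end attributes the two wasControlledBy statements
would be indistinguishable on this point.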

In PROV-O we can do this now as:

:activity1 prov:hadQualifiedControl [
  prov:hadRole :constructor ;
  prov:hadQualifiedEntity :agent1 ;
  prov:hadTemporalValue [
    time:hasBeginning [ time:hour 18; time:minute 20 ] ;
    time:hasEnd [ time:hour 18; time:minute 40 ]
  ]
] , [
  prov:hadRole :monitor ;
  prov:hadQualifiedEntity :agent1 ;
  prov:hadTemporalValue [
    time:hasBeginning [ time:hour 18; time:minute 50 ] ;
    time:hasEnd [ time:hour 18; time:minute 55 ]
  ]
] .

> With respect to what falls from this pattern, a qualified complementOf found its way as a stub but was not elaborated.
> Probably needs motivation to warrant the work.

Ugh..

> So, the only thing "odd", is that to GET to it from the generated Entity, you need to follow the prov:wasGeneratedBy to the Activity and ask the activity for how the entity was contextualized.
>
> Fortunately, I think this pattern suits the nature of the problem. The entity in isolation doesn't know how it was contextualized; the Activity inherently provides that contextualization.

I think it makes perfect sense as it stands. A Generation is the very
first event an entity can be involved with. It still points "to the
past" but with distance 0.

(So if there was an interval associated with a Generation, the
start-point of this would be when the entity was generated, and when
its characterisation was true. Interpretation of what the activity
did with the Generation after that start point depends on the role and
activity, perhaps delivery/transmitting/saving the entity, etc.)

But.. now all *involvements* are in the future from the PE, and in the
future from the entity.

We would have to reverse all of the QualifiedInvolvement links to make
them go to the past:

:entity prov:wasInvolvedIn [
  a prov:QualifiedUsage ;
  prov:hadRole :someRole ;
  prov:involvedBy :someProcess
] .

:someProcess prov:used :entity .

There's no perfect answer. Even in ":someProcess prov:used :entity" we
don't know which of the two existed first.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Wednesday, 23 November 2011 10:01:27 UTC