Re: smaller example from Simon Miles on 2011-06-19 (public-prov-wg@w3.org from June 2011)

From: Simon Miles <simon.miles@kcl.ac.uk>
Date: Sun, 19 Jun 2011 11:57:42 +0100
To: Provenance Working Group WG <public-prov-wg@w3.org>
Message-ID: <BANLkTikzRqdw-1juOAO4oOSBdGk6O4vFag@mail.gmail.com>

Jim,

I agree that querying without knowing the type of thing you are
querying about would be an unusual use case. I was more thinking of:

1. How do we explain what you might ask or assert the provenance of?
In other words, what kind of thing do we have an "Oh yeah?" button on?

2. If it is useful to distinguish agents, executions and entities, but
these classes are for modelling convenience not because they are
absolutely mutually exclusive, then how do we explain what kinds of
things might be in one or more of those classes?

3. I think it is preferable to avoid mixing up what is represented
with how it is represented where possible, so I'd like to talk about
provenance without using graph terminology. For instance, sometimes in
OPM documents the term 'node' is used to generalise from 'artifact',
'agent' and 'process', and I think this confuses representation and
represented.

I don't have a reason for believing the generalised concept, X, would
need to be in any given serialisation of PIL or even referred to in
any query, but I'd argue that having X as part of the model makes the
rest of the model, and its use, become more explicable.
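
As a rough illustration of points 1 and 2 (a sketch only, not a proposed
serialisation), X can be pictured as a single general type whose instances
carry zero or more non-exclusive categories. The names `Occurrence`,
`categories`, `is_a` and `provenance_targets` below are hypothetical, chosen
just for this example:

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    """Hypothetical stand-in for the generalised concept X:
    anything whose provenance can be asserted or queried."""
    identifier: str
    # Categories are a modelling convenience and are NOT mutually exclusive.
    categories: set = field(default_factory=set)

    def is_a(self, category: str) -> bool:
        return category in self.categories


def provenance_targets(occurrences):
    """Everything that would get an "Oh yeah?" button: every X,
    whatever categories it has been given."""
    return [o.identifier for o in occurrences]


# An execution we want the provenance of (the download action).
download = Occurrence("download-action-1", {"execution"})
# Something modelled both as an entity and as an agent.
oven = Occurrence("oven-1", {"entity", "agent"})

targets = provenance_targets([download, oven])
```

Here both `download` and `oven` are valid subjects of a provenance query,
without any query or serialisation having to mention X itself.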

Thanks,
Simon

On 17 June 2011 13:56, Myers, Jim <MYERSJ4@rpi.edu> wrote:
> Simon,
> I'm not sure I see the issue given the example queries. I agree that we should be able to query the overall graph in any way we want, e.g. find me a process execution that was controlled by Simon and used recipe X. I guess my question is whether process execution has to be part of a superclass to do that. I guess if you really want to ask questions like 'find me anything with dc:creator Simon' and expect to get both data and any process executions you were the agent for, a superclass would be needed (and we'd have to agree that dc:creator was a good map to our agent-controls-process semantics, and more generally, that there are enough properties we want to add to both data and processes to make the superclass worthwhile). Do we have this case, or do we just need to make sure you can query for process executions directly?
>
>  Jim
>
>> -----Original Message-----
>> From: public-prov-wg-request@w3.org [mailto:public-prov-wg-request@w3.org] On Behalf Of Simon Miles
>> Sent: Friday, June 17, 2011 6:53 AM
>> To: Provenance Working Group WG
>> Subject: Re: smaller example
>>
>> Hi Jim,
>>
>> I think I agree with your general argument against unification, and I'm also
>> aware that within the limited time of the WG, it is preferable to err towards the
>> set of concepts originally envisaged unless we find that they really get in the way
>> of expressing what needs to be expressed (as we had with the exclusion of a
>> "invariant view of" relation).
>>
>> But I do think it would be genuinely useful to allow us to assert/query the
>> provenance of executions, not just data/objects/entities. For example, I would
>> like to ask "What led to this download action being performed?", separately
>> from asking "What is the history of the downloaded data?", e.g. I believe a
>> trojan horse might be downloading things behind my back, and the downloaded
>> data has since been deleted but a log of the download action allows me to refer
>> to it.
>>
>> I wonder if we could not have the following compromise (which may be what
>> you were implying):
>>
>> We still have "executions" controlled by "agents" using and generating
>> "entities", as stated in the charter. But we also have a more general concept, X,
>> which is a generalisation of "execution", "agent" and "entity", and any other
>> category of X common enough to be worth giving a name to. To account for the
>> overlap you describe, the categories above do not have to be mutually exclusive
>> (e.g. something could be an execution and an agent if one chooses to model it as
>> such).
>>
>> X would then correspond to the "thing" in the IVP definition, i.e. it is something
>> that we could assert the provenance of. As a name for X, I suggest the term
>> "thing" is not ideal, as it implies an entity not a process execution, clashes
>> with owl:Thing, and seems more general than we want. I personally like the
>> term "occurrence".
>>
>> Does this make sense, or is it overcomplicating matters?
>>
>> Thanks,
>> Simon
>>
>> On 16 June 2011 19:06, Myers, Jim <MYERSJ4@rpi.edu> wrote:
>> >
>> >> Just to note, wouldn't a process execution fit the definition of
>> >> 'thing'? A process execution can have an identity, is invariant at
>> >> least with regards to that identity (and maybe other things like its
>> >> configuration or location), and is clearly mutable in other regards.
>> >> I've no problem with a process execution being a thing if that is
>> >> intended (a process execution does have a provenance), but it might
>> >> have implications for which term we use in place of 'thing'.
>> >
>> > I think this is closely tied to the definition of agency - we're
>> > saying clouds participate in/control/affect  a storm rather than
>> > clouds/storm are a mutable thing/process execution. My guess is that
>> > the latter model could be made to work, but will be harder to explain
>> > and while it probably would be more powerful/remove some paradoxes of
>> > agents, I'm not sure it's worth it. I'm acutely aware that this is a
>> > slippery slope if we accept immutable and mutable thing to not be
>> > different classes - it is also more complex and more powerful than a
>> > two class model, so why not go the next step? The only difference is I
>> > think we have more use cases where being able to integrate the
>> > immutable/mutable perspectives is useful. My guess is that if we
>> > formulated some of the things we really want to say about agents
>> > (human agent and oven-as-agent), unifying thing and process execution
>> > might help, but I haven't thought it through and doubt I/we can
>> > explain it well enough to justify it as the PIL model. (Said
>> > differently, I see that we'll be dealing with ontologies for things
>> > like files that consider them mutable and immutable - I'm not sure
>> > there are many ontologies we're going to hit where one considers
>> > something a thing and the other considers it a process. Perhaps that's
>> > incorrect...)
>> >
>> > Fascinating, brain bending stuff...
>> > Cheers,
>> >  Jim
>> >>
>> >> Thanks,
>> >> Simon
>> >>
>> >> On 14 June 2011 10:53, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>> >> > Hi Simon,
>> >> >
>> >> > I think we concur. I have adapted the example taking into account
>> >> > the terminology we defined with Jim yesterday. [1]
>> >> >
>> >> > It would be nice to get feedback from the working group, since we
>> >> > may want to reach agreement on Thursday.
>> >> >
>> >> > [1] http://www.w3.org/2011/prov/wiki/ConceptInvariantViewOnThing#Definition_by_Jim_and_Luc_v2_.28in_progress.29
>> >> >
>> >> > Cheers,
>> >> > Luc
>> >> >
>> >> > On 06/13/2011 02:51 PM, Simon Miles wrote:
>> >> >> Luc,
>> >> >>
>> >> >> I think the example is helpful, and the discussion at
>> >> >> the end suggests that "invariant view or perspective on a thing"
>> >> >> is not quite right. All of i0, i1, i2, i3, i4 and i5 are more
>> >> >> obviously things than views: a file, or a file with some content,
>> >> >> or the content of a message.
>> >> >>
>> >> >> Instead, I suggest we mean "thing which is invariant from some
>> >> >> perspective", i.e. what we are talking about when referring to
>> >> >> i0-i5 is the thing, not the view.
>> >> >>
>> >> >> They are all invariant in some way. For i1 to i5 they are
>> >> >> invariant from the perspective of their content, at the very least.
>> >> >> For i0, it is invariant from the perspective of its identity, i.e.
>> >> >> the reason why we talk about i0 as a thing at all is that it is
>> >> >> consistently
>> >> >> (invariantly) considered the same file even if its contents are
>> >> >> changed.
>> >> >>
>> >> >> I suggest i0 can be included in the Mapping as follows:
>> >> >> "We have some Abstractions I ->  I:
>> >> >> i1 ->  i0
>> >> >> i2 ->  i0
>> >> >> i3 ->  i0"
>> >> >> (meaning the abstraction of i1 is i0 etc.) Jim used the term
>> >> >> "abstraction" in his proposal for "resource" definition, but other
>> >> >> terms may be as good.
>> >> >>
>> >> >> Thanks,
>> >> >> Simon
>> >> >>
>> >> >> On 13 June 2011 10:37, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>> >> >>
>> >> >>> Dear all,
>> >> >>>
>> >> >>> PROV-ISSUE-1
>> >> >>> PROV-ISSUE-8
>> >> >>> PROV-ISSUE-19
>> >> >>>
>> >> >>> On June 7th [1], we agreed on "In a first instance, to define the
>> >> >>> necessary concepts that allow us  to express the provenance of an
>> >> >>> invariant view or perspective on a thing".
>> >> >>> Putting this in practice turns out to be difficult.
>> >> >>>
>> >> >>> While the egg example is interesting, the scenario seems to
>> >> >>> evolve all the time. Also, I thought that, in a first instance,
>> >> >>> we may want to look at things that are digital, before seeing how
>> >> >>> our ideas extend to the non-digital world.
>> >> >>>
>> >> >>> Obviously, we have our data journalism example, but we seem to
>> >> >>> ignore it. I think that we ignore it because:
>> >> >>> - it does not focus on changing things
>> >> >>> - it is not precise about how information is published/accessed
>> >> >>> - it is quite long
>> >> >>> (I liked what Simon proposed for this example [2] and this
>> >> >>> inspired me here)
>> >> >>>
>> >> >>>
>> >> >>> To unblock the situation, I have:
>> >> >>> - drafted a smaller example [3], focusing on a file being updated
>> >> >>> - tried to illustrate examples of IVPTs in this example
>> >> >>> - highlighted an example of IVPT that I don't know how to handle.
>> >> >>>
>> >> >>> In this example, it would be good to see
>> >> >>> - where we have consensus
>> >> >>> - where we have disagreement
>> >> >>> - how we handle the outstanding example (i0) of IVPT
>> >> >>>
>> >> >>> Feedback by email or on wiki welcome!
>> >> >>>
>> >> >>> Cheers,
>> >> >>> Luc
>> >> >>>
>> >> >>>
>> >> >>> [1] http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0096.html
>> >> >>> [2] http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0069.html
>> >> >>> [3] http://www.w3.org/2011/prov/wiki/FileExample
>> >> >>>
>> >> >>> --
>> >> >>> Professor Luc Moreau
>> >> >>> Electronics and Computer Science   tel:   +44 23 8059 4487
>> >> >>> University of Southampton          fax:   +44 23 8059 2865
>> >> >>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>> >> >>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> ______________________________________________________________________
>> >> >>> This email has been scanned by the MessageLabs Email Security System.
>> >> >>> For more information please visit http://www.messagelabs.com/email
>> >> >>> ______________________________________________________________________
>> >> >>>
>> >> >>>
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >> > --
>> >> > Professor Luc Moreau
>> >> > Electronics and Computer Science   tel:   +44 23 8059 4487
>> >> > University of Southampton          fax:   +44 23 8059 2865
>> >> > Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>> >> > United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Dr Simon Miles
>> >> Lecturer, Department of Informatics
>> >> Kings College London, WC2R 2LS, UK
>> >> +44 (0)20 7848 1166
>> >
>> >
>> >
>> >
>>
>>
>>
>> --
>> Dr Simon Miles
>> Lecturer, Department of Informatics
>> Kings College London, WC2R 2LS, UK
>> +44 (0)20 7848 1166
>
>
>



-- 
Dr Simon Miles
Lecturer, Department of Informatics
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166
Received on Sunday, 19 June 2011 10:58:14 UTC