Re: Suggested Concepts for Charter from Paulo Pinheiro da Silva on 2010-11-25 (public-xg-prov@w3.org from November 2010)

From: Paulo Pinheiro da Silva <paulo@utep.edu>
Date: Wed, 24 Nov 2010 20:16:27 -0700
To: "<public-xg-prov@w3.org>" <public-xg-prov@w3.org>
Message-ID: <4CEDD50B.2050506@utep.edu>
Dear All,

I added a list of PML concepts to the Wiki and moved Jim’s comments 
in-line to the list of concepts. Please note that this list is 
restricted to PML concepts that do not overlap with existing OPM 
concepts. To keep the list small, I did not include the properties of 
these concepts and some of their specializations.

pmlp:IdentifiedThing: The abstract root of provenance-related concepts.
   It organizes a collection of common metadata about the referenced
   object, and it does not have any instances.

   - pmlp:InferenceRule: It is the recipe of a process. We can say
     that it is the rule applied on the input information of a process
     execution and used to derive the product (or conclusion) of the
     process execution. In the Cake scenario, it is the recipe for Bake.

      * pmlp:DeclarativeRule: It is an inference rule (or recipe) that
        describes the logic of the transformation of input data into a
        product without specifying how the transformation occurs. It is
        often used for representing formal inference rules, including
        deductive and inductive rules.

      * pmlp:MethodRule: It is an inference rule that describes how a
        product is derived from input information (e.g., an algorithm
        that describes how its result is derived from the algorithm’s
        arguments). This kind of inference rule is also used to
        represent named recipes where the exact way input information is
        transformed is unknown (e.g., “black boxes”).

   - pmlp:Source (it is a generalization of opm:Agent): It is an
     identified thing from which we obtain information.

      * pmlp:Agent: An actionable entity capable of asserting information.

      * pmlp:Document: A physical information container that is not
        actionable. Documents function like databases.

      * pmlp:DocumentFragment: A fragment of a document that can be used
        as a source.

   - pmlp:SourceUsage: It is the connection between a source (i.e., a
     mutable identified thing such as an agent or document) and
     information (i.e., immutable things) obtained from the source
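
A minimal sketch of how the hierarchy above could be mirrored as Python
classes, purely for illustration and not part of PML itself. The class
names follow the pmlp: concepts listed above; the attribute names
(identifier, document, source, information) and the sample values are
assumptions, since the properties of these concepts are deliberately left
out of this list.

```python
from dataclasses import dataclass


@dataclass
class IdentifiedThing:
    """Abstract root: holds common metadata and has no direct instances."""
    identifier: str  # hypothetical attribute, not a PML property

    def __post_init__(self):
        # Enforce "does not have any instances" for the root itself.
        if type(self) is IdentifiedThing:
            raise TypeError("IdentifiedThing is abstract")


@dataclass
class InferenceRule(IdentifiedThing):
    """The recipe of a process (in the Cake scenario, the recipe for Bake)."""


@dataclass
class DeclarativeRule(InferenceRule):
    """Describes what the transformation is, not how it occurs."""


@dataclass
class MethodRule(InferenceRule):
    """Describes how a product is derived; also covers named black boxes."""


@dataclass
class Source(IdentifiedThing):
    """An identified thing from which information is obtained."""


@dataclass
class Agent(Source):
    """An actionable entity capable of asserting information."""


@dataclass
class Document(Source):
    """A non-actionable information container."""


@dataclass
class DocumentFragment(Source):
    """A fragment of a document that can itself be used as a source."""
    document: Document  # hypothetical link back to the enclosing document


@dataclass
class SourceUsage(IdentifiedThing):
    """Connects a (mutable) source to the (immutable) information obtained."""
    source: Source
    information: str  # hypothetical: the information as plain text


# Hypothetical usage: information obtained from a cookbook document.
cookbook = Document("doc:cookbook")
usage = SourceUsage("usage:1", source=cookbook, information="recipe text")
```

Plain (non-frozen) dataclasses are used here because sources are described
above as mutable; whether information should be modeled as an immutable
type is left open.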

JustificationElement:

   - pmlp:NodeSet: The justification collection for a resource is a
     directed acyclic graph of node sets connected by inference steps.
     Each node set has a conclusion and any number of inference steps
     including zero. We have speculated whether opm:Account is a
     mechanism for alternative or complementary provenance, which is why
     we keep this concept in the list.

   - pmlp:Query: It is a formal representation of a user's question.
     For example, the interest of a customer in a cake was triggered by
     the following request from the customer: “What are the desserts
     available today in your restaurant?”
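
The NodeSet description above (a directed acyclic graph of node sets
connected by inference steps) can be sketched as follows. This is only an
illustration under stated assumptions: the InferenceStep class, the field
names (rule, antecedents, conclusion, steps) and the is_acyclic helper are
hypothetical and are not taken from the PML specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InferenceStep:
    rule: str  # hypothetical: name of the inference rule applied
    antecedents: list[NodeSet] = field(default_factory=list)


@dataclass
class NodeSet:
    conclusion: str
    # Zero or more inference steps; zero means the conclusion is asserted.
    steps: list[InferenceStep] = field(default_factory=list)


def is_acyclic(root: NodeSet) -> bool:
    """Depth-first check that no node set is reachable from itself."""
    visiting: set[int] = set()
    done: set[int] = set()

    def visit(node: NodeSet) -> bool:
        if id(node) in done:
            return True
        if id(node) in visiting:
            return False  # back edge: the justification loops on itself
        visiting.add(id(node))
        ok = all(visit(a) for step in node.steps for a in step.antecedents)
        visiting.discard(id(node))
        done.add(id(node))
        return ok

    return visit(root)


# Hypothetical Cake-scenario justification: the cake is concluded from the
# ingredients by applying the Bake rule.
ingredients = NodeSet("ingredients are available")
cake = NodeSet("cake is available", [InferenceStep("Bake", [ingredients])])
```

A node set with several inference steps would represent alternative
justifications for the same conclusion, which is the point of comparison
with opm:Account raised above.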

Many thanks,
Paulo.

> Dear all,
>
> I just added the concepts from the Provenance Vocabulary on the wiki
> page [1].
>
> * prv:Actor - It is broader than opm:Agent. Each opm:Agent is directly
> related to a process (OPM defines opm:Agent as "a catalyst of a
> process"). A prv:Actor can be basically any active entity. This includes
> entities that are directly involved in the processes described (as
> represented by opm:Agent) but also entities that are not directly
> involved (e.g. the person who maintains the Web server that served a
> prv:DataItem in a prv:DataAccess execution).
>
> * prv:involvedActor - prv:involvedActor refers to active entities that
> were somehow involved in the execution of a process. It is broader than
> opm:wasControlledBy because this involvement does not necessarily mean
> that the referent was responsible for controlling the execution.
>
> * prv:containedBy - refers to a data item that contained the described data item.
>
> * prv:operatedBy - refers to a human actor who was operating a non-human
> actor at the time the provenance description refers to. OPM does not
> have any properties between opm:Agents.
>
> * prv:usedBy - refers to a data publisher (a human actor) who used a
> data-providing service (a non-human actor) at the time the provenance
> description refers to. Again, OPM does not have properties between
> opm:Agents.
>
> [1]
> http://www.w3.org/2005/Incubator/prov/wiki/Proposal_for_a_Working_Group_on_Provenance#The_Provenance_Vocabulary
>
>
> cheers,
>
> Jun
>
>
> Paul Groth wrote:
>> For the grouping I was just thinking of putting everything with the same concept together. E.g. provenier:haspart and dc:haspart
>>
>>
>> Paul
>>
>> Sent from my iPhone
>>
>> On Nov 24, 2010, at 0:05, Paulo Pinheiro da Silva<paulo@utep.edu>  wrote:
>>
>>> Hi All,
>>>
>>> I see that Jim added some PML concepts to the list of suggested concepts along with some comments -- thank you a lot Jim.
>>>
>>> Considering Paul's suggestion of grouping the suggested concepts for the charter, I would like to know the group's opinion about implementing a minimal grouping of the concepts into "provenance data" and "provenance metadata." Please note that the group has already discussed the relevance of these two categories during one of our meetings.
>>>
>>> Many thanks,
>>> Paulo.
>>>
>>>> Most of the concepts seem reasonable to me. I think some overlap more or
>>>> less with dublin core and opm. Hopefully we can pull these together in
>>>> groupings.
>>>
>>>>
>>>> Sent from my iPad
>>>>
>>>> On Nov 23, 2010, at 8:34 PM, Satya Sahoo<sahoo.2@wright.edu
>>>> <mailto:sahoo.2@wright.edu>>  wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>>
>>>>> The following is a list of suggested terms from the Provenir ontology
>>>>> for submission with WG charter. I have also added the concepts to the
>>>>> wiki.
>>>>>
>>>>>
>>>>> Any feedback is welcome.
>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> Best,
>>>>> Satya
>>>>>
>>>>>
>>>>> 1. provenir:part_of
>>>>> Definition: This property is used to represent parthood relation
>>>>> between entities (both class and instance-level).
>>>>> Example: A mass analyzer is part of a mass spectrometer
>>>>>
>>>>>
>>>>> 2. provenir:contained_in
>>>>> Definition: This property is used to represent containment relation
>>>>> between entities.
>>>>> Example: A temperature sensor is contained in an ocean buoy.
>>>>>
>>>>>
>>>>> 3. provenir:adjacent_to
>>>>> Definition: Spatial proximity is represented by this property. It is
>>>>> defined only for agent class, where the adjacent spatial location of
>>>>> individuals of agent class may have an effect on data values.
>>>>> Example: Quality of observations made by a sensor may be affected if
>>>>> it is adjacent to a sensor generating a magnetic field.
>>>>>
>>>>>
>>>>> 4. provenir:transformation_of
>>>>> Definition: This property is similar to the ro:transformation_of
>>>>> property that is asserted between two entities that preserve their
>>>>> identity between the two transformation stages.
>>>>> Example: A cancer cell is a transformation of a normal cell
>>>>>
>>>>>
>>>>> 5. provenir:preceded_by
>>>>> Definition: This property is used to define a temporal ordering of
>>>>> processes, which may or may not be linked by a common artifact.
>>>>> Example: Example from RO, aging preceded by development.
>>>>>
>>>>>
>>>>> 6. provenir:located_in
>>>>> Definition: An instance of data or agent is associated with exactly
>>>>> one spatial region that is its exact location at a given instance of time.
>>>>> Example: A sensor is located in a specific geospatial region at time
>>>>> instance t
>>>>>
>>>>>
>>>>> 7. provenir:has_temporal_value
>>>>> Definition: This property is used to explicitly associate temporal
>>>>> value with individuals of Provenir classes.
>>>>> Example: duration of a liquid chromatography process has temporal
>>>>> value 20 minutes.
>>>>>
>>>>>
>>>>> 8. provenir:preceded_by *
>>>>> Definition: Defines a temporal (and causal or non-causal) property for
>>>>> distinct instances of provenir:process.
>>>>> Example: A researcher starts a process to send email about the status
>>>>> of a (long-running) experiment process. The notification process is
>>>>> preceded by the experiment process.
>>>>>
>>>>>
>>>>> 9. provenir:has_participant @
>>>>> Definition: Property linking data to process, where the individual of
>>>>> data class participates in a process.
>>>>> Example: Trypsin enzyme (used to digest protein sample) participates
>>>>> in a proteome analysis experiment
>>>>>
>>>>>
>>>>> 10. provenir:derives_from $
>>>>> Definition: Property represents the derivation history of data
>>>>> entities as a chain or pathway.
>>>>> Example: The average rainfall (specific to geospatial-temporal
>>>>> instance) is derived from sensor readings.
>>>>>
>>>>>
>>>>> 11. provenir:temporal_parameter &
>>>>> Definition: This class captures the temporal details associated with
>>>>> individuals of provenir:data_collection, provenir:process, and
>>>>> provenir:agent.
>>>>> Example: The timestamp associated with a sensor reading
>>>>> Example: The duration of a protein analysis process
>>>>> Example: The time period during which a sensor was working correctly
>>>>>
>>>>>
>>>>> 12. provenir:spatial_parameter
>>>>> Definition: The spatial metadata associated with instances of
>>>>> provenir:process or provenir:agent or provenir:data_collection classes
>>>>> is represented by this class.
>>>>> Example: The geographical location of an ocean buoy is an example of
>>>>> spatial parameter.
>>>>>
>>>>>
>>>>> *Notes*:
>>>>> * Unlike opm:wasTriggeredBy, provenir:preceded_by property links
>>>>> processes that may or may not be causally dependent.
>>>>> @ Unlike opm:used, provenir:has_participant may or may not represent
>>>>> an existential relationship between the provenir:data and
>>>>> provenir:process, in other words the provenir:process may or may not
>>>>> require the existence of the provenir:data to initiate/terminate.
>>>>> $ Unlike opm:wasDerivedFrom, provenir:derives_from may or may not
>>>>> represent an existential relationship between entities.
>>>>> &  Extensions of the Provenir ontology, such as the Janus ontology for
>>>>> Taverna, and Parasite Experiment ontology for biomedicine, use the
>>>>> OWL:Time ontology terms to represent temporal notions.
>>>>>
>>>>>
>>>>> The following Provenir terms were approximately mapped to OPM terms during
>>>>> the mapping exercise, but often represented broader notions of
>>>>> provenance (see the mapping wiki for details). These terms need to be
>>>>> considered during the refinement of the corresponding OPM terms:
>>>>> 1. provenir:data
>>>>> Definition: This class models BFO continuant entities that represent
>>>>> the starting material, intermediate material, end products of a
>>>>> scientific experiment, and parameters that affect the execution of a
>>>>> scientific process. Data inherit the properties of continuants such as
>>>>> enduring or existing while undergoing changes.
>>>>> Example: A protein sample, digested with trypsin proteolytic enzyme,
>>>>> used as input in a proteome analysis experiment.
>>>>>
>>>>>
>>>>> 2. provenir:process
>>>>> Definition: This class models the occurrent entities that affect
>>>>> (process, modify, create, delete among other dynamic activities)
>>>>> individuals of data.
>>>>> Example: The proteome analysis experiment is a process, and its
>>>>> constituent steps are also processes.
>>>>>
>>>>>
>>>>> 3. provenir:agent
>>>>> Definition: This class models the continuant entities that causally
>>>>> affect the individuals of process.
>>>>> Example: The researcher performing the proteome analysis experiment
>>>>> and microarray instrument used in the experiment are agents.
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Paul Groth<pgroth@gmail.com<mailto:pgroth@gmail.com>>
>>>>> Date: Monday, November 22, 2010 4:43 pm
>>>>> Subject: Suggested Concepts for Charter
>>>>> To: "<public-xg-prov@w3.org<mailto:public-xg-prov@w3.org>>"
>>>>> <public-xg-prov@w3.org<mailto:public-xg-prov@w3.org>>
>>>>> Cc: Luc Moreau<L.Moreau@ecs.soton.ac.uk
>>>>> <mailto:L.Moreau@ecs.soton.ac.uk>>
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> As we discussed on the call from Friday last week, below is the
>>>>>> list of
>>>>>> core concepts from OPM that we think should be in the list that
>>>>>> goes
>>>>>> with the charter.
>>>>>>
>>>>>> I actually think there is quite a bit of overlap with the
>>>>>> suggested
>>>>>> concepts from Jim McCusker. Also, from the mappings activity, we
>>>>>> know
>>>>>> these overlap with most of the provenance ontologies.
>>>>>>
>>>>>> If no one objects, I would like to put all the concepts we are
>>>>>> all
>>>>>> sending to the mailing list on the wiki and start to group them
>>>>>> together.
>>>>>> Does that sound good to everyone?
>>>>>>
>>>>>> Comments are appreciated especially if any concept is thought to
>>>>>> be
>>>>>> unnecessary. I'm looking forward to seeing the proposed concepts
>>>>>> from
>>>>>> everyone else.
>>>>>>
>>>>>> Hopefully, we can reach a consensus soon.
>>>>>>
>>>>>> Thanks,
>>>>>> Paul
>>>>>>
>>>>>>
>>>>>> Suggested Concepts from OPM
>>>>>> We use opm: as a short cut for open provenance model.
>>>>>>
>>>>>> Graph:
>>>>>> - opm:OPMGraph
>>>>>> Definition: a provenance graph is defined to be a record of a
>>>>>> past execution
>>>>>> Example: Bob's Website Factory provides proof in the form
>>>>>> of a
>>>>>> provenance graph that the contract was executed as agreed.
>>>>>>
>>>>>> - opm:Account
>>>>>> Definition: An account of some past execution. Accounts
>>>>>> offer
>>>>>> different levels of explanation for the same execution
>>>>>> Example: Bob's Website Factory and Customers Inc both provide
>>>>>> two
>>>>>> different and conflicting sets of information (i.e. accounts)
>>>>>> describing
>>>>>> the provenance of the production of the same website.
>>>>>>
>>>>>>
>>>>>> Nodes:
>>>>>> - opm:Artifact
>>>>>> Definition: Immutable piece of state, which may have a physical
>>>>>> embodiment in a physical object, or a digital representation in
>>>>>> a
>>>>>> computer system.
>>>>>> Example: BlogAgg would like to know the state of an image before
>>>>>> and
>>>>>> after modification to see if it was modified appropriately
>>>>>>
>>>>>>
>>>>>> - opm:Process
>>>>>> Definition: Action or series of actions performed on or depending
>>>>>> upon
>>>>>> artifacts, and resulting in new artifacts.
>>>>>> Example: Alice collects data from public sources and
>>>>>> "natural
>>>>>> experiment" data. Alice then processes and interprets the
>>>>>> results and
>>>>>> writes a report summarizing the conclusions. All these steps
>>>>>> should be
>>>>>> captured.
>>>>>>
>>>>>> - opm:Agent (*1)
>>>>>> Definition: Contextual entity acting as a catalyst of a process,
>>>>>> enabling, facilitating, controlling, or affecting its execution.
>>>>>> Example: Alice starts and facilitates the tool SPSS when doing
>>>>>> data analysis.
>>>>>>
>>>>>>
>>>>>> Edges:
>>>>>> - opm:Time (*2)
>>>>>> Example: BlogAgg wants to find the correct originator of the
>>>>>> microblog
>>>>>> who first got the word out.
>>>>>>
>>>>>> - opm:Role
>>>>>> Definition: A role designates an artifact’s or agent’s function
>>>>>> in a process
>>>>>> Example: Whether a data file was used as a training or test data
>>>>>> set
>>>>>> when running machine learning algorithms.
>>>>>>
>>>>>> - opm:Used, opm:UsedStar
>>>>>> Definition: property to express that an artifact was used by a
>>>>>> process.
>>>>>> Example: The panda image was used by BlogAgg to generate
>>>>>> a thumbnail image.
>>>>>>
>>>>>> - opm:WasGeneratedBy, opm:WasGeneratedByStar,
>>>>>> Definition: property to express that an artifact was generated
>>>>>> by a process.
>>>>>> Example: A thumbnail image was generated by BlogAgg using the
>>>>>> panda image.
>>>>>>
>>>>>> - opm:WasControlledBy (*1)
>>>>>> Definition: property to express that a process was controlled
>>>>>> by an agent.
>>>>>> Example: SPSS was controlled by Alice.
>>>>>>
>>>>>> - opm:WasDerivedFrom, opm:WasDerivedFromStar,
>>>>>> Definition: property to express that an artifact was derived
>>>>>> from
>>>>>> another artifact.
>>>>>> Example: The thumbnail image was derived from the panda image.
>>>>>>
>>>>>> - opm:WasTriggeredBy
>>>>>> Definition: property to express that a process was triggered by
>>>>>> another
>>>>>> process.
>>>>>> Example: Report writing was triggered by the interpretation of
>>>>>> results.
>>>>>>
>>>>>> Extensibility (*3):
>>>>>> - Some form of annotation, based on predicate-value pairs.
>>>>>> Example: The data is of type customer sales records. The data
>>>>>> has size
>>>>>> 100 megabytes.
>>>>>>
>>>>>> - Profile mechanisms, including common types, common annotations,
>>>>>> and common graph templates
>>>>>> Example: The image has a creative commons attribution license.
>>>>>> This
>>>>>> pattern represents the exchange of messages in the http protocol.
>>>>>>
>>>>>>
>>>>>> (*) indicates terms that require refinement
>>>>>> (*1) Requires better, stricter guidelines for better interoperability
>>>>>> (*2) To be better aligned with the Time ontology
>>>>>> (*3) To be better specified to facilitate extensibility and to
>>>>>> be better aligned with RDF-like annotations
>>>>>>
>>>>>>
>>
>
>
> --
> Dr Jun Zhao
> Image Bioinformatics Research Group
> Department of Zoology
> University of Oxford
> OX33 1SL
> Email: jun.zhao@zoo.ox.ac.uk
> Phone: +44 (0) 1865 281 094
>
Received on Thursday, 25 November 2010 03:16:59 UTC