Re: Using PROV-O to represent ontology processing, extraction and reasoning operations from Chris Mungall on 2017-11-27 (semantic-web@w3.org from November 2017)

From: Chris Mungall <cjmungall@lbl.gov>
Date: Mon, 27 Nov 2017 15:46:56 -0800
To: Michel Dumontier <michel.dumontier@gmail.com>
Cc: Graham Klyne <gk@ninebynine.org>, semantic-web <semantic-web@w3.org>
Message-ID: <CAN9AiftuMCw4V_3rwmzVN9RfudGxDGt0GK95dnX_tAmMTo7h=Q@mail.gmail.com>
OK, so is the idea that even if I introduce my own subclasses of
prov:Activity, I should make two type assertions on my activity instances:

 - a redundant one directly to prov:Activity (presumably to facilitate easy
querying)
 - a specific one to my subclass?

Graham mentioned OWL-QL, but technically the first redundant triple is
entailed within OWL-QL through domain/range. However, I can see why it
might be preferred for ease of querying purposes.
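
As a minimal sketch of the two-assertion pattern: the opera: namespace and
the subclass axiom are hypothetical (taken from the rough sketch quoted
below), and full IRIs are used because an unescaped '/' is not allowed in a
Turtle prefixed name.

```turtle
@prefix prov:  <http://www.w3.org/ns/prov#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix opera: <http://example.org/OntologyProcessingExtractionReasoningOntology/> .

## Assumed axiom in the (hypothetical) opera ontology
opera:ReasonOperation rdfs:subClassOf prov:Activity .

## Activity instance carrying both type assertions
<http://purl.obolibrary.org/obo/foo/op1>
    a prov:Activity ,            ## redundant: matches without a reasoner
      opera:ReasonOperation ;    ## specific operation type
    prov:used <http://purl.obolibrary.org/obo/foo/foo-edit.owl> .

## The released ontology points at the activity
<http://purl.obolibrary.org/obo/foo.owl>
    prov:wasGeneratedBy <http://purl.obolibrary.org/obo/foo/op1> .
```

With a reasoner, the prov:Activity typing also follows from the subclass
axiom, or from the range of prov:wasGeneratedBy and the domain of prov:used
in PROV-O. Without one, the explicit triple is what lets a plain
"?x a prov:Activity" query match.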

On Sat, Nov 25, 2017 at 3:00 PM, Michel Dumontier <michel.dumontier@gmail.com> wrote:

> Chris,
>   seems very reasonable to use prov:wasGeneratedBy <activity>. Feel free
> to explicitly add "a prov:Activity" to your foo/op1.
>
> m.
>
> On Sat, Nov 25, 2017 at 7:46 PM, Chris Mungall <cjmungall@lbl.gov> wrote:
>
>> I had anticipated using prov:wasAssociatedWith, and not creating any new
>> properties. The gap to be filled is on the side of subclasses of
>> prov:Activity, for representing operations specific to ontology processing.
>>
>> Assuming such an ontology, does this rough sketch seem like a standard
>> pattern? Is there a good analogous set of design patterns to follow (e.g.
>> software compilation, scientific workflows)?
>>
>> @prefix prov: <http://www.w3.org/ns/prov#> .
>> @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix owl: <http://www.w3.org/2002/07/owl#> .
>> @prefix obo: <http://purl.obolibrary.org/obo/> .
>> @prefix opera: <http://example.org/OntologyProcessingExtractionReasoningOntology/> .
>> @prefix dc: <http://purl.org/dc/elements/1.1/> .
>>
>> ## Ontology header
>> obo:foo.owl a owl:Ontology ;
>>    dc:title "..." ;
>>    prov:wasDerivedFrom obo:foo/foo-edit.owl ;
>>    prov:wasGeneratedBy obo:foo/op1 ;    ## see below
>>    prov:wasAttributedTo obo:foo/editors ;
>>    ...
>>
>> ## Ontology provenance
>> obo:foo/op1 a opera:ReasonOperation ;
>>    prov:used obo:foo/foo-edit.owl ;     ## 'source code'
>>    prov:wasInformedBy <configuration??> ; ## what to use here?
>>    prov:startedAtTime "..." ;
>>    prov:endedAtTime "..." ;
>>    prov:wasAssociatedWith <github/mavenOrGithubIRIforROBOTversion??> .
>>
>> ## Ontology below
>> obo:FOO_0000001 a owl:Class ;
>>     ...
>>
>> On 25 Nov 2017, at 0:29, Graham Klyne wrote:
>>
>> Is there any reason that http://www.w3.org/TR/prov-o/#wasAssociatedWith
>> doesn't work for you?
>>
>> My recollection of PROV discussions is that involvement of human,
>> software and other agents in an activity was intended to be captured using
>> this (or qualifiedAssociation structures - cf.
>> http://www.w3.org/TR/prov-o/#qualifiedAssociation).
>>
>> (IIRC, the "qualifiedAssociation" and similar structures --
>> http://www.w3.org/TR/prov-o/#description-qualified-terms -- were
>> introduced to avoid application-dependent (not a priori defined)
>> sub-properties, which would break OWL-QL compatibility of PROV, with
>> attendant reasoning performance challenges.)
>>
>> #g
>> --
>>
>> On 24/11/2017 22:15, Chris Mungall wrote:
>>
>> Many ontologies such as those in the life-sciences have complex multi-step
>> release pipelines: for example, using an OWL reasoner to assert direct
>> inferred subClassOf axioms, adding owl annotations, verifying using SPARQL.
>> It would be useful to capture the full operation graph, so that the
>> provenance of the released ontology was explicit.
>>
>> PROV-O would provide the main framework for doing this. PROV-O predicates
>> could be used directly. Existing standards could be used to represent the
>> software agents involved. AFAICT there is a gap for an ontology for
>> representing the ontology processing operations (subclasses of
>> prov:Activity).
>>
>> Is there an existing effort that could be piggy-backed on here? This could
>> be subsumed into an effort that seeks to represent for example
>> transformations between named graphs. Alternatively, the ontology build
>> pipeline could be conceived either as a software release process or a
>> scientific workflow. For the latter, there are a number of ontologies but
>> it's not clear we'd get any benefit using these rather than PROV-O directly.
>>
>> If there is no existing work being done here, I'll propose a draft of
>> activity classes and design patterns, and we can do a demo implementation
>> in our ontology release tool [ROBOT](https://github.com/ontodev/robot).
>> But I'd rather not duplicate any existing efforts.
>>
>>
>
>
> --
> Michel Dumontier
> Distinguished Professor of Data Science
> Maastricht University
> http://dumontierlab.com
>
Received on Monday, 27 November 2017 23:47:25 UTC