- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Sat, 17 Sep 2011 01:19:32 +0100
- To: Timothy Lebo <lebot@rpi.edu>
- Cc: public-prov-wg@w3.org
On Fri, Sep 16, 2011 at 23:48, Timothy Lebo <lebot@rpi.edu> wrote:

> Why are these characterizing entities NOT on the Entity itself?
>
> Entity is _already_ providing us the indirection that we need to
> distinguish between EVERYBODY'S description of the car IN ALL ETERNITY
> and _our_ description as we are observing it for our time period and
> context.

I fully agree! That is what the entity IS - it *is* the description -
and so it should have some properties!

> One of the SPECIAL characterizingProperties is prov:wasComplementOf,
> which is pointing at a URI of the "invariant" car in my driveway.

This is a *very* important point! Thanks for picking this up. It is also
a good argument for not using rdf:List, as I assume we want to
implicitly include that property for any entity.

> :owner rdfs:subPropertyOf prov:characterizingProperty .
> # Though, why can't we just look at ALL RDF properties as
> # characterizing entities?

Well, see my example in
http://www.w3.org/2011/prov/wiki/WorkflowExample#Provenance_container_example

Here the provenance ontology has been extended with domain-specific
provenance metadata using the CURIE prefixes wf: and impl:.

    :input a prov:Entity, impl:FileValue, wf:Value ;
        prov2:characterizingProperties ( impl:value ) ;
        impl:file "/tmp/myinput.txt" ;
        impl:value [
            # Snapshot of actual value as it was read by :wfEngine
            a cnt:ContentAsText ;
            cnt:characterEncoding "UTF-8" ;
            cnt:chars "Steve"
        ] .

Here I wanted to say that what characterises this input is the
impl:value (i.e. the content of the file). The reason is that impl:file
(although a very useful property to include in the provenance
assertions) is not necessarily constant for the duration of the use of
this :input within the workflow. By the time the value is used, the file
could have been deleted, overwritten or moved. For the workflow engine
this does not matter; only the impl:value mattered for the process
executions where this entity was used.
However, I might make a complementOf entity for :input if I want to
describe the data at the point when it was read from the file, when the
filename was constant as well as the content.

So if I understand you correctly, under your proposal to 'just use RDF'
I would only state a property on those entities that it characterises?
(Entities already are indirections for "things in the world".)

    :input a prov:Entity, wf:Value ;
        impl:value [
            a cnt:ContentAsText ;
            cnt:characterEncoding "UTF-8" ;
            cnt:chars "Steve"
        ] .

    :inputFile a prov:Entity, impl:FileValue, wf:Value ;
        prov:wasComplementOf :input ;
        impl:file "/tmp/myinput.txt" ;
        impl:value [
            a cnt:ContentAsText ;
            # Is impl:value still needed, or included by indirection
            # of wasComplementOf?
            cnt:characterEncoding "UTF-8" ;
            cnt:chars "Steve"
        ] .

I wouldn't include much more detail on how :inputFile was generated, as
the workflow engine did not record any details about when or how the
file was read, and it is not used by any asserted process executions
(PEs) - but I am of course still free to assert that such an entity
existed.

"At some point during :input's life (over which the content is fixed)
there was an impl:file that had that content" - which sounds quite good.

Perhaps this is the line Satya was thinking along with attributes the
whole time, leading to the initial confusion on Luc's question. Satya?

> One level down, or simply directly on the :Entity?

With that I had in mind that we have not yet decided exactly where the
attributes would go; in my proposal, one level down *is* directly on the
prov:Entity.

> Simply on the prov:Entity allows any level of "down" because it's just
> domain-specific OWL axioms.

Could you elaborate with a little example? Do you mean thanks to
including the complementOf property, or by referring to other
prov:Entities? (I like this prefix!)

> Agreed. But how does that prevent what you think we can't say ("being
> owned by luc and any other owner")?
With your proposal that would not be a problem, you are right, because
that other owner would not be declared on "my" prov:Entity, but on a
different car entity or perhaps a non-entity resource (thing in the
world).

> I don't think we need to assume named graph scoping. We have the
> indirection with prov:Entity already and these entities are being
> grouped by Accounts - so they can float around in the Big Graph and
> nothing will break.

This is a side note not yet addressed by the current ontology.

So you are saying we can use prov2:Account to link to who 'defined' the
entities (and also process executions et al?) - and not be restricted to
one provenance account == one graph/resource.

To me, the description of accountExpression in
https://dvcs.w3.org/hg/prov/raw-file/8be7e9ea81f0/model/ProvenanceModel.html#expression-Account
read as asking for named graphs, in particular when talking about shared
identifiers.

However, defining accounts flatly requires any different entities to
have different URIs (which can be consolidated in a different way, for
instance by a common wasComplementOf).

My understanding, by an example: both accounts try to express how a
workflow ran, but account B did so by monitoring from the outside,
instead of logging directly what happened as in account A. They
therefore might not always agree on things like file content.

    :accountA a prov2:Account ;
        prov2:expresses :entity1, :entity1a, :entity2 .

    :accountB a prov2:Account ;
        # Is account B allowed to re-use :entity1 from :accountA, or
        # would each account always make its own entities?
        prov2:expresses :entity1, :entity1b, :entity3 .

    :entity1 impl:file "/tmp/myinput.txt" .

    # Implied by wasComplementOf?
    #:entity1a impl:file "/tmp/myinput.txt" .
    #:entity1b impl:file "/tmp/myinput.txt" .

    :entity2 impl:value [ cnt:chars "Fish" ] .
    :entity3 impl:value [ cnt:chars "Soup" ] .

    :entity1a prov:wasComplementOf :entity1, :entity2 .
    :entity1b prov:wasComplementOf :entity1, :entity3 .
    # implied both impl:file and impl:value ?
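To make the two open questions in the account example concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: plain (subject, predicate, object) tuples stand in for the triples rather than a real RDF library, the blank node under impl:value is collapsed to its cnt:chars string, and the helper names (`expressed_by`, `shared_entities`, `implied_properties`) are hypothetical. Whether wasComplementOf really implies the complemented entities' properties is exactly what is being asked above; the last function only shows what that reading would yield.

```python
# Simplified stand-in for the Turtle example: tuples instead of an RDF graph.
# Assumption: the impl:value blank node is reduced to its cnt:chars literal.
TRIPLES = [
    (":accountA", "prov2:expresses", ":entity1"),
    (":accountA", "prov2:expresses", ":entity1a"),
    (":accountA", "prov2:expresses", ":entity2"),
    (":accountB", "prov2:expresses", ":entity1"),
    (":accountB", "prov2:expresses", ":entity1b"),
    (":accountB", "prov2:expresses", ":entity3"),
    (":entity1", "impl:file", "/tmp/myinput.txt"),
    (":entity2", "impl:value", "Fish"),
    (":entity3", "impl:value", "Soup"),
    (":entity1a", "prov:wasComplementOf", ":entity1"),
    (":entity1a", "prov:wasComplementOf", ":entity2"),
    (":entity1b", "prov:wasComplementOf", ":entity1"),
    (":entity1b", "prov:wasComplementOf", ":entity3"),
]


def expressed_by(account, triples):
    """Entities that an account links to via prov2:expresses."""
    return {o for s, p, o in triples
            if s == account and p == "prov2:expresses"}


def shared_entities(account_a, account_b, triples):
    """Identifiers re-used by both accounts (the 'shared identifier' case)."""
    return expressed_by(account_a, triples) & expressed_by(account_b, triples)


def implied_properties(entity, triples):
    """Hypothetical reading: collect the impl: properties of every entity
    that `entity` is a complement of, as if wasComplementOf implied them."""
    targets = {o for s, p, o in triples
               if s == entity and p == "prov:wasComplementOf"}
    return {(p, o) for s, p, o in triples
            if s in targets and p.startswith("impl:")}


if __name__ == "__main__":
    print(shared_entities(":accountA", ":accountB", TRIPLES))
    print(implied_properties(":entity1a", TRIPLES))
    print(implied_properties(":entity1b", TRIPLES))
```

Under this reading the two accounts share only :entity1, while :entity1a and :entity1b agree on the impl:file but disagree on the impl:value, which is the disagreement the flat (non-named-graph) layout has to tolerate.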
>> This one is unfortunately tricky in SPARQL as rdf:List are really
>> unpacked linked nodes and we don't know the position of the attribute.
> (Although I disagree with the premise)
> Why would order matter?

Side note: To know how many rdf:rest links to follow, unless you know a
cool trick to do recursion or "is-in-list" support in SPARQL..? I know
you can express lists with () syntax, but you still need to know in
which position to put the ?thing.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
Received on Saturday, 17 September 2011 00:20:21 UTC