RE: Some thoughts about the revised provenance Model document

> >> cf:e2 a prov:Entity.
> >> cf:e2 cf:hasLocation dbpedia:Berlin.
> >> dbpedia:Berlin dbpedia-owl:leader dbpedia:Klaus_Wowereit.
> >> dbpedia:Klaus_Wowereit dbpprop:nationality dbpedia:Germany.
> >>
> >> Obviously, I can just keep building this massive graph using linked data.
> >> If that's the case what characterizes cf:e2?
> >> Is it just cf:hasLocation dbpedia:Berlin or is it everything else?
> >
> > IMO:
> >
> > Only
> >     cf:e2 cf:hasLocation dbpedia:Berlin .
> >
> > would be characterizing cf:e2.
> 
> +1

I agree, and this set of statements implies that the entity is characterized by being in Berlin 'however' Berlin changes. The art exhibit was in Berlin for two months and whether those spanned an election or not does not affect the assertion. (Nor would Berlin annexing a neighboring village and changing its boundaries, etc.)

I think this brings up the larger question of what the additional RDF statements mean. Are they part of the provenance assertions, i.e. does the provenance model give them meaning? Or are they just non-modeled annotations that, if you understand that third-party language, you can use to find the provenance assertions and entities of interest? If they're out of scope, we can perhaps not talk about them at all. But if we want to allow such statements to, for example, be included in an account and asserted, we may need to define their meaning, i.e. when do we assume Klaus was the leader: when the account was created, at some point during the entity's lifetime (perhaps ambiguous if we have multiple entities around), or over the duration of an entity/all entities?

I suggested previously that we might just use the account boundary as a way to distinguish when a statement such as
cf:e2 cf:hasLocation dbpedia:Berlin
was meant to imply that e2 has a fixed/characterizing location of Berlin, versus having the general RDF meaning (as above - no lifetime specified and therefore unclear whether it is asserted as fixed). If we want to ascribe additional meaning, as part of provenance, to statements about, for example, who was the leader of Berlin, does an account also help? E.g. is putting such an RDF statement in an account an assertion that it is true for the duration of the account? (And if that's not what you mean, you create an entity to say when it is true, or you leave that statement out of the account/send it as a helpful annotation outside the model.)
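
To make the account-boundary idea concrete, here is a rough sketch in Python rather than RDF. It is only an illustration under my own assumptions: the names Statement, Account and asserted_interval are made up, the time values are arbitrary units, and nothing here is defined by PROV-DM.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in arbitrary, made-up time units


@dataclass(frozen=True)
class Statement:
    """A plain RDF-style triple, optionally carrying its own validity interval."""
    subject: str
    predicate: str
    obj: str
    valid: Optional[Interval] = None  # None means no lifetime was stated


@dataclass
class Account:
    """Hypothetical account: a lifetime plus the statements asserted inside it."""
    lifetime: Interval
    statements: List[Statement] = field(default_factory=list)


def asserted_interval(stmt: Statement, account: Optional[Account]) -> Optional[Interval]:
    """Return the interval over which stmt is asserted to hold, if any.

    Assumed rule: an explicit interval wins; otherwise a statement placed in
    an account holds for the account's lifetime; otherwise nothing is claimed.
    """
    if stmt.valid is not None:
        return stmt.valid
    if account is not None and stmt in account.statements:
        return account.lifetime
    return None  # plain annotation: meaning left to out-of-band knowledge


leader = Statement("dbpedia:Berlin", "dbpedia-owl:leader", "dbpedia:Klaus_Wowereit")
account = Account(lifetime=(10, 20), statements=[leader])

print(asserted_interval(leader, account))  # (10, 20): true for the account's duration
print(asserted_interval(leader, None))     # None: no lifetime asserted
```

The only design choice that matters here is the fallback order: a statement that carries its own lifetime is never overridden by the account, and a statement outside any account asserts no lifetime at all.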

> 
> > dbpedia:Berlin is not characterized - unless it was also a prov:Entity.
> >
> >
> > Now I don't know the answer for anonymous nodes:
> >
> > cf:e2 a prov:Entity.
> > cf:e2 cf:hasLocation [
> >     dbpedia-owl:leader [
> >        foaf:name "Klaus Wowereit" ;
> >        dbpprop:nationality dbpedia:Germany
> >      ]
> > ] .
> >
> >
> > My simple reading of this is that cf:e2 has a location of somewhere
> > where the German called Klaus Wowereit is "the leader" - but neither
> > Hr Bürgermeister Wowereit nor the implied Berlin is a "characterising
> > attribute".
> >
> > If we distance ourselves slightly from the notions of "characterising
> > attributes" we can just say that the properties stated directly on (or
> > with?) an entity were true/fixed attributes throughout the lifetime of
> > the entity. Any nested properties might or might not have been true
> > throughout that lifetime.  (Thus cf:e2 could have existed in Berlin
> > before Hr Wowereit became the mayor).
> 
> That works for me.

I'm not sure we can get an unambiguous interpretation with this model. If Wowereit was mayor of one town, then moved to Berlin and became mayor there, what does the blank node resolve to for provenance purposes? For RDF, with no sense of time, the interpretation of the blank node can change, but when we connect it to an entity and intend for the location of that entity to be fixed, we need a way to pick one answer.

I would have read this as e2 having the location where Wowereit was the leader, which could have been Berlin for part of the time and some other location for the rest. (Say Klaus has a favorite paperweight that he has kept with him wherever he has been leader, and we want to record the provenance of that paperweight as a symbol of office in his leadership location, so we created e2 to be that characterization.)

If I wanted to have the notion that e2's location was Berlin (as a blank node), I would need to add a time constraint - e.g. the location Wowereit led after he won the 200x election. That's an entity defined by its provenance and characterizing attributes that, in this example, would be a complement of Berlin corresponding to Berlin-as-led-by-Wowereit.
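
A small Python sketch of why the time constraint matters. The leadership timeline below is entirely made up (arbitrary time units, and "ex:OtherTown" is a hypothetical earlier post), and the function names are mine, so treat it as an illustration of the ambiguity rather than anything the model defines.

```python
from typing import List, Optional, Set, Tuple

# (leader, place, start, end) with half-open intervals [start, end).
# All values are invented for illustration only.
LEADERSHIP: List[Tuple[str, str, int, int]] = [
    ("Klaus Wowereit", "ex:OtherTown", 0, 10),    # hypothetical earlier post
    ("Klaus Wowereit", "dbpedia:Berlin", 10, 30),
]


def places_led_by(leader: str, start: int, end: int) -> Set[str]:
    """Every place the leader led at some point within [start, end)."""
    return {
        place
        for who, place, s, e in LEADERSHIP
        if who == leader and s < end and start < e  # interval overlap test
    }


def resolve_location(leader: str, start: int, end: int) -> Optional[str]:
    """Resolve 'the place led by leader' to one place, or None if it is
    ambiguous (several candidates) or empty (no candidate)."""
    places = places_led_by(leader, start, end)
    return next(iter(places)) if len(places) == 1 else None


# With no time constraint the description matches two places.
print(resolve_location("Klaus Wowereit", 0, 30))   # None
# Constrained to the period after the (hypothetical) election it picks Berlin.
print(resolve_location("Klaus Wowereit", 10, 30))  # dbpedia:Berlin
```

The blank node behaves like the unconstrained call: it describes a place without fixing which one, so the entity's location is only fixed once an interval is supplied.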

Not sure how often anyone will want to do this in practice, but for the original example and the blank node cases, I think the only way to be unambiguous is to define the time range over which the plain RDF statements are asserted to be valid (or treat them as annotations whose meaning is subject to interpretation/out-of-band knowledge). 'For the lifetime of the account' seems like a good default for that time range. And if you don't want that meaning, you create entities as in the next quoted section below.

 Jim

> 
> #g
> --
> 
> > I suggest that if you also want to lock down such things, then do
> > those properties as other prov:Entities, (either anonymous or named):
> >
> > cf:e2 a prov:Entity ;
> >    cf:hasLocation cf:berlinWithKlaus .
> >
> > cf:berlinWithKlaus a prov:Entity, prov:Location ;
> >    prov:wasComplementOf  dbpedia:Berlin ;
> >    dbpedia-owl:leader cf:klausTheMayor .
> >
> > cf:klausTheMayor a prov:Entity ;
> >      prov:wasComplementOf dbpedia:Klaus_Wowereit ;
> >      dbpprop:nationality dbpedia:Germany .
> >
> >
> > Thus throughout the lifetime of cf:e2, the thing described by e2 was
> > in Berlin, and throughout that time (at least as long as e2 existed)
> > Klaus Wowereit was the leader, being German (The pre-1990
> > Klaus-the-West-German was not the leader during the lifetime of
> > cf:berlinWithKlaus).
> >
> >
> > Note that such an interpretation would introduce temporal dependencies
> > between cf:e2 and cf:klausTheMayor which are not currently covered by
> > PROV-DM (there are no prov:derivedFrom or wasComplementOf links
> > between cf:e2 and Berlin) - if the provenance otherwise showed that
> > Klaus became mayor (when cf:klausTheMayor was generated) *after* cf:e2
> > was generated, then the provenance account is inconsistent,  but this
> > can't be shown by the constraints of PROV-DM as far as I can tell.
> >
> >
> >
> > Note that PROV-DM does not specifically allow such nesting of
> > attribute values; there, all attribute values are strings. If a
> > property value was to be interpreted as a URI or identifier of another
> > entity or other resource, then that seems outside of scope for PROV-DM
> > - so we can take the same view in PROV-O.
> >
> >

Received on Thursday, 20 October 2011 15:39:05 UTC