- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Wed, 21 Sep 2011 12:27:42 +0100
- To: James Cheney <jcheney@inf.ed.ac.uk>
- Cc: "Myers, Jim" <MYERSJ4@rpi.edu>, Graham Klyne <GK@ninebynine.org>, W3C provenance WG <public-prov-wg@w3.org>
On Tue, Sep 20, 2011 at 22:15, James Cheney <jcheney@inf.ed.ac.uk> wrote:

> The thing it denotes *is* real (if X has a car in whatever situation we're talking about, then "X's car" denotes that car; otherwise there is ambiguity or vacuity).

> You seem not to be distinguishing between a statement one might make about a thing, and the thing itself (but perhaps I am just getting confused). See below.

Yes, but "the car" is also such a concept or statement describing some idea of what is "the thing". If I remove the wheels and change the engine, you and I might disagree about it still being the same car. That is why we simply describe them all as entities: we can't really grasp "the real thing", because we will simply end up with yet another characterisation/idea/concept (whose attributes might or might not be easy to express).

> I certainly do mean that "X's Toyota" - the expression - is not a thing in the same sense as the car is.

What about "A Toyota, owned by X" - compared to "A Toyota, blue" and "A Toyota, license plate #232323"? All of these are attributes that might or might not change, depending on the time duration and purpose of using that particular characterisation.

> Agreed, the point is just that time is not the only context that might be needed to make sense of an expression like "X's car".

A very valid point, in particular as we start talking about more abstract entities for information that can easily exist in "two places at the same time" or be ambiguous about things like a file path or content across multiple dimensions like time, location, perspective. I believe this also allows one entity to have a certain attribute "fixed" within a time-span, while another entity has equivalent attributes varying over the same time-span, even though they are both wasComplementOf a common entity.
The entities consider the attribute with a different granularity, precision, etc., but they can be "equivalent" or "corresponding" as briefly described in the model document. For instance the mass-of-a-boxer attribute is seen as constant for the purpose of a boxing match, varying for the biologist (drinking water, processing energy from food) and uncertain for the physicist (the boxer keeps moving around and experiences acceleration) - but they can all agree that it is Muhammad Ali in the boxing ring for the duration of the match.

> I am just saying that if "X's blue car in 2011" denotes a different (real) thing-over-time than "X's blue car in 2001", then it makes no sense to use the same identifier for both. The example may have hidden this point.

Ah, this is a good example of where the "real thing" seems like it's mismatching because the entity describes something closer to a concept or role than a certain arrangement of atom chains. In everyday life we normally give names and classes to such arrangements because it is more convenient than talking about the actual atom arrangements. But if the physical presence of the car is not important, then it can easily be the same entity, for instance because we are talking about "The thing Luc owns for the purpose of commuting".

It would of course be strange to have a physical property like colour on such an abstract entity, but it could be OK in some circumstances, say Luc was sponsored by W3C and always had to drive a blue car to work. Then it does not matter that much which "physical car" it was that was "the blue car", we can still talk about "Luc's blue car" spanning all those years - but now we can't lock down attributes like license plate number without creating a narrower entity with prov:wasComplementOf :lucsBlueCar .
Similarly http://en.wikipedia.org/wiki/Back_to_the_Future:_The_Ride#Memorabilia shows how "The De Lorean from Back to the Future" is on display - but actually several De Loreans were used for stunts, etc. In the view of film fans these physical things, with or without various additions like Mr Fusion, are all "The time-travelling De Lorean".

> All I was saying is that I have been interpreting the "id" component of an entity assertion as purely syntactic, as a placeholder for "the thing I'm talking about in this assertion", so that I can make statements about (what I believe to be) the same thing at different instants in time or different properties of it at overlapping times. The id could also happen to be a URI with some useful Web meaning, or not, and it could happen to be enough to uniquely identify the thing it denotes, or not.

I think we need to distinguish between identifiers for the purpose of separating entities in the provenance (which need to pinpoint exactly which entity description we're talking about) and various "other" identifiers, which for provenance purposes are really just different kinds of attributes, exactly because the scope and view of the entity that is implied depends on who assigned the identifier.

The danger here is that asserters would re-use existing (semantic web) URIs for entities, although from a provenance perspective they have a narrower view of the entity than whoever assigned the identifier. One way around this is to always narrow down with a new, local entity with its own URI, having :wasComplementOf <commonURI> (implicitly saying that <commonURI> is an entity, but not describing it) - while you seem to want a variation of this, with some kind of prov:realThing <commonURI> property on a fresh local entity.
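
The "always narrow down with a new, local entity" option can be sketched roughly as follows. This is only an illustration using plain Python tuples as triples, not a real RDF library; prov:wasComplementOf is the term used in this thread, while the helper name narrow_entity, the example.org URIs, the ex:licensePlate attribute and the second plate value are assumptions made up for the example.

```python
from itertools import count

# Term as used in this thread (a draft-era name, not checked against any final vocabulary).
PROV_WAS_COMPLEMENT_OF = "prov:wasComplementOf"

_local_ids = count(1)  # simple counter for minting hypothetical local URIs


def narrow_entity(common_uri, attributes, base="http://example.org/local/"):
    """Mint a fresh local entity that narrows `common_uri`.

    Returns (local_uri, triples). The attributes are asserted on the local
    entity only; the shared URI appears solely as the object of the link,
    so nothing is claimed about it beyond its being an entity.
    """
    local_uri = f"{base}entity{next(_local_ids)}"
    triples = {(local_uri, PROV_WAS_COMPLEMENT_OF, common_uri)}
    for predicate, value in attributes.items():
        triples.add((local_uri, predicate, value))
    return local_uri, triples


# Hypothetical shared URI standing in for <commonURI>.
common = "http://example.org/shared/lucsBlueCar"

# Two narrower views, each locking down its own license plate.
car_2001, g1 = narrow_entity(common, {"ex:licensePlate": "232323"})
car_2011, g2 = narrow_entity(common, {"ex:licensePlate": "454545"})

graph = g1 | g2
subjects = {s for s, _, _ in graph}
```

In this sketch the shared URI never appears as a subject, so the asserter's narrower view stays attached to identifiers the asserter minted.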
Another approach would be for the asserter to actually use <commonURI> directly as a prov:Entity, but be explicit about which attributes of <commonURI> they consider characterising ("locked down"/immutable/invariant/important) within this provenance assertion graph. This pushes the requirement for named graphs and/or indirections through a prov:Asserter because different asserters can have different views on what characterises <commonURI>. This can make it tricky in situations where you don't know or want to specify these attributes (or their values) at assertion time.

I believe that whether we want to encourage this reuse-existing-URI-as-entity or the always-local-entity-URI approach is more of a practical matter (how difficult to write/query/reason) than a big discussion of "what is the real thing" which, as pointed out, we would probably never conclude.

> I am still confused whether you regard an entity as an assertion (= syntactic statement ABOUT some state of affairs) or a real thing. You seem to be saying the latter, but I can't see how to adjust the semantics to reflect this.

I believe it is the former, because whatever our language, that is the only way we know to talk about "the real thing" - and so the difference is or is not there depending on what you classify as "the real thing". (You could include "identifiers" in here as another way to talk about "the real thing" - but even [ owl:sameAs :X ] is a statement).

> If we do want to talk about attributes that (uniquely) characterize the things they describe, we should make these assumptions explicit (e.g. for cars, state whether we consider VINs mutable or not). Perhaps the discussion happening in parallel about the use of owl:key is the way to address this.
I agree - but we have already said that entities are descriptions of things within a certain perspective/time-frame/view, so would it not be implied that all attributes given to them are immutable for the purpose of that entity's use within the provenance statements? If a property is mutable, then why is it stated for that entity instead of on a narrower entity? (We would not know when that attribute applies or not).

In a regular old-style RDF document, if someone says <http://soiland-reyes.com/stian#me> foaf:name "Stian Soiland-Reyes" - then that value is assumed to be true throughout that particular graph - although it has not been explicitly said who asserts this, over what time, based on which observations or assertions, etc. Similarly I believe that within a single PROV assertion graph, if the asserter says :lucsCar :colour :blue - then that's true wherever we see :lucsCar within this graph. The distinction comes when we are dealing with multiple asserters - would they reuse identifiers for entities or not - which in a way is the same discussion as with the "real thing" above.

Note, an attribute might be immutable/invariant, but not an (important) part of our characterisation. It could very much still be useful to include such attributes, in particular when specialising for different domains. I believe this distinction should be made on individual entities rather than using OWL keys on some (often artificial) class, because it depends on that particular entity what is characterising or not.

> As noted above, it is not just that it is not intuitive, it is that I (at least) do not understand what you mean by entities being real (and this seems to be a basic point of cognitive dissonance among others too). One possibility would be to just rename ?things back to "entities" and say that "entity assertions" are statements about aspects of things/entities that are fixed over a period of time.
> This is what I did initially and Luc asked me to rename to thing to make a clearer distinction.

Yes, our "Entity" is closer to an "EntityStatement" or "EntityState" than an entity itself, but we did in the end vote for prov:Entity for simplicity. Similarly our Agent is not the real agent, it is a description/identifier for the Agent - but instead of tacking "Statement" or "Description" onto every class name, we just admit that our assertion language by its nature describes other things, and so this is made implicit.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
Received on Wednesday, 21 September 2011 11:28:41 UTC