- From: Frank Manola <fmanola@mitre.org>
- Date: Fri, 15 Feb 2002 14:53:24 -0500
- To: Sergey Melnik <melnik@db.stanford.edu>
- CC: RDFCore WG <w3c-rdfcore-wg@w3.org>
Although many of the points that are going to be made in the ensuing thread on this will have been made before (maybe even X times!), perhaps we have a different perspective on some of them this time around. Anyway, let me take a crack (hoping I'm not going to restate some heresy we've already recanted...) Sergey Melnik wrote: > > Brian asked me (for the Xth time, sigh) to express my concerns about > reification in writing. My position remains that we need an > interoperable and efficient way of doing reification. I agree with the last sentence. > > If we go for "stating" (answer "No" to Q1 in [1]), no special semantics > is associated with the vocabulary rdf:Statement, rdf:subject, > rdf:predicate and rdf:object. This means applications *cannot > interoperate* using this vocabulary since its meaning is unspecified. > Effectively, going for "stating" amounts to deprecating 4-triple > reification as used today. > > If we go for "statement" (answer "Yes" to Q1 in [1]), we get a (rather > painful, admittedly, but endorsed) way of referring to statements found > on Web pages and in RDF databases, recording provenance etc. This is IMO > much more concrete and useful that just providing no definition at all, > although the usability of 4-triple reification still remains seriosly > hampered by its verbosity. The problem is that I don't believe this is exactly right. There are a number of things involved here. And verbosity isn't the main issue. 1. The things found on Web pages and in RDF databases are *instances* of statements, otherwise known (?) as "statings". When I express provenance (at least a lot of the time), I want to talk about a particular instance in a particular place, not the single abstract thing determined by a given (subject, predicate, object) triple. I must, for example, be able to say "fred wrote <s,p,o> at location X, at time T1", and "mary wrote <s,p,o> at location Y, at time T2" and have those be distinct things, and not thereby intermingle X, Y, T1, and T2. For example, does: <stmt1> <rdf:type> <rdf:Statement> . <stmt1> <rdf:subject> <aSubject> . <stmt1> <rdf:predicate> <aPredicate> . <stmt1> <rdf:object> <anObject> . <stmt1> <ex:writtenby> <fred> . <stmt1> <ex:atTime> <T1> . <stmt2> <rdf:type> <rdf:Statement> . <stmt2> <rdf:subject> <aSubject> . <stmt2> <rdf:predicate> <aPredicate> . <stmt2> <rdf:object> <anObject> . <stmt2> <ex:writtenby> <mary> . <stmt2> <ex:atTime> <T2> . entail: <stmt2> <ex:writtenby> <fred> . <stmt2> <ex:atTime> <T1> . I think the answer has to be "NO" if you want provenance in general to work, because otherwise you can't even suggest that there are two different instances involved. [Depending on what you read into the fact that there are separate <stmt1> and <stmt2>]. 2. In the example above, <stmt1> and <stmt2> were newly-minted URIs defined as part of the process of reifying the two original occurrences of what fred and mary said. That is, in the M&S reification examples, it's assumed that you can't directly identify the statement you want to express provenance information about; instead you have to create a new URI, and then use the reification vocabulary to describe the subject, predicate, etc. of that new entity. The thing is, that: a. The best you can do with the reification vocabulary, as it is, is to express provenance information about some reification (triple quad) which you effectively claim describes (has the same subject, predicate, and object as) some actual statement occurrence that the provenance is *really* about. However, there is no way in the current reification mechanism to directly establish the relationship between the quad and the actual statement occurrence the provenance is really about. b. If the original statings each had a URI, no reification syntax would be necessary to do provenance. That is, if the actual triple written by Fred had a URI, I could simply write: <fredtriple> <ex:writtenby> <fred> . <fredtriple> <ex:atTime> <T1> . Note that for interoperability of provenance I need mutual understanding of the *provenance* vocabulary (e.g., <ex:atTime>), not just the reification vocabulary. 3. It's not clear that you couldn't continue to use the current vocabulary with the general meaning suggested by (a). You'd just have to accept that you have to take a certain amount on faith (that the reification "description" is a suitable "stand in" for the actual triple that you're describing with it). > > In summary, if we go for "stating" we have *no* official mechanism for > reification. In this case we'd have to suggest an alternative, we cannot > just wash our hands. It is an illusion that we can leave the vocabulary > undefined and at the same time recommend developers to use it in a > consistent way. If we go for "statement" we do have a solution, albeit a > poor one. If we run out of time in finding an alternative *efficient* > way of doing reification, we could of course fall back on "statement" > expressed using 4 triples. In this case, we would not have achieved much > since inefficiency proved to be a show stopper for using 4-triple > reification in the past 3 years. > I don't think it hurts to have a standard vocabulary (like the rdf:subject) for describing the "substructure" of statement occurrences. It's just that having that vocabulary doesn't get us all that far in dealing with provenance. I agree that we can't just wash our hands; too much has been made of "RDF reification". --Frank [ducking under the desk] -- Frank Manola The MITRE Corporation 202 Burlington Road, MS A345 Bedford, MA 01730-1420 mailto:fmanola@mitre.org voice: 781-271-8147 FAX: 781-271-8752
Received on Friday, 15 February 2002 14:53:39 UTC