- From: Pat Hayes <phayes@ihmc.us>
- Date: Fri, 18 Aug 2006 23:47:12 -0700
- To: Bijan Parsia <bparsia@cs.man.ac.uk>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
>On Aug 17, 2006, at 6:00 AM, Pat Hayes wrote:
>
>>:Mary :met :Bill
>>:Bill :nationality USA
>>:Mary :met _:x7
>>_:x7 :nationality IRAQ
>>
>>SELECT DISTINCT ?x [:Mary :met ?x]
>>
>>If you are a government agency trying to keep track of who talked
>>to whom, it would be less than helpful to be told about Bill, but
>>not *anything at all* about the nameless Iraqi who Mary has been in
>>communication with, just because a semantic theory says that
>>{:Mary, _:x7} is technically redundant. DISTINCT in this case was
>>likely intended to mean, distinct *people*, and there is enough
>>information here to enable a human reader to know that _:x7,
>>whoever it is, is not Bill, even though an RDF engine might be too
>>dumb to figure this out. Getting the answer set {Bill, _:x7} in
>>this case tells you that there are two individuals about whom
>>something detectable is recorded, and if we have told bnodes
>>available then it allows a subsequent query to ask about the
>>nationality of _:x7. (They might be the same, of course, but then
>>so might :Bill and :Joe.)
>
>This is interesting because as I think about it, the more I become
>convinced that this is poor modeling. At least on many scenarios
>(i.e., the case is underdescribed). Given some remarks Pat (perhaps
>privately) made about shifting the burden to data managers, I think
>that I object to this modeling, and would not recommend it. A much
>more sensible approach, especially for a vertical, curated
>collection like a gov agency (yes, I know, they aren't that good,
>but this isn't hard). BNodes are the wrong thing if this is the kind
>of interpretation of the answers. For example:
>
>
> :Mary :met :Bill
> :Bill :nationality USA
> :Mary :met :unknownPerson1
> :unknownPerson1 :nationality IRAQ
>
>The more I think about it, the better it seems. I can encode all
>sorts of information in the uri *or* the graph. It's easily stable.
>It's easy to *talk* about.

What it doesn't do is make any distinction between known people and
unknown people, though. You and I know that there isn't anyone called
"unknownPerson", but reasoners don't. And we don't want to make being
unknown a property of the actual person, since we might in fact know
him. As you point out, it might be Bill.

Another style of modelling uses bnodes throughout but links them to
names treated as literals:

_:x :name "Mary"
_:x :met _:y
_:x :met _:z
_:y :name "Bill"
_:y :nationality USA
_:z :nationality IRAQ

and no name for _:z, which is what makes him 'unknown'. This is very
much in the spirit of RDF collections and containers, of course.

>If I have any sort of equality reasoning it's pretty easy to merge
>it when appropriate. It also allows for things like this:
>
> :Mary :met :Bill
> :Bill :nationality USA
> :Mary :met :unknownPerson1
> :unknownPerson1 :nationality IRAQ
> :Mary :met :unknownPerson2
> :unknownPerson2 :nationality IRAQ
>
>Which would get leaned away if the unknowns were bnodes.

*Could* get leaned away, yes. Although we could easily fix this with
the bnode style.

>I agree that current practice uses BNodes exactly as if they were
>these :unknownUris, but that just reinforces my overall point that
>that interpretation is contrary to RDF semantics.

You might be right that it is better practice to use URIs. There are
arguments both ways. But the bnode technique is not 'contrary' to the
RDF semantics, and the introduction of the URIs amounts to
skolemization, which is also a semantically well-understood technique.
In some frameworks it is even valid :-) And there are other styles
which are possible and even may have their advantages.

> We either should change our semantics, stress this point strongly
>in the documents and solicit serious feedback far and wide at many
>different levels, or try to change RDF.

I see no reason to change anything here.
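
As a rough illustration of the leaning and skolemization points above, here is a minimal Python sketch. It is not taken from any RDF toolkit: triples are plain tuples of strings, the helper names (is_bnode, lean, skolemize) are hypothetical, and lean() only tries single-bnode substitutions, so it is not a complete leanness algorithm. It does show the two unknown Iraqis collapsing into one when they are bnodes, and surviving once the bnodes are replaced by :unknownPerson-style URIs.

```python
# Sketch only: triples are (subject, predicate, object) tuples of strings,
# and any term starting with "_:" is treated as a blank node.

def is_bnode(term):
    return term.startswith("_:")

def lean(graph):
    """Simplified leaning: repeatedly map one bnode onto another term
    whenever the resulting graph is a subgraph of the current one.
    (Assumption: single-bnode substitutions only, so this is incomplete.)"""
    graph = set(graph)
    changed = True
    while changed:
        changed = False
        terms = sorted({t for (s, _, o) in graph for t in (s, o)})
        for b in (t for t in terms if is_bnode(t)):
            for target in terms:
                if target == b:
                    continue
                image = {(target if s == b else s, p, target if o == b else o)
                         for (s, p, o) in graph}
                if image <= graph:  # b was redundant; image no longer mentions it
                    graph = image
                    changed = True
                    break
            if changed:
                break
    return graph

def skolemize(graph, prefix=":unknownPerson"):
    """Replace each bnode with a URI-like name (hypothetical naming scheme;
    assumes the generated names do not already occur in the graph)."""
    bnodes = sorted({t for (s, _, o) in graph for t in (s, o) if is_bnode(t)})
    names = {b: f"{prefix}{i}" for i, b in enumerate(bnodes, start=1)}
    return {(names.get(s, s), p, names.get(o, o)) for (s, p, o) in graph}

graph = {
    (":Mary", ":met", ":Bill"),
    (":Bill", ":nationality", "USA"),
    (":Mary", ":met", "_:x1"),
    ("_:x1", ":nationality", "IRAQ"),
    (":Mary", ":met", "_:x2"),
    ("_:x2", ":nationality", "IRAQ"),
}

leaned = lean(graph)            # one of the two bnodes is leaned away
skolemized = skolemize(graph)   # no bnodes left, so nothing can be leaned away

print(len(graph), len(leaned), len(lean(skolemized)))  # 6 4 6
```

Note that the surviving bnode is not mapped onto :Bill, since that would require a :Bill :nationality IRAQ triple that the graph does not contain.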
Even if it were recommended practice not to use bnodes in this way,
(1) they will get used this way, (2) it's not semantically incorrect,
and so (3) we should support it if at all possible.

Pat

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32502                (850)291 0667   cell
phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
Received on Saturday, 19 August 2006 06:47:27 UTC