Re: The Mary example (Re: Summary of BNode redundancy options (at the moment))

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 18 Aug 2006 23:47:12 -0700
Message-Id: <p06230958c10c6091309d@[192.168.1.6]>
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>

>On Aug 17, 2006, at 6:00 AM, Pat Hayes wrote:
>
>>:Mary :met :Bill
>>:Bill :nationality USA
>>:Mary :met _:x7
>>_:x7 :nationality IRAQ
>>
>>SELECT DISTINCT ?x [:Mary :met ?x]
>>
>>If you are a government agency trying to keep track of who talked 
>>to whom, it would be less than helpful to be told about Bill, but 
>>not *anything at all* about the nameless Iraqi who Mary has been in 
>>communication with, just because a semantic theory says that 
>>{:Mary, _:x7} is technically redundant. DISTINCT in this case was 
>>likely intended to mean, distinct *people*, and there is enough 
>>information here to enable a human reader to know that _:x7, 
>>whoever it is, is not Bill, even though an RDF engine might be too 
>>dumb to figure this out. Getting the answer set {Bill, _:x7} in 
>>this case tells you that there are two individuals about whom 
>>something detectable is recorded, and if we have told bnodes 
>>available then it allows a subsequent query to ask about the 
>>nationality of _:x7. (They might be the same, of course, but then 
>>so might :Bill and :Joe.)
>
>This is interesting because as I think about it, the more I become 
>convinced that this is poor modeling. At least on many scenarios 
>(i.e., the case is underdescribed). Given some remarks Pat (perhaps 
>privately) made about shifting the burden to data managers, I think 
>that I object to this modeling, and would not recommend it. There is a 
>much more sensible approach, especially for a vertical, curated 
>collection like a gov agency (yes, I know, they aren't that good, 
>but this isn't hard). BNodes are the wrong thing if this is the kind 
>of interpretation of the answers. For example:
>
>
>	:Mary :met :Bill
>	:Bill :nationality USA
>	:Mary :met :unknownPerson1
>	:unknownPerson1 :nationality IRAQ
>
>The more I think about it, the better it seems. I can encode all 
>sorts of information in the uri *or* the graph. It's easily stable. 
>It's easy to *talk* about.

What it doesn't do is make any distinction between known people and 
unknown people, though. You and I know that there isn't anyone called 
"unknownPerson", but reasoners don't. And we don't want to make 
being unknown a property of the actual person, since we might in fact 
know him. As you point out, it might be Bill.

Another style of modelling uses bnodes throughout but links them to 
names treated as literals:

_:x :name "Mary"
_:x :met _:y
_:x :met _:z
_:y :name "Bill"
_:y :nationality USA
_:z :nationality IRAQ

and no name for _:z, which is what makes him 'unknown'.  This is very 
much in the spirit of RDF collections and containers, of course.
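
As a rough sketch of how a query over this style might tell the named 
from the unnamed, here is a toy version in Python. The tuple encoding, 
the "_:" prefix test, and the helper names objects() and contacts() are 
all assumptions made for illustration; none of this is SPARQL or any 
real RDF library.

```python
# Toy triple store: each triple is (subject, predicate, object).
# Terms starting with "_:" stand for bnodes; names are plain Python strings.
GRAPH = {
    ("_:x", ":name", "Mary"),
    ("_:x", ":met", "_:y"),
    ("_:x", ":met", "_:z"),
    ("_:y", ":name", "Bill"),
    ("_:y", ":nationality", "USA"),
    ("_:z", ":nationality", "IRAQ"),
}


def objects(graph, subject, predicate):
    """Return every o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def contacts(graph, name):
    """Map each node met by someone called `name` to (name or None, nationality or None)."""
    found = {}
    for person in {s for s, p, o in graph if p == ":name" and o == name}:
        for other in objects(graph, person, ":met"):
            names = sorted(objects(graph, other, ":name"))
            nations = sorted(objects(graph, other, ":nationality"))
            found[other] = (
                names[0] if names else None,
                nations[0] if nations else None,
            )
    return found


for node, (known_as, nationality) in sorted(contacts(GRAPH, "Mary").items()):
    # A missing :name is what marks the node as 'unknown'.
    print(node, known_as or "<unknown>", nationality)
```

Being unknown is then a fact about what the graph records (no :name 
triple), not a property asserted of the person.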

>If I have any sort of equality reasoning it's pretty easy to merge 
>it when appropriate. It also allows for things like this:
>
>	:Mary :met :Bill
>	:Bill :nationality USA
>	:Mary :met :unknownPerson1
>	:unknownPerson1 :nationality IRAQ
>	:Mary :met :unknownPerson2
>	:unknownPerson2 :nationality IRAQ
>
>Which would get leaned away if the unknowns were bnodes.

*Could* get leaned away, yes. Although we could easily fix this with 
the bnode style.
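
To make the leaning point concrete, here is a brute-force sketch in 
Python. It takes a graph to be lean when no mapping of its bnodes to 
terms of the graph yields a proper subgraph. The tuple encoding, the 
"_:" prefix test, and the function name is_lean() are assumptions for 
illustration, and the exhaustive search is only sensible for tiny 
graphs like this one.

```python
from itertools import product


def is_bnode(term):
    """Treat any term starting with '_:' as a bnode (illustrative convention)."""
    return term.startswith("_:")


def is_lean(graph):
    """Return True if no bnode mapping sends the graph to a proper subgraph of itself."""
    terms = sorted({t for s, _, o in graph for t in (s, o)})
    bnodes = [t for t in terms if is_bnode(t)]
    # Try every assignment of bnodes to terms already present in the graph.
    for image in product(terms, repeat=len(bnodes)):
        mapping = dict(zip(bnodes, image))
        instance = {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}
        if instance < graph:  # proper subset: some triple was redundant
            return False
    return True


WITH_URIS = {
    (":Mary", ":met", ":Bill"),
    (":Bill", ":nationality", "USA"),
    (":Mary", ":met", ":unknownPerson1"),
    (":unknownPerson1", ":nationality", "IRAQ"),
    (":Mary", ":met", ":unknownPerson2"),
    (":unknownPerson2", ":nationality", "IRAQ"),
}

# Same graph with the two unknowns written as bnodes instead of URIs.
WITH_BNODES = {
    (s.replace(":unknownPerson", "_:u"), p, o.replace(":unknownPerson", "_:u"))
    for s, p, o in WITH_URIS
}

print(is_lean(WITH_URIS), is_lean(WITH_BNODES))
```

In the bnode version, mapping one unknown onto the other leaves a 
proper subgraph, so one of the two unnamed Iraqis is redundant. The URI 
version has no bnodes to map and so is trivially lean.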

>I agree that current practice uses BNodes exactly as if they were 
>these :unknownUris, but that just reinforces my overall point that 
>that interpretation is contrary to RDF semantics.

You might be right that it is better practice to use URIs. There are 
arguments both ways. But the bnode technique is not 'contrary' to the 
RDF semantics, and the introduction of the URIs amounts to 
skolemization, which is also a semantically well-understood 
technique. In some frameworks it is even valid :-) And there are 
other styles which are possible and even may have their advantages.
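
For what the skolemization step amounts to mechanically, a minimal 
sketch in Python follows. The function name skolemize(), the numbering 
scheme, and the ":unknownPerson" prefix are assumptions for 
illustration; a real system would need to guarantee that the minted 
URIs are fresh, which this toy does not check.

```python
def skolemize(graph, prefix=":unknownPerson"):
    """Replace each bnode (a term starting with '_:') with a numbered URI."""
    bnodes = sorted(
        {t for s, _, o in graph for t in (s, o) if t.startswith("_:")}
    )
    # Assumption: prefix + number does not clash with any existing URI.
    names = {b: f"{prefix}{i}" for i, b in enumerate(bnodes, start=1)}
    return {(names.get(s, s), p, names.get(o, o)) for s, p, o in graph}


ORIGINAL = {
    (":Mary", ":met", ":Bill"),
    (":Bill", ":nationality", "USA"),
    (":Mary", ":met", "_:x7"),
    ("_:x7", ":nationality", "IRAQ"),
}

for triple in sorted(skolemize(ORIGINAL)):
    print(*triple)
```

The skolemized graph entails the bnode graph, but not the other way 
round, which is why the step is not a valid inference in general.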

>  We either should change our semantics, stress this point strongly 
>in the documents and solicit serious feedback far and wide at many 
>different levels, or try to change RDF.

I see no reason to change anything here. Even if it were recommended 
practice not to use bnodes in this way, (1) they will get used this 
way, (2) it's not semantically incorrect, and so (3) we should 
support it if at all possible.

Pat



-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Saturday, 19 August 2006 06:47:27 GMT