RE: Implementing statement grouping, contexts, quads and scopes from pat hayes on 2002-06-27 (www-rdf-logic@w3.org from June 2002)

From: pat hayes <phayes@ai.uwf.edu>
Date: Thu, 27 Jun 2002 15:05:28 -0500
To: "Danny Ayers" <danny666@virgilio.it>
Cc: www-rdf-logic@w3.org
Message-Id: <p05111b21b9411c95c504@[65.217.30.113]>
>Hi guys,
>
>Returning to a literal interpretation of the subject line, I'd appreciate a
>bit of clarification - essentially, what's wrong with the approach to
>statement grouping I give below.

Well, first, there is no consensus anywhere here, so what follows is 
my 2c worth, is all.

>I'm sure this kind of approach has been
>discussed at length before, but it's very difficult to remember what has
>been suggested, let alone keep track of the consensus.
>
>Ok, so let's say we have a couple of statements :
>
>[urn:Sassi] --[a:hasSpecies]--> [a:cat]
>[urn:Sambuca] --[a:hasSpecies]--> [a:cat]
>
>So what do they mean? To begin with, they're assertions about a couple of
>specific resources (Sassi & Sambuca) that make use of a given vocabulary
>(a:).

OK so far.

>But to be able to do anything worthwhile with the statements we need
>to know the context of their assertion - they are asserted in this email.

Well, that is a very controversial claim. WHY do we need to know the 
provenance in order to do anything worthwhile? What is wrong with 
having as a general assumption that one can usually forget about 
provenances (or maybe check them once and then forget about them) and 
just draw conclusions?

>The url of the email in the archives (say email:765) can be used to identify
>this. So an application can know the provenance of the statements (the email
>in the archives has a record of the sender), and we have a form of
>grouping - the statements are both in the graph at the uri of this mail.

OK, maybe provenances are handy. Still, the chief trouble with this 
is that there are many other reasons, having nothing to do with 
provenance, for wanting to form statement groups, so the grouping 
machinery shouldn't be integrated with the provenance machinery.

>Years ago I overheard the police call in the name of someone they'd picked
>up to get further information (wasn't me guv ;-). The response was concise :
>known, not wanted. In the same fashion, if an application is aware of
>email:765 then it is known, and the statements above are available. This
>data can however remain 'not wanted'.
>
>In practice, an implementation could contain two knowledge bases - one
>'live' and one 'sandbox'. Each would contain a set of graphs, the difference
>being that the 'sandbox' was passive and 'live' active. By passive here I
>mean that it could be queried, but no logical inference would take place,
>unlike in the live space.

Well, this is all very interesting, but I have no idea why you want 
to go to all this trouble. What problem are you solving here? Seems 
to me that the basic picture should be much simpler. RDF graphs make 
assertions (most of the time); you can take any RDF anywhere you find 
it and draw valid conclusions from it. If you are somewhat paranoid, 
you might only want to use RDF from sources you trust: but in 
general, the trustworthiness of any conclusion is at least as good as 
that of the least trustworthy assumption you derive it from. I can 
see that your extra quaddish tags/colors/whatever might be one way to 
keep track of this, but there are many other possible ways.
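
To make the trust rule in the paragraph above concrete, here is a minimal
sketch that takes it at face value: a validly derived conclusion is treated
as at least as trustworthy as its least trustworthy assumption. The numeric
trust scores, the Premise record and the function name are all assumptions
made for illustration; nothing in this thread defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Premise:
    """A statement together with where it came from (hypothetical record)."""
    statement: str
    source: str
    trust: float  # assumed score in [0, 1]; higher means more trusted


def conclusion_trust_floor(premises):
    """Return the lower bound on trust for a conclusion drawn from `premises`.

    Following the rule stated above, the floor is simply the trust of the
    weakest premise used in the derivation.
    """
    premises = list(premises)
    if not premises:
        raise ValueError("a derived conclusion needs at least one premise")
    return min(p.trust for p in premises)


# Two assumptions from sources with different (made-up) trust scores.
used = [
    Premise("urn:Sassi a:hasSpecies a:cat", "email:765", 0.9),
    Premise("a:cat rdfs:subClassOf a:mammal", "a:", 0.4),
]
floor = conclusion_trust_floor(used)
```

Tags or colors on the triples would be one way to carry the `source` field
around; the calculation itself does not depend on how it is stored.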

>If the live space was asked about [urn:Sassi],
>then it could query the sandbox, and retrieve the statements above, both
>of which would carry an association with email:765. The graph of email:765
>could be examined to see if the data was wanted within the current terms of
>the live space, e.g. if the sender was trusted. If so, it can be merged into
>the live graph for inference. If not, the data is simply ignored. It might
>be that this application rejects the vocabulary used (a:), say it includes
>semantic extensions that this app doesn't understand.
>
>So essentially I'm suggesting two aspects to the context/grouping issue,
>firstly that in the wild (on the web or otherwise available through public
>interfaces) the triples exist implicitly as quads, i.e.
>
>[email:765] [urn:Sassi] [a:hasSpecies] [a:cat]
>
>and that whether or not triples get asserted remains entirely a local issue,
>decided by the implementation.
>
>Putting this more strongly, we only really have two contexts - 'in the wild'
>and 'in application X'.

No, I think it will have to get much more delicate than that. For 
example, I might want to trust any source that is warranted as 
trustworthy by an ontology that I trust, or is owned by the US 
government. Others might have very different criteria, of course. In 
fact I foresee a whole economy of trustworthiness starting to emerge, 
eventually.
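
As a rough sketch of how the quoted sandbox/live arrangement could sit
alongside trust criteria that vary from one application to the next: quads
are held passively, and a triple is asserted only when a locally chosen
policy accepts its graph. Every name here (Quad, KnowledgeBase, any_of, the
two example policies, the ENDORSED and OWNERS tables, email:999 and doc:42)
is hypothetical and invented for the example.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple tagged with the graph (e.g. an email URI) it was found in."""
    graph: str
    subject: str
    predicate: str
    obj: str


def any_of(*policies):
    """Combine policies: a graph is accepted if any one policy accepts it."""
    return lambda graph: any(policy(graph) for policy in policies)


class KnowledgeBase:
    def __init__(self, policy):
        self.sandbox = set()  # quads that are known but not asserted
        self.live = set()     # (subject, predicate, obj) triples in use
        self.policy = policy  # callable: graph URI -> bool

    def learn(self, quad):
        """Record a quad passively; nothing is asserted yet."""
        self.sandbox.add(quad)

    def ask(self, subject):
        """Promote acceptable sandbox quads about `subject`, then answer."""
        for quad in self.sandbox:
            if quad.subject == subject and self.policy(quad.graph):
                self.live.add((quad.subject, quad.predicate, quad.obj))
        return {triple for triple in self.live if triple[0] == subject}


# Made-up trust data standing in for "warranted by an ontology I trust"
# and "owned by the US government".
ENDORSED = {"email:765"}
OWNERS = {"doc:42": "US government"}


def endorsed(graph):
    return graph in ENDORSED


def government_owned(graph):
    return OWNERS.get(graph) == "US government"


kb = KnowledgeBase(any_of(endorsed, government_owned))
kb.learn(Quad("email:765", "urn:Sassi", "a:hasSpecies", "a:cat"))
kb.learn(Quad("email:999", "urn:Sassi", "a:hasSpecies", "a:dog"))
answer = kb.ask("urn:Sassi")
```

Here only the triple from email:765 reaches the live set; the one from
email:999 stays known but not wanted. Another application could pass a
different policy to the same class without touching the stored quads.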

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Thursday, 27 June 2002 16:05:28 UTC