RE: Implementing statement grouping, contexts, quads and scopes from Danny Ayers on 2002-06-27 (www-rdf-logic@w3.org from June 2002)

From: Danny Ayers <danny666@virgilio.it>
Date: Thu, 27 Jun 2002 23:49:54 +0200
To: "pat hayes" <phayes@ai.uwf.edu>
Cc: <www-rdf-logic@w3.org>
Message-ID: <EBEPLGMHCDOJJJPCFHEFKECOGNAA.danny666@virgilio.it>
>>But to be able to do anything worthwhile with the statements we need
>>to know the context of their assertion - they are asserted in this email.
>
>Well, that is a very controversial claim. WHY do we need to know the
>provenance in order to do anything worthwhile? What is wrong with
>having as a general assumption that one can usually forget about
>provenances (or maybe check them once and then forget about them) and
>just draw conclusions?

When did you last see the acronym GIGO?

>>The url of the email in the archives (say email:765) can be used to identify
>>this. So an application can know the provenance of the statements (the email
>>in the archives has a record of the sender), and we have a form of
>>grouping - the statements are both in the graph at the uri of this mail.
>
>OK, maybe provenances are handy. Still, the chief trouble with this
>is that there are many other reasons, having nothing to do with
>provenance, for wanting to form statement groups, so the grouping
>machinery shouldn't be integrated with the provenance machinery.

That sounds reasonable.

>>Years ago I overheard the police call in the name of someone they'd picked
>>up to get further information (wasn't me guv ;-). The response was concise:
>>known, not wanted. In the same fashion, if an application is aware of
>>email:765 then it is known, and the statements above are available. This
>>data can however remain 'not wanted'.
>>
>>In practice, an implementation could contain two knowledge bases - one
>>'live' and one 'sandbox'. Each would contain a set of graphs, the difference
>>being that the 'sandbox' was passive and 'live' active. By passive here I
>>mean that it could be queried, but no logical inference would take place,
>>unlike in the live space.
>
>Well, this is all very interesting, but I have no idea why you want
>to go to all this trouble. What problem are you solving here? Seems
>to me that the basic picture should be much simpler. RDF graphs make
>assertions (most of the time); you can take any RDF anywhere you find
>it and draw valid conclusions from it. If you are somewhat paranoid,
>you might only want to use RDF from sources you trust: but in
>general, the trustworthiness of any conclusion is at least as good as
>that of the least trustworthy assumption you derive it from.

I seem to have inadvertently led this into the world of trust.
Trustworthiness wasn't really the motivation behind the idea (although it
would be a bonus). You can take any RDF anywhere and draw valid conclusions
from it, but doesn't this take us back into the library without an index?

Using a sandbox/livebox combination would be a means to cheaper data filtering
(more 'is this relevant?' than 'do I trust this?'). I might get exactly the
same results from my inference engine if I included assertions relating to
'Upper Respiratory Infection' and 'Rapid Deployment Force' in the knowledge
base, but if I can be more selective about the assertions made, there's less
noise to slow things down. Although I take your point about the difference
between provenance and grouping, grouping by provenance is a common
occurrence, and should be useful.
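
To make the sandbox/live idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the class name TwoSpaceStore, the promote() and infer() methods, and the plain-string predicates "type" and "subClassOf" are hypothetical, and the single subclass rule is just a stand-in for whatever inference a real engine would run. Statements are held as triples grouped under the URI of the graph they came from, which is grouping by provenance in its simplest form.

```python
from collections import defaultdict


class TwoSpaceStore:
    """Hypothetical store with a passive 'sandbox' and an active 'live' space.

    Each space maps a graph URI (e.g. "email:765") to a set of (s, p, o) triples.
    """

    def __init__(self):
        self.sandbox = defaultdict(set)
        self.live = defaultdict(set)

    def add(self, graph, s, p, o):
        # Newly seen statements are 'known, not wanted': they go to the sandbox.
        self.sandbox[graph].add((s, p, o))

    def promote(self, graph):
        # Move a whole graph from the sandbox into the live space.
        self.live[graph] |= self.sandbox.pop(graph, set())

    def query(self, space, s=None, p=None, o=None):
        # Pattern match in one space; None acts as a wildcard.
        # Results are quads: (subject, predicate, object, graph URI).
        graphs = self.sandbox if space == "sandbox" else self.live
        return {
            (ts, tp, to, g)
            for g, triples in graphs.items()
            for (ts, tp, to) in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)
        }

    def infer(self):
        # Toy inference over the live space only:
        # (a type B) and (B subClassOf C) gives (a type C), repeated to a fixpoint.
        asserted = {t for triples in self.live.values() for t in triples}
        derived = set()
        changed = True
        while changed:
            changed = False
            known = asserted | derived
            for (a, p1, b) in known:
                if p1 != "type":
                    continue
                for (b2, p2, c) in known:
                    if p2 == "subClassOf" and b2 == b and (a, "type", c) not in known:
                        derived.add((a, "type", c))
                        changed = True
        return derived
```

The point of the split is that infer() never looks at the sandbox, so statements about 'Upper Respiratory Infection' can sit there, queryable, without adding to the work the inference step has to do.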

>I can
>see that your extra quaddish tags/colors/whatever might be one way to
>keep track of this, but there are many other possible ways.

Couple of examples?

[...]

>>Putting this more strongly, we only really have two contexts - 'in the wild'
>>and 'in application X'.
>
>No, I think it will have to get much more delicate than that. For
>example, I might want to trust any source that is warranted as
>trustworthy by an ontology that I trust, or is owned by the US
>government. Others might have very different criteria, of course. In
>fact I foresee a whole economy of trustworthiness starting to emerge,
>eventually.

Here you are acting as application X, operating on data found in the wild.
The rules for trust (which presumably would select which statements get
asserted) aren't inherent in the RDF.
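
As a rough illustration of that last point, the selection rule can live entirely in the application as an ordinary function over whatever it knows about each graph. This is only a sketch under assumptions: the metadata keys "owner" and "warranted_by", the TRUSTED_ONTOLOGIES set, and the function names are all hypothetical, with the criteria loosely modelled on the example quoted above.

```python
# Hypothetical: ontologies this particular application has chosen to trust.
TRUSTED_ONTOLOGIES = {"onto:trusted"}


def my_policy(meta):
    """One application's criteria; another application could plug in its own."""
    return (
        meta.get("owner") == "us-government"
        or meta.get("warranted_by") in TRUSTED_ONTOLOGIES
    )


def select_graphs(graphs, policy):
    """Return the URIs of the graphs that the given policy accepts.

    `graphs` maps a graph URI to a dict of metadata the application holds
    about that graph; nothing here is read from the RDF statements themselves.
    """
    return [uri for uri, meta in graphs.items() if policy(meta)]
```

Swapping in a different policy function changes which graphs get asserted without touching the data at all.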

(I knew I shouldn't have mentioned the police...)

Cheers,
Danny.
Received on Thursday, 27 June 2002 17:57:38 UTC