Re: Trust in statements (still is BioRDF Brainstorming) from Adrian Walker on 2008-02-14 (public-semweb-lifesci@w3.org from February 2008)

From: Adrian Walker <adriandwalker@gmail.com>
Date: Wed, 13 Feb 2008 19:37:55 -0500
To: "Chris Mungall" <cjm@fruitfly.org>
Cc: public-semweb-lifesci@w3.org
Message-ID: <1e89d6a40802131637n312bc14ai6fa7ca91a19c8253@mail.gmail.com>

Hi Chris --

You wrote...

 I think the only option here is to embrace rdf-reification (and to push for
better syntax, query and tool support).

Would the approach in question 8 of

   www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent

be useful ?

We map such queries automatically to SQL, and presumably the corresponding
SPARQL would be similar.
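
For readers who have not used it, here is a minimal sketch of what RDF reification with per-statement provenance could look like. It is an illustration in plain Python, not the approach at the URL above and not any particular triple store's API. The rdf:Statement, rdf:subject, rdf:predicate and rdf:object terms come from the RDF vocabulary; the ex: names, the reify and statements_with helpers, and the sample data are all hypothetical.

```python
from itertools import count

_ids = count(1)  # counter used to mint blank-node labels (illustrative only)


def reify(subject, predicate, obj, **annotations):
    """Return triples describing one statement, plus annotations about it."""
    stmt = f"_:stmt{next(_ids)}"
    triples = [
        (stmt, "rdf:type", "rdf:Statement"),
        (stmt, "rdf:subject", subject),
        (stmt, "rdf:predicate", predicate),
        (stmt, "rdf:object", obj),
    ]
    # Each annotation hangs off the statement node, not off a graph URI.
    triples += [(stmt, f"ex:{key}", value) for key, value in annotations.items()]
    return triples


def statements_with(triples, key, value):
    """Recover (s, p, o) for every reified statement carrying the annotation."""
    matching = {s for s, p, o in triples if p == f"ex:{key}" and o == value}
    result = []
    for stmt in sorted(matching):
        parts = {p: o for s, p, o in triples if s == stmt}
        result.append(
            (parts["rdf:subject"], parts["rdf:predicate"], parts["rdf:object"])
        )
    return result


# Hypothetical data: two statements, each with its own provenance.
store = []
store += reify("ex:geneA", "ex:interactsWith", "ex:geneB",
               source="ex:labX", belief="probable")
store += reify("ex:geneA", "ex:interactsWith", "ex:geneC",
               source="ex:labY", belief="doubtful")

print(statements_with(store, "source", "ex:labX"))
```

The point of the sketch is only that provenance travels with each statement, so n statements do not need n graph URIs. The cost is visible too: each annotated statement becomes four triples plus its annotations, which is part of why better syntax and tool support would help.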

HTH,   -- Adrian

Internet Business Logic
A Wiki and SOA Endpoint for Executable Open Vocabulary English
Online at www.reengineeringllc.com    Shared use is free

Adrian Walker
Reengineering



On 2/13/08, Chris Mungall <cjm@fruitfly.org> wrote:
>
>
>
> On Feb 13, 2008, at 2:14 PM, M. Scott Marshall wrote:
>
> >
> > Dear Matt,
> >
> > I see 'trust' as a 'view' that can be produced by running a filter
> > over
> > the data (provenance). The filter would implement my trust policy, or
> > one of them. In other words, my trust in a given 'agent' can be due to
> > the fact that it produces data using a certain algorithm. I also
> > place a
> > certain level of trust in the instrumentation that produced the data,
> > the p-values of an analysis in the processing pipeline, human
> > operators
> > involved, etc. So, the weights or confidence measures that you are
> > describing and that Alan is qualifying would be the *output* of such a
> > trust policy or filter. I would not besmirch the data with my own
> > personal trust models nor easily trust those of others. ;) I guess
> > that
> > what I'm trying to say is equivalent to Alan's point: I would
> > prefer to
> > keep facts and their evidence disclosed symbolically in the data so
> > that
> > different 'views' can take them into account.
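
A rough sketch of the 'trust as a view' idea described above, in plain Python. The facts stay untouched and carry only their evidence; each trust policy is a separate filter that yields its own view. The Fact fields, the algorithm names, the p-value threshold and the make_policy and view helpers are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    statement: str   # the asserted fact, kept free of any trust score
    algorithm: str   # provenance: which algorithm produced it
    p_value: float   # provenance: p-value reported by the analysis


def make_policy(trusted_algorithms, max_p_value):
    """Build one trust policy as a predicate over a fact's provenance."""
    def policy(fact):
        return fact.algorithm in trusted_algorithms and fact.p_value <= max_p_value
    return policy


def view(facts, policy):
    """Produce a 'view': the statements that pass the given trust policy."""
    return [fact.statement for fact in facts if policy(fact)]


# Hypothetical data with provenance recorded alongside each fact.
facts = [
    Fact("geneA interacts with geneB", "alignerV2", 0.01),
    Fact("geneA interacts with geneC", "alignerV1", 0.03),
    Fact("geneB interacts with geneC", "alignerV2", 0.20),
]

strict = make_policy({"alignerV2"}, 0.05)
lenient = make_policy({"alignerV1", "alignerV2"}, 0.05)

print(view(facts, strict))
print(view(facts, lenient))
```

Two people with different policies get different views of the same data, and neither has written a personal trust score into the data itself.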
> >
> > But, before I go to build such 'views' or filters, I will wait for
> > that
> > sort of information to become machine-readable as data provenance. :)
> >
> > However, I *can* try to make that sort of information available for
> > data
> > that I am helping to manage or produce. It seems that having a triple
> > store (such as Virtuoso) with named graph support would make it
> > possible
> > to produce several types of potentially useful data provenance.
>
> The problem with NGs (and especially with existing RDF support) is
> the close coupling between provenance and the URI from which the
> triples were obtained.
>
> If I wish to make available a collection of triples t1...tn where
> each triple has its own provenance information tp1...tpn then I have
> to have n URIs. If I serve up those triples through a SPARQL endpoint
> then the act of creating a new graph will lose all the original NG
> information.
>
> NGs are not directly supported in the RDF model and it's not clear
> how NGs would be accessed from an OWL-level API such as the OWLAPI.
>
> There are proposed extensions such as TriX/TriG - and there may be
> some relation between NGs and quoting in N3. However, AFAIK the
> meaning of these extensions in the OWL-DL formalism is not clear.
>
> I don't think NGs are so useful beyond SPARQL. I think the only
> option here is to embrace rdf-reification (and to push for better
> syntax, query and tool support). After all, this is how provenance at
> the OWL level will work in OWL 1.1 (i.e. annotating axioms).
>
> > -scott
> >
> > --
> > M. Scott Marshall
> > http://staff.science.uva.nl/~marshall
> > http://adaptivedisclosure.org
> >
> > Matt Williams wrote:
> >>
> >> Dear Alan,
> >>
> >> Thank you for making my point much more clearly than I managed. I'm a
> >> little wary of probabilities in situations like the one you
> >> describe, as
> >> it always seems a little hard to pin down what is meant by them. At
> >> least with the symbolic approach, you can give a short paragraph
> >> saying
> >> what you mean.
> >>
> >> I'll try and find a paper on the "p-modals" (possible, probable,
> >> etc.)
> >> and ways of combining them tomorrow and put a paragraph on the wiki.
> >>
> >> Matt
> >>
> >> Alan Ruttenberg wrote:
> >>> I'm personally fond of the symbolic approach - I think it is more
> >>> direct and easier to explain what is meant. It's harder to align
> >>> people to a numerical system, I would think, and it also provides a
> >>> false
> >>> sense of precision. Explanations are easier to understand as
> >>> well: "2
> >>> sources thought this probable, and 1 thought it doubtful" can be
> >>> grokked more easily than score: 70%
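
A small sketch of the kind of symbolic explanation described above, in plain Python. It counts the labels that sources attached to a statement and reports them in words instead of collapsing them into one number. The label scale reuses the example terms from elsewhere in this thread (probable, possible, doubtful, implausible); the summarize helper and the sample annotations are hypothetical.

```python
from collections import Counter

# Ordered from strongest to weakest belief; an assumed, illustrative scale.
SCALE = ["probable", "possible", "doubtful", "implausible"]


def summarize(annotations):
    """Turn a list of symbolic belief labels into a readable explanation."""
    counts = Counter(annotations)
    parts = [
        f"{counts[label]} thought this {label}"
        for label in SCALE
        if counts[label]
    ]
    return ", ".join(parts)


print(summarize(["probable", "doubtful", "probable"]))
```

Nothing here combines the labels into a verdict; it only keeps the disagreement visible, which is the property being argued for.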
> >>>
> >>> -Alan
> >>>
> >>> On Feb 12, 2008, at 4:03 PM, Matt Williams wrote:
> >>>
> >>>>
> >>>> Just a quick note that the 'trust' we place in an agent /could/ be
> >>>> described probabilistically, but could also be described logically.
> >>>> I'm assuming that the probabilities in the trust annotations are
> >>>> likely to be subjective probabilities (as we're unlikely to have
> >>>> enough
> >>>> data to generate objective probabilities for the degree of trust).
> >>>>
> >>>> If you ask people to annotate with probabilities, the next thing
> >>>> you
> >>>> might want to do is to define a set of common probabilities (10
> >>>> - 90,
> >>>> in 10% increments, for example).
> >>>>
> >>>> The alternative is that one could annotate a source, or agent, with
> >>>> our degree of belief, chosen from some dictionary of options
> >>>> (probable, possible, doubtful, implausible, etc.).
> >>>>
> >>>> Although there are some formal differences, the two approaches
> >>>> end up
> >>>> as something very similar. There is of course a great deal of
> >>>> work on
> >>>> managing conflicting annotations and levels of belief in the
> >>>> literature.
> >>>>
> >>>> Matt
> >>>>
> >>>> --
> >>>> http://acl.icnet.uk/~mw
> >>>> http://adhominem.blogsome.com/
> >>>> +44 (0)7834 899570
> >>>>
> >>>
> >>
> >
> >
> >
> >
>
>
>
Received on Thursday, 14 February 2008 00:38:16 UTC