Re: Homework and meeting tomorrow from Daniel Schwabe on 2020-04-01 (public-credibility@w3.org from April 2020)

From: Daniel Schwabe <dschwabe@gmail.com>
Date: Wed, 1 Apr 2020 14:34:46 -0300
To: Sandro Hawke <sandro@w3.org>
Cc: Credible Web CG <public-credibility@w3.org>
Message-Id: <6ADD955A-40E9-4297-B8F6-A107F02603DA@gmail.com>

Hi Sandro,
my concern with this approach is that the numbers don’t really have any meaning… How different/significant is a weight of 30 vs 40? 45? etc…

Many years ago we published a paper in WWW 2004 (https://www.researchgate.net/publication/221023321_A_hybrid_approach_for_searching_in_the_semantic_web), in which we used weights in a schema graph (T-Box in OWL terms, for those who know) to adjust search results in an instance graph (A-Box), using a spreading activation algorithm. We got some good results. In assigning weights to the schema graph, we first asked “experts” to assign a weight to each relation in the schema according to what they thought was the relation’s “relevance”. We also defined a computed measure based on the uniqueness of each relation in the schema (analogous to tf and idf for terms in classical IR). The computed measure performed much better than the human-assigned weights. Notice that in this case we could define a baseline measure to compare with.
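For those who want something concrete to poke at, here is a minimal sketch of the idea. It is not the algorithm from the paper: the idf-like formula, the decay factor, the function names and the toy triples are all assumptions made up for illustration. It only shows the general shape: relation weights computed from how rare each relation is in the instance data, then used to scale a simple spreading activation from a seed node.

```python
import math
from collections import Counter, defaultdict


def relation_weights(triples):
    """IDF-like weight per relation: rarer relations get higher weights.

    The formula is an assumption for illustration, not the paper's measure.
    """
    counts = Counter(rel for _, rel, _ in triples)
    total = len(triples)
    raw = {rel: math.log(1 + total / n) for rel, n in counts.items()}
    top = max(raw.values())
    return {rel: w / top for rel, w in raw.items()}  # normalise to (0, 1]


def spread_activation(triples, weights, seeds, decay=0.5, steps=2):
    """Propagate activation from seed nodes along weighted, undirected edges."""
    neighbours = defaultdict(list)
    for s, rel, o in triples:
        neighbours[s].append((o, weights[rel]))
        neighbours[o].append((s, weights[rel]))

    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for other, w in neighbours[node]:
                # Each hop is damped by the relation weight and the decay.
                next_frontier[other] += energy * w * decay
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical instance data: (subject, relation, object)
triples = [
    ("paper1", "authoredBy", "alice"),
    ("paper2", "authoredBy", "alice"),
    ("paper3", "authoredBy", "bob"),
    ("paper1", "wonAward", "bestPaper"),
]

weights = relation_weights(triples)
activation = spread_activation(triples, weights, seeds=["paper1"])
```

In this toy data "wonAward" occurs once and "authoredBy" three times, so the rarer relation gets the larger weight and passes more activation on to its neighbour. The point relevant to this thread is that nobody had to guess whether a relation deserves a 30 or a 40.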

Our conclusion was that humans are simply not capable of assigning such quantitative measures, because the numbers have no real meaning. The best they can be used for is defining some (partial) ordering of importance among relations, which can also be done without numbers. We also have no theoretical support for combining such numbers; the combinations become mathematical formulae that are impossible to interpret.

I have my doubts about applying a similar approach to phenomena such as trust and reputation, which are inherently subjective (and this brings another set of issues as well).

We can elaborate more during our meeting, if you think appropriate.

Best
D
—

Daniel Schwabe                      Dept. de Informatica, PUC-Rio
Tel:+55-21-3527 1500 r. 4356        R. M. de S. Vicente, 225
Fax: +55-21-3527 1530               Rio de Janeiro, RJ 22453-900, Brasil
http://www.inf.puc-rio.br/~dschwabe

> On 31 Mar 2020, at 14:55, Sandro Hawke <sandro@w3.org> wrote:
> 
> I made a tool for playing with credibility networks. It's not done, but it's already interesting: https://credweb.org/viewer/
> 
> <jbckcjpbljpmkgdc.png>
> 
> I'd love to feed it some real data. It would help if people (that means you) would make some public credibility statements. I did some here: Sandro - Contributions to Public Information <https://docs.google.com/document/d/1ShiK_Pkd46foPbWCayfkUh3UV5Bhd5KHI5SKYoUIkiI/edit>.  Feel free to use that as a starting point, copying and editing. When you've got even a couple statements, make it public and send me an email so I can link to it. I think perhaps the most interesting part of this is to dig into why people who are in general agreement might disagree about the credibility of sources, and how folks should respond when that happens.
> 
> Meeting tomorrow at the usual time, 1 April 2020 2pm ET <https://www.timeanddate.com/worldclock/fixedtime.html?msg=CredWeb&iso=20200401T14&p1=43&ah=1>, and the usual place, https://zoom.us/j/706868147, to talk more about this stuff. Agenda <https://docs.google.com/document/d/1SAH4u21D16oGtP2CVKnxgd6h4gGcSyJxHsuGtolujpM/edit>.
> 
> Thanks!
> 
>       -- Sandro
> 
> 
>

Received on Wednesday, 1 April 2020 17:35:07 UTC