
RE: Solomon's curse and search Bias - a possible solution

From: <hans.teijgeler@quicknet.nl>
Date: Sun, 3 Mar 2019 13:56:56 +0100
To: "'Frank Manola'" <fmanola@verizon.net>, "'Paola Di Maio'" <paoladimaio10@gmail.com>
Cc: "'Carl Wimmer'" <carl@correlationconcepts.com>, "'SW-forum'" <semantic-web@w3.org>
Message-ID: <004901d4d1c0$97fb93c0$c7f2bb40$@quicknet.nl>
+1  good enough for me:



From: Frank Manola <fmanola@verizon.net> 
Sent: Sunday, 3 March 2019 12:29
To: Paola Di Maio <paoladimaio10@gmail.com>
Cc: Carl Wimmer <carl@correlationconcepts.com>; SW-forum <semantic-web@w3.org>
Subject: Re: Solomon's curse and search Bias - a possible solution


You can certainly ask for whatever new technology you want, but characterizing existing search capabilities as “very wrong”, “unbelievably misleading”, “not acceptable”, and “very poor performance” seems a tad over the top.  When I ask Google for “mcdonalds near me”, I’m content not to have Google tell me that Old McDonald’s farm is across the street.

Sent from my iPad

On Mar 3, 2019, at 5:35 AM, Paola Di Maio <paoladimaio10@gmail.com> wrote:

Thank you Frank (and Penaloza)-


Well, future generations are one concern. Learning critical thinking and disambiguation is fundamental, but it neither answers nor solves the problem that something is very wrong with the technology.

*I assume I just stumbled upon an instance of a known problem, sorry if this is no news



It seems to me one solution to your problem is to teach future generations how to do disambiguation (which applies to a lot more than just searches).  


yes  definitely!!   at the same time.....


Quite advanced Knowledge Representation and Information Retrieval mechanisms exist that can support more meaningfully sorted web search results. To point to one derivative result (or a set of results) while completely excluding the source of the original concept is unbelievably misleading and not acceptable in technology terms.


Pardon me if this is obvious to everyone, but I had not realised that information retrieval is getting worse in this sense. How can this be?


And in the old days if you’d gone to a librarian with your search, the librarian would certainly have asked for more search terms (“Do you mean the book or the king in the Bible?”).  


Sure, for every term/concept there may be a need for some disambiguation (book, edition, version, issue, etc.).


But if I enter the search term 'planet earth', there is no reason not to expect adequate pointers to the main entity, and not only to corresponding brand names and everything else that has the same name.



If we are worried about future generations, the solution is not to change the technology to fix all problems (impossible), but rather to teach critical thinking and reading and searching skills. 


We may not be able to fix all problems, but as web engineers and scientists it is our duty to develop systems that are as accurate as possible. The problem here is not the technology, which is perfectly capable of sorting information, but how and why it is deployed so poorly that it creates disinformation.


Critical thinking skills are necessary, but that is no justification for very poor performance.


Let's keep on increasing awareness about the bias in our lives






Sent from my iPad

On Mar 3, 2019, at 2:53 AM, Paola Di Maio <paola.dimaio@gmail.com> wrote:


thanks for reply-



On Sun, Mar 3, 2019 at 2:47 PM Carl Wimmer <carl@correlationconcepts.com> wrote:

Someone has already made a decision as to what is important and what can be ignored. The practical result for you is that any elements already deemed unimportant are instantly invisible to you, with no way of bringing them to the surface.


Well, in this case, who would be 'someone'? Isn't it an algorithmic bias?

You used Google. Now, of course, Google can never be accused of filtering any results to their own benefit. That would be unthinkable. Usurping results for some self serving economic or political gain is dastardly and I am sure they would never do it.

What search engine do you use?


What is required is a complete shift from Search (index and heuristics) as a means of addressing information.


Sounds interesting, but I'd be tempted to build that on top of index-based search rather than attempting to replace it.


The first requirement is a system that can assemble a complete set of possibilities in response to any query, simple or complex.


OK, I accept that. At the same time, we have spent several decades attempting to find some agreement on what we can consider 'complete', taking the universe as the top set.


What about ... a system that can assemble an economically computable, configurable, and transparently accountable set of possibilities?

By that I mean: do show the book entitled Solomon Curse in the results, but POINT to where the book got its name from. This kind of provenance/traceability can be easily automated on today's web, no? For example: search result 1 >>> relation to >>> result 2, etc., where the relation can be anything from 'sounds like' to 'is a parody of' to 'inspired by', etc.
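The "result 1 >>> relation to >>> result 2" idea above might be sketched roughly as follows. This is only an illustration under stated assumptions: the `Result` class, the `with_sources` and `render` helpers, the "named after" relation, and the sample entries are all hypothetical and do not describe how any real search engine works.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    title: str
    kind: str
    # Each relation is (relation_name, title_of_related_result).
    relations: list = field(default_factory=list)


def with_sources(results):
    """Append any related result that is referenced but missing from the list."""
    titles = {r.title for r in results}
    completed = list(results)
    for r in results:
        for _, target in r.relations:
            if target not in titles:
                completed.append(Result(target, "source concept"))
                titles.add(target)
    return completed


def render(results):
    """One display line per result, with its relations spelled out."""
    lines = []
    for r in results:
        line = f"{r.title} [{r.kind}]"
        for relation, target in r.relations:
            line += f" >>> {relation} >>> {target}"
        lines.append(line)
    return lines


# Hypothetical input: a result page that shows only the book.
page = [Result("Solomon Curse", "book", [("named after", "Solomon (king)")])]
for line in render(with_sources(page)):
    print(line)
```

The point of `with_sources` is that a derivative result never appears without the entry it derives from; the relation name could just as well be 'sounds like', 'is a parody of', or 'inspired by'.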

The second requirement is that the user can select from a list (hopefully a very long list) of filtering tools to derive truth from connections/possibilities.


He or she might wish to see the query results from a variety of viewpoints to gain perspective.

Let me give an example to illustrate:

Two facts are in evidence ... 

The Alpha Motor Car Company made 100 million in profit last year .... and ...

they fired 5,000 workers.

Now come the viewpoints to interpret the two facts.

From the worker's union point of view (schema) ... those bastards, they made a hundred million and they fired 5,000 of the guys that made that profit possible for them.

From the shareholder's  point of view (schema) ... we only made 100 million on all that investment, .. fire 5,000 more workers.

From management's point of view (schema) ... Well, how we managed to make any cars at all at the outrageous wages demanded by the union is a miracle. The only reason we were able to sell any of those cars was because we surrendered to the low offers made by the customers, squeezing us from the top. We managed to get some designs for products for next year and we ground out 100 million in profit.

I like viewpoints, but they need some work to be implemented on the open web ... assuming there shall be one ...


Not as good as Toyota down the street but better than GM up the block. All in all, not a bad year.

Now you see the framework for the solution to your problem. The schemas are not used to derive the possibilities (that has to be done by a new system of addressing information) but they are used to sort and qualify the results from as many different points of view as possible to gain real perspective.
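A minimal sketch of that last point, assuming a viewpoint (schema) can be reduced to a table of topic weights: the same two facts are kept intact and only their ordering changes per viewpoint. The weight values, the `topic` labels, and the `rank` helper are assumptions made for illustration, not part of any existing standard.

```python
# The two facts from the example; "topic" is a hypothetical label used for weighting.
facts = [
    {"statement": "made 100 million in profit", "topic": "profit"},
    {"statement": "fired 5,000 workers", "topic": "jobs"},
]

# Hypothetical viewpoint schemas: how much each topic matters to each viewpoint.
viewpoints = {
    "union": {"jobs": 2, "profit": 1},
    "shareholder": {"profit": 2, "jobs": 1},
}


def rank(facts, viewpoint):
    """Sort the same facts by the weights of one viewpoint; nothing is filtered out."""
    weights = viewpoints[viewpoint]
    return sorted(facts, key=lambda f: weights.get(f["topic"], 0), reverse=True)


for name in viewpoints:
    ordered = [f["statement"] for f in rank(facts, name)]
    print(name, "->", ordered)
```

Because `rank` only sorts, every viewpoint still sees the complete set of facts, which matches the separation described above between deriving the possibilities and qualifying them.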

viewpoints are a technical standard which could be one way to solve this bias



thanks Carl 

Good question

On 3/2/2019 9:07 PM, Paola Di Maio wrote:


I wanted to share a concern, as I know posts get read and issues get picked up and addressed in time.


I searched Google today for Solomon Curse, trying to find some references to historical causes and conditions in the first house of David, not in relation to a specific race but in relation to the history of the modern world, to see if anyone is following up the courses and recourses of history.




Well, I was shocked to see that the first page of results was all about a book and its author, and nothing about history came up at all. I had to add additional words to create some context to dig up some historical references.


Just wanted to point out that I am very concerned about future generations receiving a distorted version of history through heavily commercially biased search results, where typing some search terms returns only or mostly results from one entity rather than a representation of the plurality of meanings and contexts.


Bias is a known problem in searches, however I was hoping that by now we would have

some mechanisms to reduce this bias? Doesn't look like it.


I hope that schema.org could help with that by creating metaschemas for disambiguation or other mechanisms, such as a representation of context that includes at least two perspectives: the domain in which a search term is present, and the time/chronology (to show which came first).
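As a rough illustration of those two perspectives, the senses of a search term could each carry a domain and a chronological ordering key. Everything here is an assumption: the `Sense` class, its field names, and the sample values are hypothetical, the `first_seen` numbers are placeholder ordering keys rather than real dates, and none of this is an existing schema.org vocabulary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sense:
    label: str
    domain: str
    first_seen: int  # placeholder ordering key, NOT a real date


def by_domain(senses):
    """Group sense labels by domain, oldest sense first within and across domains."""
    groups = defaultdict(list)
    for s in sorted(senses, key=lambda s: s.first_seen):
        groups[s.domain].append(s.label)
    return dict(groups)


def earliest(senses):
    """Return the label of the sense that came first."""
    return min(senses, key=lambda s: s.first_seen).label


# Hypothetical senses of one search term.
senses = [
    Sense("book title", "publishing", 2),
    Sense("historical reference", "history", 1),
]
print(by_domain(senses))
print(earliest(senses))
```

With this kind of context attached, a result page could show one entry per domain and mark which sense came first, instead of listing only the most commercially prominent one.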


Just a Sunday morning note before digging into more confusing knowledge from search results.






Received on Sunday, 3 March 2019 12:57:32 UTC

This archive was generated by hypermail 2.3.1 : Sunday, 3 March 2019 12:57:33 UTC