Re: Producting explainable reconciliation score from Ivan Bashkirov on 2020-07-14 (public-reconciliation@w3.org from July 2020)

From: Ivan Bashkirov <ivan.bashkirov@opencorporates.com>
Date: Wed, 15 Jul 2020 00:09:46 +0100
To: Antonin Delpeuch <antonin@delpeuch.eu>
Cc: public-reconciliation@w3.org
Message-ID: <CAD_fveP7_-mUZE2a+orbLehxS=yXShC9mgUP-uK+4bzFvezmuw@mail.gmail.com>
Thanks Fabian for pointing to the code, we are using somewhat similar
approach, where we are calculating the score, and then the match/no match
decision is in part influenced by relationship between top scores. One
thing we are trying to solve though is to make it clear how the score is
calculated. Antonin, I had a quick look at that thread and exposing
features looks like a promising approach. I'll try to comment on it once I
have a closer look at the pr. Thanks again for everyone's input!

Ivan

On Tue, 14 Jul 2020 at 23:35, Antonin Delpeuch <antonin@delpeuch.eu> wrote:

> Hi all,
>
> Thank you for bringing up this very important topic!
>
> I proposed a while back to make it possible for services to expose the
> reconciliation features that they use to cook up their final score:
>
> https://github.com/reconciliation-api/specs/pull/38
>
> This does not actually require any sort of relationship between the
> existing global score and these features (the service may still come up
> with candidate scores however it wishes, it does not have to be a linear
> combination of the features it returns).
>
> Do you think this goes in the right direction? I have been reluctant to
> merge this only with Fabian's green light, it would be great to have
> more voices there.
>
> Antonin
>
> On 13/07/2020 11:41, Fabian Steeg wrote:
> > Hi Ivan,
> >
> > probably not very interesting for you, since it sounds like you have
> > the same approach, we also use weighted fields for reconciling names
> > (in https://lobid.org/gnd/reconcile). This all happens to be on the
> > Elasticsearch query level in our service, meaning the weighting
> > happens in the query, see:
> >
> >
> https://github.com/hbz/lobid-gnd/blob/a9bba80a23e26a4c812964424b6c89457e4a3103/app/controllers/Reconcile.java#L451
> >
> >
> > The score is then passed from Elasticsearch to the Reconciliation
> > client. So in our case the Elasticsearch explain API you mentioned
> > could actually explain how the Reconciliation score was determined. We
> > never considered passing that to the user, but that might be useful.
> >
> > Cheers,
> > Fabian
> >
> > On 7/13/20 8:38 AM, Ivan Bashkirov wrote:
> >> Thank you for the responses so far. Tom, you are right about that
> >> Elasticsearch endpoint being about normalization pipeline rather than
> >> scoring, I sent the wrong one! They have "explain api"
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-explain.html
>  which
> >> is what I had in mind. It breaks up the score into individual
> >> components so you can trace how they arrived at the final number.
> >>
> >> Ivan
> >>
> >> On Mon, 13 Jul 2020 at 02:51, Tom Morris <tfmorris@gmail.com
> >> <mailto:tfmorris@gmail.com>> wrote:
> >>
> >>     Hi Ivan,
> >>
> >>     On Sun, Jul 12, 2020 at 10:00 AM Ivan Bashkirov
> >>     <ivan.bashkirov@opencorporates.com
> >>     <mailto:ivan.bashkirov@opencorporates.com>> wrote:
> >>
> >>         Hi all, I have a question about approaches services are using to
> >>         produce a reconciliation score that is meaningful to the end
> >> users.
> >>
> >>         Crucially, we want the users to know why the score is what it
> >>         is, and how they can make it better.  As I understand, most
> >>         reconciliation services produce a somewhat abstract score from 0
> >>         to 100 that roughly translates as "confidence", or "probability"
> >>         that the result is the one a user is looking for. It would be
> >>         great to hear what strategies people are using to produce the
> >>         score.
> >>     ...
> >>
> >>         In our case, we are doing company entity reconciliation.. We are
> >>         experimenting with parameters that include company name (score
> >>         varies depending on how closely the query string is matching the
> >>         candidate), address, active/inactive status, whether a company
> >>         is a branch or not and so on. Each parameter has a weighting and
> >>         the final score is more or less a weighted sum of those.
> >>
> >>
> >>     A weighted/scaled distance metric is pretty typical. Obviously the
> >>     weights are of critical importance. I think there are a few
> >>     different things that it's valuable to convey to the user, if
> >> possible:
> >>
> >>       * Ranking of the returned choices - this only depends on relative
> >>         scores, not their absolute values
> >>       * Confusable candidates - it's valuable if the relative scores
> >>         help distinguish cases that might require more careful checking
> >>         from those that can be automatically trusted
> >>       * Low quality candidates - it's valuable to have some type of
> >>         threshold, whether it be fixed or something that the users learn
> >>         based on their experience..
> >>         Finally, as far as I can see there is nothing in Reconciliation
> >>         API that offers score explainability. Of course documentation
> >>         for each particular reconciliation service would likely be the
> >>         primary machanism of explaining how the score is produced. But
> >>         I'm wondering if there is value of baking something like that
> >>         directly into Reconciliation API.. Has this been discussed? I am
> >>         getting inspiration from Elasticsearch `_analyze` endpoint which
> >>         produces a breakdown of the score.
> >>
> >>
> https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html
> >>
> >>
> >>     I read that as more an explanation of the
> >>     transformation/normalization pipeline than the scoring mechanism.
> >>     This makes sense for Elasticsearch, because you can construct chains
> >>     of transformations which are hidden in the background. For scoring
> >>     however, they take a different approach and put the power in the
> >>     users hand by allowing them to construct complex queries with their
> >>     own weighting algorithms embedded. I suspect that's too
> >>     sophisticated for most of the users of reconciliation services, but
> >>     perhaps there are simple controls like choosing among exact, prefix,
> >>     and approximate string matches, etc.
> >>
> >>     I'll be interested to hear the kinds of scoring metrics people have
> >>     implemented. My gut feeling is that most of them are pretty basic.
> >>
> >>     Tom
> >>
> >
>
>
>
Received on Tuesday, 14 July 2020 23:18:54 UTC