Re: Reconciling without names

Hi Tom,

Thanks a lot! If you remember how namespaced services worked in
OpenRefine, that would help us work out how to offer the same sort of
functionality via the standard reconciliation API.

For me, the main hurdle is that reconciliation data must be stored in a
certain column, so if we have multiple columns for properties but no
column for the name, where should we store this information?
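
To make the situation concrete, here is a minimal sketch of how a client
might build a query from a row that has property columns but no name
column. It is not taken from OpenRefine or any existing client: the
function name build_query and the column-to-property mapping are
hypothetical. The "query", "properties", "pid" and "v" keys and the
Ringgold property P3500 come from the example quoted further down in
this thread. It only shows the query side, not where the result would
be stored.

```python
import json


def build_query(row, property_columns, name_column=None):
    """Build one reconciliation query from a table row.

    row: dict mapping column name -> cell value
    property_columns: dict mapping column name -> property id
        (hypothetical mapping chosen by the user)
    name_column: optional name of the column holding the entity name
    """
    # One property constraint per non-empty property cell.
    properties = [
        {"pid": pid, "v": row[column]}
        for column, pid in property_columns.items()
        if row.get(column)
    ]

    query = {}
    name = row.get(name_column) if name_column else None
    if name:
        # The textual query is only included when a name is available.
        query["query"] = name
    if properties:
        query["properties"] = properties

    # Without a name, at least one property has to be supplied.
    if "query" not in query and not properties:
        raise ValueError("need a name or at least one property")
    return query


# A row with only an identifier column and no name column.
row = {"ringgold": "53616"}
print(json.dumps(build_query(row, {"ringgold": "P3500"})))
```

The resulting query simply has no "query" key, so no placeholder name is
needed.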

Best,

Antonin

On 11/10/2020 22:14, Tom Morris wrote:
> A search string definitely shouldn't be mandatory. Omitting the
> parameter or specifying the empty string for it both seem to be fine
> solutions.
>
> For historical context, the type of strong identifier reconciliation
> that's being discussed here is what was called a "Namespaced"
> reconciliation service, which had a separate endpoint, because
> Freebase didn't use properties to store identifiers. That's why you
> don't see this use case handled by the standard reconciliation
> endpoint.
>
> In Freebase, each entity just had an arbitrary number of identifiers,
> which could be used interchangeably.
> Freebase namespaces are described here:
> https://web.archive.org/web/20160526012918/http://wiki.freebase.com:80/wiki/Namespace
> but they're basically just a hierarchical path-based namespace tree
> rooted at /. Some of the interesting namespaces were things like:
> - /wikipedia/en (URL slug)
> - /wikipedia/en_id (numeric id)
> - /authority/imdb
> - /authority/musicbrainz
>
> Tom
>
> On Wed, Oct 7, 2020 at 9:19 AM Antonin Delpeuch <antonin@delpeuch.eu> wrote:
>> Hi all,
>>
>> I realized today that as far as the specs are concerned, it's not too
>> hard to support this use case, so I gave it a go:
>>
>> https://github.com/reconciliation-api/specs/pull/53
>>
>> This just makes the textual "query" field optional (but requires at
>> least a property if it is not supplied).
>>
>> Best,
>>
>> Antonin
>>
>> On 11/11/2019 18:40, Shaw, Ryan wrote:
>>> Re: reconciling without names, we have another use case, one that does not involve identifiers.
>>>
>>> The PeriodO project enables reconciliation against a list of scholarly definitions of historical periods.[1] Some people want to reconcile a list of period names plus additional qualifiers (spatial coverage and temporal extent), which works with the current Refine model. But others have only those "additional" qualifiers, and no period names. Currently we can't support them without some hack of the kind Antonin describes.
>>>
>>> Ryan
>>>
>>> [1] https://github.com/periodo/periodo-reconciler#how-reconciliation-works
>>>
>>>> On Nov 10, 2019, at 6:45 AM, Antonin Delpeuch <antonin@delpeuch.eu> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I would like to discuss one use case that I think the current API does
>>>> not serve well.
>>>>
>>>> The reconciliation API assumes that reconciliation queries contain at
>>>> least a name (as basic search query). It is then possible to add other
>>>> constraints (type, properties). However in many situations the user does
>>>> not have any name to supply.
>>>>
>>>> The most common situation where this is a problem is when the user has
>>>> access to a unique identifier, supported by the reconciliation service
>>>> as a property. Supplying such a unique identifier as a property in a
>>>> reconciliation query should be enough to identify the candidates, but in
>>>> the current API a name also has to be provided.
>>>>
>>>> Currently, for Wikidata reconciliation, unique identifiers have priority
>>>> over the search query, so if you find yourself in this situation you can
>>>> use any random gibberish as name. If there is an exact match via some
>>>> unique identifier that you supplied, the name will be ignored.
>>>>
>>>> Example:
>>>> {"query":"a347682ebf327cbd37e834","properties":[{"pid":"P3500","v":"53616"}]}
>>>> will return the item with Ringgold identifier "53616" even if the query
>>>> has nothing to do with its name.
>>>>
>>>> However, that is not very intuitive or user-friendly… So in a future
>>>> version of the protocol I would be interested in making that sort of
>>>> query more natural.
>>>>
>>>> Any thoughts about how it should work?
>>>>
>>>> Cheers,
>>>> Antonin
>>>>
>>>>

Received on Friday, 16 October 2020 13:18:30 UTC