Re: Meaning of identifierSpace and schemaSpace

Hi Antonin,

That sounds good to me, though the reader might still ponder, "why?" So
maybe another sentence or two would provide a rationale. We could also note
that while identifierSpace and schemaSpace are URIs, the IDs in those
spaces (such as Freebase IDs) may not be URIs.
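
As one possible rationale, here is a minimal Python sketch of how a client
might check schemaSpace before applying scheme-specific processing to a type
ID. The function name infer_domain and the constant FREEBASE_SPACE are
hypothetical, chosen only for illustration, and the "P31" input stands in for
any ID from a scheme without path-based syntax. This is not taken from
OpenRefine's code.

```python
from typing import Optional

# Assumption: the Freebase space URI quoted elsewhere in this thread.
FREEBASE_SPACE = "http://rdf.freebase.com/ns/type.object.id"


def infer_domain(type_id: str, schema_space: str) -> Optional[str]:
    """Return the domain containing a type ID, or None if it cannot be inferred.

    Path-based processing is only applied when the service declares the
    Freebase schema space; other schemes may not be path-based at all.
    """
    if schema_space != FREEBASE_SPACE:
        return None
    # Drop the last segment together with the slash that precedes it.
    head, sep, _ = type_id.rpartition("/")
    if not sep or not head:
        return None
    return head


print(infer_domain("/people/person", FREEBASE_SPACE))
print(infer_domain("P31", "http://www.wikidata.org/prop/direct/"))
```

Note that "/people/person" is an ID within the space and is not itself a URI,
while the space it belongs to is identified by one.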

Best,

David


On Thu, Sep 26, 2019 at 12:03 AM Antonin Delpeuch <antonin@delpeuch.eu>
wrote:

> Hi David,
>
> Thank you very much for chiming in!
>
> In our specs, I think it would be useful if we could formulate the
> definition of these parameters without any reference to Freebase. So
> perhaps something fairly generic, such as "This pair of URIs defines the
> identifier scheme used by the reconciliation service", or something along
> these lines?
>
> Antonin
> On 26/09/2019 03:50, David Huynh wrote:
>
> Thad et al.,
>
> My memory is also fading, but as far as I can recall, the intention is
> that, since Freebase's identifier scheme (for both instances and schemas)
> is only one possible way of formulating identifiers, we have to allow for
> other identifier schemes in order to make the reconciliation API extensible
> and future-proof. identifierSpace and schemaSpace specify what schemes
> instance identifiers and schema identifiers follow so that software like
> OpenRefine can programmatically process those identifiers. For example,
> given a type ID in Freebase's schema ID space, e.g. /people/person,
> OpenRefine can remove the last segment including the last slash / and infer
> the ID of the domain containing that type, namely /people. Other identifier
> schemes may or may not have such path-based syntax.
>
> Let me know if that makes sense.
>
> Thanks,
>
> David
>
> On Wed, Sep 25, 2019 at 4:57 AM Thad Guidry <thadguidry@gmail.com> wrote:
>
>> David,
>>
>> (sorry for pulling you into this convo)
>>
>> Do you happen to have any additional knowledge or clarification for the
>> definitions of identifierSpace and schemaSpace in Reconciliation? (my
>> memory is fading fast)
>>
>> Thad
>> https://www.linkedin.com/in/thadguidry/
>>
>>
>> On Tue, Sep 24, 2019 at 9:07 AM Thad Guidry <thadguidry@gmail.com> wrote:
>>
>>> Historically in Freebase the Types themselves were grouped under Domains
>>> sometimes.
>>>
>>> We had Domains like /business and /music etc.
>>> and each Type in the Domain (like "artist") had an ID that showed which
>>> domain it was under like this Type "/music/artist".
>>>
>>> Then the more general concept was namespaces and keys
>>> https://developers.google.com/freebase/guide/basic_concepts
>>>
>>> Wikidata Items directly translate to Freebase Topic MIDs ( like
>>> /m/0dgw9r <https://tools.wmflabs.org/freebase/m/0dgw9r> )
>>>
>>> Old MQL ... but this might help you understand... When you inserted a
>>> new value into Freebase you would get back the identifier MID of the
>>> triple, in this case upon inserting the key value "eol" into the "/biology"
>>> namespace (which is essentially a Domain)
>>>
>>> [{
>>>   "id":   "/m/0cnstys",
>>>   "type": "/type/namespace",
>>>   "key": [{
>>>     "connect":   "insert",
>>>     "namespace": "/biology",
>>>     "value":     "eol"
>>>   }]
>>> }]
>>>
>>>
>>> https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority|sort:date/freebase-discuss/qiKeBzTMsks/rDtwPUNDClQJ
>>>
>>>
>>>
>>> https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority|sort:date/freebase-discuss/WdwYwZKSLtM/6Qc5EosrG28J
>>>
>>>
>>>
>>> https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority|sort:date/freebase-discuss/OoRk6BsxWyQ/jCXs-PSn3woJ
>>>
>>>
>>>
>>> https://groups.google.com/forum/#!searchin/freebase-discuss/mql$20namespace|sort:date/freebase-discuss/KMQRKrcTYck/zPKBnyncOj4J
>>>
>>>
>>> *My initial reaction is that the schemaSpace is more about Domains and
>>> that the identifierSpace is about namespaces and keys* (where we stored
>>> unique identifiers; this was the old soft key system).
>>> Since Wikidata has no direct Types or Domains, there is nothing to
>>> translate there I think.
>>>
>>> Thad
>>> https://www.linkedin.com/in/thadguidry/
>>>
>>>
>>> On Tue, Sep 24, 2019 at 5:15 AM Antonin Delpeuch <antonin@delpeuch.eu>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have a question about the exact meaning of the identifierSpace and
>>>> schemaSpace fields in reconciliation services. This is one thing I would
>>>> like to document better in the specifications.
>>>>
>>>> My understanding is that these are URIs which represent the type of
>>>> entities and properties used by the service. Can we have a more precise
>>>> definition than that?
>>>>
>>>> The canonical value for both of these is
>>>> "http://rdf.freebase.com/ns/type.object.id", which was used in the
>>>> Freebase reconciliation service, but can of course no longer be accessed
>>>> (if it ever was? sometimes URIs are not meant to be resolved in a
>>>> browser).
>>>>
>>>> This canonical value is used by many other reconciliation services, even
>>>> if they do not use Freebase ids at all: for instance, the OCCRP endpoint
>>>> uses it. Is it desirable?
>>>>
>>>> In the Wikidata service I have used "http://www.wikidata.org/entity/" as
>>>> identifierSpace and "http://www.wikidata.org/prop/direct/" as
>>>> schemaSpace. The idea behind this choice was that you can get an RDF URI
>>>> for identifiers and properties by concatenating their ids to these
>>>> prefixes. But that was totally a guess on my part. Perhaps I should have
>>>> used actual URIs there? Which ones?
>>>>
>>>> So, in short, I would like a precise definition of the identifierSpace
>>>> and schemaSpace fields, which would unambiguously inform implementers
>>>> about what value they should have in their own services.
>>>>
>>>> Thanks,
>>>>
>>>> Antonin
>>>>
>>>>
>>>>
>>>>

Received on Thursday, 26 September 2019 23:33:54 UTC