Re: Meaning of identifierSpace and schemaSpace from David Huynh on 2019-10-01 (public-reconciliation@w3.org from October 2019)

From: David Huynh <dfhuynh@gmail.com>
Date: Mon, 30 Sep 2019 19:13:49 -0700
To: Thad Guidry <thadguidry@gmail.com>
Cc: David Newbury <DNewbury@getty.edu>, Antonin Delpeuch <antonin@delpeuch.eu>, "public-reconciliation@w3.org" <public-reconciliation@w3.org>
Message-ID: <CAH5JDHeCyur+sNtHM4FFThs0kzp5333fMHH_3ENukK0OzpiDEw@mail.gmail.com>

If I recall correctly, the original intention was to provide a hook not for
extensibility per se, but for future formalization.

Freebase IDs did not follow the URI syntax, and in order to also
accommodate other data stores' ID schemes (which don't necessarily follow
the URI syntax, either), I left the syntax for instance IDs and type IDs
unspecified in the API, meaning "anything goes." So, "/people/person" is a
fine schema ID and "/m/29xy1" is a fine instance ID even if they don't have
the URI syntax.

However, I didn't want to leave their syntax entirely "anything goes",
either, so I introduced the two fields identifierSpace and schemaSpace as
references to whatever resources can specify the syntax of the instance IDs
and type IDs. The values of those two fields are required to be URIs.
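
To make that concrete, here is a rough Python sketch of the distinction:
the IDs themselves can be anything, but the two "space" fields in a service
manifest have to be URIs. The manifest values and the helper names
(is_absolute_uri, check_manifest) are made up for illustration and are not
taken from any real service or from the API itself; the check only tests
that a scheme is present, which is a loose approximation of "is a URI".

```python
from urllib.parse import urlparse

# Hypothetical manifest; the example.org URIs are placeholders.
manifest = {
    "name": "Example reconciliation service",
    "identifierSpace": "http://example.org/ns/entity/",
    "schemaSpace": "http://example.org/ns/type/",
}


def is_absolute_uri(value):
    """Return True if value is a string that parses with a URI scheme."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return bool(parsed.scheme) and bool(parsed.netloc or parsed.path)


def check_manifest(m):
    """Return the names of the space fields that are missing or not URIs."""
    fields = ("identifierSpace", "schemaSpace")
    return [f for f in fields if not is_absolute_uri(m.get(f))]


# An empty list means both fields look like URIs.
problems = check_manifest(manifest)
```

Under this check, a Freebase-style value such as "/people/person" is
acceptable as a type ID but would be flagged if it were used as the
schemaSpace value.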

I think at this time, the Recon API can be refined further to formalize
what those URIs should provide, e.g., some XML documents describing the
syntax of instance IDs and type IDs. I did not really know what those URIs
should provide 10 years ago, but perhaps now, the community has had enough
experience with many existing recon services that we can do this
formalization.

Thanks,

David

On Mon, Sep 30, 2019 at 10:09 AM Thad Guidry <thadguidry@gmail.com> wrote:

> Hi David Newbury,
>
> OpenRefine does not process these IDs currently.  But users are certainly
> able to do things with the fields, even if they manually form GET requests
> to Recon endpoints.  I am not sure how much metadata is exposed currently
> that users are able to interact with in OpenRefine for Recon services.
> "cell" in a Recon object shows all the fields currently exposed.  Perhaps
> Antonin can summarize that for us.
>
> The idea of the IDs for these 2 fields is that the IDs sometimes hold
> information...like metadata in a way.  They become useful I would say for
> collaborative efforts sometimes, as is often the case with Linked Data.
> Bits of info and hints of categories, domains, general classifications
> sometimes show up in those URI or IDs, as they did with Freebase in
> particular.
>
> My definitions for these 2 fields in Reconciliation API would begin to
> look like this...
>
> schemaSpace :  A URI or ID that represents a group of entities by some
> Domain or Schema, ex: "/usa/people/person", or "006_electronic_journals" or
> simply "football"
>
> identifierSpace: A URI or ID that represents a group??? of entities by
> some ???
>
> Thad
> https://www.linkedin.com/in/thadguidry/
>
>
> On Mon, Sep 30, 2019 at 10:26 AM David Newbury <DNewbury@getty.edu> wrote:
>
>> Hi all, I’ve been lurking here a bit and have been encouraged to jump in
>> and participate.
>>
>> I agree with the “Why” question.  At the Getty, we’re using links to our
>> documentation for our URL structure and our type structure, but that’s
>> pretty arbitrary. (Our endpoint is at
>> http://services.getty.edu/vocab/reconcile).  We had a discussion about
>> what to put there, and in the absence of a rationale, we put what we
>> thought would be most helpful to a user trying to understand our schema
>> space.
>>
>> The questions I would ask, to help understand what goes here would be:
>> Does OpenRefine actually process these IDs in any way?  Are we aware of any
>> client that does?  And is the point here to just provide data hooks for
>> extensibility, or is there anything else it does?
>>
>> — David
>>
>> David Newbury
>> Enterprise Software Architect, Getty Digital
>>
>> Email: dnewbury@getty.edu
>> Phone: (310) 440-6116

Received on Tuesday, 1 October 2019 02:14:24 UTC