Re: Meaning of identifierSpace and schemaSpace

Hi David,

Thank you very much for chiming in!

In our specs, I think it would be useful if we could formulate the
definition of these parameters without any reference to Freebase. So
perhaps something fairly generic, such as "This pair of URIs defines the
identifier scheme used by the reconciliation service.", or something
along these lines?
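
To make the discussion more concrete, here is a rough Python sketch of how
a client could use the two fields. It is an illustration under stated
assumptions, not text from the specs: the two Wikidata prefixes are the
ones quoted in the original message below, the path rule is the Freebase
example David describes, and all function names (entity_uri, property_uri,
freebase_domain_of, same_identifier_scheme) are hypothetical.

```python
# Illustrative sketch only: helper names are hypothetical, not part of any spec.

# Service metadata carrying the two fields under discussion. The prefix
# values are the ones the Wikidata service is said to use in this thread.
WIKIDATA_MANIFEST = {
    "name": "Wikidata (example)",
    "identifierSpace": "http://www.wikidata.org/entity/",
    "schemaSpace": "http://www.wikidata.org/prop/direct/",
}


def entity_uri(manifest, entity_id):
    """Assumed convention: prefix + id gives an RDF URI for an entity."""
    return manifest["identifierSpace"] + entity_id


def property_uri(manifest, property_id):
    """Assumed convention: prefix + id gives an RDF URI for a property."""
    return manifest["schemaSpace"] + property_id


def freebase_domain_of(type_id):
    """Path-based rule from David's example: drop the last segment,
    including its slash, to get the domain that contains the type."""
    return type_id.rsplit("/", 1)[0]


def same_identifier_scheme(manifest_a, manifest_b):
    """Two services share an identifier scheme if both URIs match."""
    return (
        manifest_a["identifierSpace"] == manifest_b["identifierSpace"]
        and manifest_a["schemaSpace"] == manifest_b["schemaSpace"]
    )
```

Under these assumptions, entity_uri(WIKIDATA_MANIFEST, "Q42") yields
"http://www.wikidata.org/entity/Q42" and freebase_domain_of("/people/person")
yields "/people". Whether concatenation is the intended reading of these
fields is exactly the open question in this thread.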

Antonin

On 26/09/2019 03:50, David Huynh wrote:
> Thad et al.,
>
> My memory is also fading, but as far as I can recall, the intention is
> that, since Freebase's identifier scheme (for both instances and
> schemas) is only one possible way of formulating identifiers, we have
> to allow for other identifier schemes in order to make the
> reconciliation API extensible and future-proof. identifierSpace and
> schemaSpace specify what schemes instance identifiers and schema
> identifiers follow so that software like OpenRefine can
> programmatically process those identifiers. For example, given a type
> ID in Freebase's schema ID space, e.g. /people/person, OpenRefine
> can remove the last segment including the last slash / and infer the
> ID of the domain containing that type, namely /people. Other
> identifier schemes may or may not have such path-based syntax.
>
> Let me know if that makes sense.
>
> Thanks,
>
> David
>
> On Wed, Sep 25, 2019 at 4:57 AM Thad Guidry <thadguidry@gmail.com
> <mailto:thadguidry@gmail.com>> wrote:
>
>     David,
>
>     (sorry for pulling you into this convo)
>
>     Do you happen to have any additional knowledge or clarification
>     for the definitions of identifierSpace and schemaSpace in
>     Reconciliation? (my memory is fading fast)
>
>     Thad
>     https://www.linkedin.com/in/thadguidry/
>
>
>     On Tue, Sep 24, 2019 at 9:07 AM Thad Guidry <thadguidry@gmail.com
>     <mailto:thadguidry@gmail.com>> wrote:
>
>         Historically in Freebase the Types themselves were grouped
>         under Domains sometimes.
>
>         We had Domains like /business and /music etc.
>         and each Type in the Domain (like "artist") had an ID that
>         showed which domain it was under like this Type "/music/artist".
>
>         Then the more general concept was namespaces and keys
>         https://developers.google.com/freebase/guide/basic_concepts 
>
>         Wikidata Items directly translate to Freebase Topic MIDs (
>         like /m/0dgw9r <https://tools.wmflabs.org/freebase/m/0dgw9r> )
>
>         Old MQL ... but this might help you understand... When you
>         inserted a new value into Freebase you would get back the
>         identifier MID of the triple, in this case upon inserting the
>         key value "eol" into the "/biology" namespace (which is
>         essentially a Domain)
>
>         [{
>           "id":   "/m/0cnstys",
>           "type": "/type/namespace",
>           "key": [{
>             "connect":   "insert",
>             "namespace": "/biology",
>             "value":     "eol"
>           }]
>         }] 
>
>         https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority|sort:date/freebase-discuss/qiKeBzTMsks/rDtwPUNDClQJ
>         <https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority%7Csort:date/freebase-discuss/qiKeBzTMsks/rDtwPUNDClQJ> 
>
>         https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority|sort:date/freebase-discuss/WdwYwZKSLtM/6Qc5EosrG28J
>         <https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority%7Csort:date/freebase-discuss/WdwYwZKSLtM/6Qc5EosrG28J> 
>          
>         https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority|sort:date/freebase-discuss/OoRk6BsxWyQ/jCXs-PSn3woJ
>         <https://groups.google.com/forum/#!searchin/freebase-discuss/jeff$20prucher$20authority%7Csort:date/freebase-discuss/OoRk6BsxWyQ/jCXs-PSn3woJ> 
>
>         https://groups.google.com/forum/#!searchin/freebase-discuss/mql$20namespace|sort:date/freebase-discuss/KMQRKrcTYck/zPKBnyncOj4J
>         <https://groups.google.com/forum/#!searchin/freebase-discuss/mql$20namespace%7Csort:date/freebase-discuss/KMQRKrcTYck/zPKBnyncOj4J> 
>          
>         *My initial reaction is that the schemaSpace is more about
>         Domains and that the identifierSpace is about namespaces and
>         keys* (where we stored unique identifiers; this was the old
>         soft key system).
>         Since Wikidata has no direct Types or Domains, there is
>         nothing to translate there I think.
>
>         Thad
>         https://www.linkedin.com/in/thadguidry/
>
>
>         On Tue, Sep 24, 2019 at 5:15 AM Antonin Delpeuch
>         <antonin@delpeuch.eu <mailto:antonin@delpeuch.eu>> wrote:
>
>             Hi all,
>
>             I have a question about the exact meaning of the
>             identifierSpace and
>             schemaSpace fields in reconciliation services. This is one
>             thing I would
>             like to document better in the specifications.
>
>             My understanding is that these are URIs which represent
>             the type of
>             entities and properties used by the service. Can we have a
>             more precise
>             definition than that?
>
>             The canonical value for both of these is
>             "http://rdf.freebase.com/ns/type.object.id", which was
>             used in the
>             Freebase reconciliation service, but can of course no
>             longer be accessed
>             (if it ever was? sometimes URIs are not meant to be
>             resolved in a browser).
>
>             This canonical value is used by many other reconciliation
>             services, even
>             if they do not use Freebase ids at all: for instance, the
>             OCCRP endpoint
>             uses it. Is this desirable?
>
>             In the Wikidata service I have used
>             "http://www.wikidata.org/entity/" as
>             identifierSpace and "http://www.wikidata.org/prop/direct/" as
>             schemaSpace. The idea behind this choice was that you can
>             get an RDF URI
>             for identifiers and properties by concatenating their ids
>             to these
>             prefixes. But that was totally a guess on my part. Perhaps
>             I should have
>             used actual URIs there? Which ones?
>
>             So, in short, I would like a precise definition of the
>             identifierSpace
>             and schemaSpace fields, which would unambiguously inform
>             implementers
>             about what value they should have in their own services.
>
>             Thanks,
>
>             Antonin
>
>
>

Received on Thursday, 26 September 2019 07:04:13 UTC