Re: Schema.org, Enumerations and Wikidata proposal

On 11 April 2017 at 18:16, R.V.Guha <guha@guha.com> wrote:
> There are many 'lists' or 'enumerations' (such as the list of countries,
> languages, ...). Often, markup wants to refer to items from such lists.
> Since we have the corresponding types, pages can refer to them via
> descriptions (e.g., 'a country with the name xxx'). It would be useful to
> have canonical lists of urls for items on these lists to help with
> reconciliation, etc.
>
>  The proposal is to periodically (with every schema.org release) import
> particular lists from Wikidata into 'lists.schema.org'. So,
> 'lists.schema.org/Language/Tamil', 'lists.schema.org/Country/Mexico', etc.
> Along with each item (such as lists.schema.org/MexicoCountry), we should
> also import a set of attributes/relations from Wikidata so that we have a
> description of the node, which can help with recon using reference by
> description. It will also include a sameAs link to the corresponding
> Wikidata item.

I'm very sympathetic to this, so long as the Wikidata community don't
treat it as a hostile fork. If they would rather host the
vocabularies, that could also be made to work.

Their vocabularies are perfectly capable of being used as first-class
vocabularies around the Web anyway - I explored this in some detail
with Denny last year, see
https://github.com/schemaorg/schemaorg/issues/1186 and nearby
(https://github.com/schemaorg/schemaorg/issues/280 has a lot of work
around mappings from Thad and friends). The experiment from last year
(covered in #1186 above) pulled all the properties from Wikidata into
a dump that served only as a JSON-LD context definition (I was too
lazy to create full schema definitions). If we import Wikidata
properties into lists.schema.org, that's essentially the same idea.
At the time my preference would've been for Wikidata itself to
ultimately publish this (i.e. an 'external extension' from
schema.org's point of view), but for the sake of a demo it is at
http://wdvoc-1323.appspot.com/ currently. I didn't attempt types at
the time since properties are more fundamental in the Wikidata data
model, but it would be interesting to understand whether Wikidata as a
project has any interest in hosting its own vocabulary as something
intended for use elsewhere. If not then we would seem as good a place
as anywhere for this to happen.
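
To make the "properties as a JSON-LD context" idea concrete, here is a
minimal sketch. It is not the code behind the demo above. The
camel-casing rule, the helper names, and the choice of Wikidata's
"direct" property namespace are all assumptions made for illustration;
the two sample properties (P31 "instance of", P17 "country") stand in
for a full property dump.

```python
import json

# Assumption: terms map to Wikidata's "direct" property namespace (wdt:).
WD_PROP_NS = "http://www.wikidata.org/prop/direct/"


def to_term(label):
    """Turn a property label such as 'instance of' into 'instanceOf'.

    Hypothetical naming rule; a real import would need to handle
    clashes, punctuation, and empty labels.
    """
    words = label.split()
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


def make_context(properties):
    """Build a JSON-LD context from (property id, label) pairs."""
    terms = {to_term(label): {"@id": WD_PROP_NS + pid}
             for pid, label in properties}
    return {"@context": terms}


# Tiny stand-in for a dump of all Wikidata properties.
sample = [("P31", "instance of"), ("P17", "country")]
context = make_context(sample)
print(json.dumps(context, indent=2))
```

A context like this only gives property names a stable IRI; it says
nothing about domains, ranges, or types, which is the gap between a
context and full schema definitions.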

Also if we did this at lists.schema.org (or wd.schema.org or whatever)
we would need to think through and document the level of integration,
e.g. http://schema.org/Country already exists, whereas other concepts
may not have a good match.  See Thad's msg that arrived as I was
typing...

Someone is also bound to mention versioning and whether Wikidata's
type and property names are volatile and whether we should use Q12345
IDs instead; I suggest we defer that conversation for now.

>  Schema.org will also suggest the use of Wikidata as a common, canonical entity repository for targets of the sameAs relation.

Wikidata is by far the most obvious default/mainstream option here so
+1 from me, though we should make clear that e.g. highly technical
scientific data might prefer to point sameAs at more controlled
repositories. But even there, Wikidata is winning over a lot of
converts...
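
As a rough illustration of the sameAs pattern discussed in this thread,
the sketch below builds a JSON-LD description of one list item that
links back to Wikidata. The 'lists.schema.org/Country/Mexico' URL shape
comes from the proposal; the function name, the choice of fields, and
the use of Q96 as the Wikidata item for Mexico are assumptions for the
example, not an agreed design.

```python
import json
from urllib.parse import quote


def list_item(type_name, name, wikidata_qid, base="http://lists.schema.org"):
    """Describe one enumeration item with a sameAs link to Wikidata.

    Hypothetical helper: 'base' and the URL layout follow the proposal
    text, but nothing here is a published schema.org API.
    """
    return {
        "@context": "http://schema.org",
        # Percent-encode the name so labels with spaces give valid URLs.
        "@id": f"{base}/{type_name}/{quote(name)}",
        "@type": type_name,
        "name": name,
        "sameAs": f"http://www.wikidata.org/entity/{wikidata_qid}",
    }


# Assumption: Q96 is the Wikidata item for Mexico.
item = list_item("Country", "Mexico", "Q96")
print(json.dumps(item, indent=2))
```

A publisher preferring a more controlled repository could supply a
different sameAs target in the same slot; the structure does not
depend on Wikidata specifically.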

Dan

> Denny & Guha

Received on Tuesday, 11 April 2017 16:43:18 UTC