Re: Another example of Wikidata + schema.org for type enumerations

On Feb 21, 2014, at 7:11 PM, Dan Brickley <danbri@google.com> wrote:

> Here's another example along the lines I sketched recently, after
> sitting down with Denny today and looking at Wikidata. It is an
> attempt to show how "types" handled externally from schema.org could
> be written (in this case in RDFa) alongside basic schema.org types. We
> revisit the use case of having more kinds of "place of worship" than
> are anticipated in the schema.org core.
> 
> A couple of things to note about Wikidata first:
> 
> 1. it does have basic properties for 'instance of' and 'subclass' but
> there is no formal or software-backed understanding of these. Most of
> Wikidata views these as simply more data about an entity. The Wikidata
> software does not really have a notion of entities having types, this
> is something added at a later level as data. Perhaps if type-like
> constructs become common and popular in the community some UI or API
> support might emerge (by very rough analogy, think about hashtags and
> retweets in Twitter, which initially were also "just in the data").
> I'll try to write type-like-entity instead of 'type' when talking
> about Wikidata.
> 
> 2. although the factual data in Wikidata is currently often fairly
> thin, there are already many mappings to other identifiers, e.g. 1-2
> million have Freebase links, which in turn bring in more factual
> background data.
> 
> So here's an example: we describe the entity
> https://www.wikidata.org/entity/Q2046262 (Pagoda Songyue) as falling
> into the Wikidata type-like-entity
> https://www.wikidata.org/entity/Q199451 (Pagoda), and then anchor that
> in the schema.org 'PlaceOfWorship' type.
> 
> <div vocab="http://schema.org/" typeof="PlaceOfWorship https://www.wikidata.org/entity/Q199451">
>   <span property="name">Pagoda Songyue</span>
>   <span property="description">One of the few intact sixth-century pagodas in China, located at the Songyue Monastery on Mount Song.</span>
>   <link property="url" href="https://www.wikidata.org/entity/Q2046262" />
>   <link property="sameAs" href="https://en.wikipedia.org/wiki/Songyue_Pagoda" />
>   <link property="sameAs" href="http://www.freebase.com/m/03bz2xf" />
>   <div property="geo" typeof="GeoCoordinates">
>     <meta property="latitude" content="34.501611" />
>     <meta property="longitude" content="113.015917" />
>   </div>
> </div>

How does <http://www.freebase.com/m/03bz2xf> bring in machine-readable factual information? It doesn't seem to have any markup. <http://rdf.freebase.com/m/03bz2xf> does ultimately return some Turtle, but with a content type of text/plain, so it's not too friendly to my distiller. Content negotiation across these would be useful, as would marking up the Freebase HTML using RDFa or microdata.
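
To make the text/plain problem concrete, here is a rough sketch of the workaround a client is left with when the server does not label its Turtle properly. The function name, the format labels, and the sniffing rule are my own illustration, not how any particular distiller works: trust the Content-Type header when it names an RDF media type, and only peek at the body when the header says text/plain.

```python
# Hypothetical helper: pick an RDF syntax from an HTTP Content-Type header.
# The labels on the right are arbitrary names for this sketch.
MEDIA_TYPES = {
    "text/turtle": "turtle",
    "application/rdf+xml": "xml",
    "application/ld+json": "json-ld",
    "application/n-triples": "nt",
}

def guess_rdf_format(content_type, body):
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in MEDIA_TYPES:
        return MEDIA_TYPES[media_type]
    if media_type == "text/plain":
        # Assumption: a body that opens with @prefix or @base is Turtle.
        if body.lstrip().startswith(("@prefix", "@base")):
            return "turtle"
    # Anything else is treated as "not RDF as far as we can tell".
    return None
```

With proper content negotiation the sniffing branch would never be needed, which is really the point.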

The Wikidata-produced RDF is quite useful, though. In general, settling on Wikidata as a primary identifier space seems like a good idea to me, but it would be nice if at least one of the sameAs relationships pointed to something else that's useful, like an RDF version of a Freebase resource, or DBpedia (unless that's considered competition?). Perhaps Freebase could also declare an equivalence to the Wikidata page; in this case they both already say they're equivalent to the same Wikipedia pages. (Denny?)

Also, why use <https://www.wikidata.org/entity/Q2046262> as the value of schema:url rather than as the subject (@resource)? Are you expressing a preference for anonymous resources (blank nodes)?

> a) https://www.wikidata.org/entity/Q199451 is the type-like-entity,
> "Pagoda" in Wikidata
> b) https://www.wikidata.org/wiki/Property:P279 is the subClassOf
> relation, ("all of these items are instances of those items; this item
> is a class of that item").
> c) https://www.wikidata.org/wiki/Q1370598 is "Place of Worship" in
> Wikidata. This type-like-entity has associations with 'architectural
> structure' and 'religion' within Wikidata. This would be the natural
> place also to express a link to 'http://schema.org/PlaceOfWorship',
> but that statement hasn't yet been expressed, and there might be some
> details to work out on the mechanics.
> d) https://www.wikidata.org/wiki/Property:P31 is the instanceOf
> relation in Wikidata; ("this item is a concrete object (instance) of
> this class, category or object group").
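
As I read a) through d), a consumer would follow "instance of" (P31) from the item and then "subclass of" (P279) upwards until it reaches something it knows how to map to schema.org. A minimal sketch of that walk follows. The statements are toy data, not a snapshot of Wikidata: the Pagoda to "place of worship" subclass link is assumed for illustration, and the mapping to schema.org is the statement Dan says has not been expressed yet.

```python
# Toy statements keyed by (item, property); illustrative only.
P31 = "P31"    # instance of
P279 = "P279"  # subclass of

STATEMENTS = {
    ("Q2046262", P31): ["Q199451"],   # Pagoda Songyue -> Pagoda
    ("Q199451", P279): ["Q1370598"],  # assumed: Pagoda -> place of worship
}

# Hypothetical mapping from type-like-entities to schema.org types.
SCHEMA_ORG = {"Q1370598": "http://schema.org/PlaceOfWorship"}

def schema_types(item):
    """Collect schema.org types reachable via P31, then P279 links."""
    found = []
    seen = set()
    queue = list(STATEMENTS.get((item, P31), []))
    while queue:
        cls = queue.pop(0)
        if cls in seen:
            continue  # guard against cycles in the subclass data
        seen.add(cls)
        if cls in SCHEMA_ORG:
            found.append(SCHEMA_ORG[cls])
        queue.extend(STATEMENTS.get((cls, P279), []))
    return found
```

Since Wikidata treats these links as plain data, nothing guarantees the chain is acyclic or complete, hence the `seen` set and the possibility of an empty result.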

In the context of the Sports vocabulary, this would imply using a Wikidata type identifier rather than a concrete subclass of SportsTeam. However, I can't find an appropriate Wikidata page right now. Searching for "American Football Team" does yield many specific teams, but they are instances of "sports team" <https://www.wikidata.org/wiki/Q12973014>. There is an American Football resource <https://www.wikidata.org/wiki/Q41323>, but this is really more appropriate for the SportsDiscipline I suggested than as a type of SportsTeam. Of course, this being Wikidata, we can always just create missing categories to use as types.

> Last time I posted something in this direction (funeral homes,
> http://lists.w3.org/Archives/Public/public-vocabs/2014Feb/0007.html)
> there was a concern (Aaron's, in
> http://lists.w3.org/Archives/Public/public-vocabs/2014Feb/0010.html )
> that schema.org search engines might not care about other namespaces
> so much. I'd like to explicitly set that aside for now, along with the
> concern about the lack of UI for these types. It's reasonable to ask
> questions about both, but for now I'd like to concentrate on making
> sure the representational machinery matches up.

If we build it, they will come?

> From looking at this today, it seems already possible to create
> descriptions that draw on "types" from Wikidata alongside matching
> broader types from schema.org. It seems possible within Wikidata to
> talk about an entity being an "instance of" a type that shows up as
> another entity within Wikidata, and for that type entity to have
> subClass links to broader wikidata types. The Wikidata machinery (and
> hopefully community process!) should also make it fairly easy to add
> properties linking these type-like entities to other types such as at
> schema.org, so that the wider community can keep collective notes on
> the relationships between the type-like entities Q1370598 and
> Q199451 in Wikidata and schema.org's 'PlaceOfWorship' type.
> 
> In http://blog.schema.org/2012/05/schemaorg-markup-for-external-lists.html
> a while back from schema.org, we wrote about the importance of
> 'external enumerations'. Wikidata barely existed back then. Now that
> Wikidata is real, I'd like to encourage people to take a look. The
> potential for combining Wikidata and schema.org is well worth some
> thought...

+1, this is a vendor-neutral namespace that the community can directly affect, which makes it inherently more manageable and stable than either Freebase or DBpedia.

> Topic for another day: what do we do about enumerations where
> instantiation-oriented type hierarchies are not a great fit? There
> are various properties in schema.org where we want to be more
> structured than saying "values are a string or url", but where
> schema.org as a project doesn't directly want to draw up a list of
> all the possible values. For example restaurant/menu cuisines (e.g.
> http://schema.org/servesCuisine http://schema.org/recipeCuisine).
> Let's come back to that one in the context of MiniSKOS. Wikidata may
> have a role to play there too.

I don't really see Wikidata as a reasonable source of predicate identifiers, however. There are some relationship concepts in Wikidata now, such as "description" <https://www.wikidata.org/wiki/Q3024326> and "birthday" <https://www.wikidata.org/wiki/Q47223>, but I can't really see these showing up in actual data. I think indirecting through something like a Contribution or Statistic class, which can identify the type of contribution (role) or statistic using an enumeration, makes more sense, though it certainly isn't as simple as just adding properties to the schema.org namespace.
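
To illustrate the indirection I mean, here is a small sketch. Every name in it is hypothetical (the Statistic class, the "statistic" and "statisticType" keys, and the example.org identifiers); none of these are existing schema.org terms. The idea is one generic property whose values carry an enumeration identifier, rather than one new predicate per kind of statistic.

```python
from dataclasses import dataclass

@dataclass
class Statistic:
    # Hypothetical class, not a schema.org term.
    statistic_type: str  # identifier taken from an external enumeration
    value: float

def describe(subject, stats):
    """Build a JSON-LD-like dict using one generic 'statistic' property."""
    return {
        "@id": subject,
        "statistic": [
            {
                "@type": "Statistic",
                "statisticType": s.statistic_type,
                "value": s.value,
            }
            for s in stats
        ],
    }

# Usage with made-up identifiers:
player = describe(
    "http://example.org/player/1",
    [Statistic("http://example.org/enum/Touchdowns", 3)],
)
```

The enumeration values could then live in an external namespace such as Wikidata, while the predicates themselves stay fixed.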

Gregg

> cheers,
> 
> Dan
> 
> ps. for more background on Wikidata, see Denny's recent article at
> http://www.computer.org/portal/web/computingnow/content?g=53319&type=article&urlTitle=the-rise-of-wikidata
> 

Received on Saturday, 22 February 2014 21:39:56 UTC