Re: Schema.org in RDF ... from Alan Ruttenberg on 2011-06-15 (public-lod@w3.org from June 2011)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Wed, 15 Jun 2011 02:25:09 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>
Message-ID: <BANLkTim6LJMpt4A098LV947gkyRnNX3nRQ@mail.gmail.com>

On Mon, Jun 13, 2011 at 12:20 AM, Pat Hayes <phayes@ihmc.us> wrote:
> Well, if you say the range is xsd:string, then anything which is a value has to be a string, right? As for example (taken at random)
>
> schema:cookingMethod a rdf:Property;
>    rdfs:label "Cooking Method"@en;
>    rdfs:comment "The method of cooking, such as Frying, Steaming, ..."@en;
>    rdfs:domain schema:Recipe;
>    rdfs:range xsd:string;
>    rdfs:isDefinedBy <http://schema.org/Recipe> .
>
> This says that the range is xsd:string, so nothing other than an xsd string will be acceptable here.
>
> Am I missing something?
>

I think some people might be missing something. (Well, not you, Pat!
But this seemed the most appropriate place in the thread to add this
note.)

There is no reason to assume that parsing schema.org formatted syntax
should be a trivial transformation to triples. I would handle the
string thing at parsing time. First note that every schema.org thing
can have a url property. That's what the uri ref would be made from
for the (non-string) resource. We need to handle several cases then,
since the (syntactic) object of an object property assertion (a uri
ref) might or might not reasonably be constructed from the surface
value (it can be when the value is an entity and its url property is
specified). So in the case where the range of a property is a
non-string entity:

If the surface value is an entity, and there is a url property
specified then it is used to construct the uri-ref for the object
position. If there is no url property then you use a bnode in the
object position.
One infers (or simply asserts in translation) that the entity is of
the type specified by the range.
One asserts the rest of the entity properties (I would map "name" to
rdfs:label, personally).

If the surface value is a string, then a bnode is used in the object
position.
One infers (or simply asserts in translation) that the entity is of
the type specified by the range.
One asserts in the translation that the rdfs:label of the entity is
the surface value string.
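
The two cases above can be sketched roughly as follows. This is only
an illustration, not a real translator: the function name
translate_value, the plain-tuple triple representation, the dict shape
used for an entity, the passing of other keys through as schema:
properties, and the schema:author / schema:Person example are all
assumptions made for the sketch. Nested entities and string escaping
are not handled.

```python
import itertools

RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"

# Counter used to mint fresh blank node labels (hypothetical helper).
_bnode_ids = itertools.count(1)


def new_bnode():
    """Return a fresh blank node label such as '_:b1'."""
    return f"_:b{next(_bnode_ids)}"


def translate_value(subject, prop, value, range_type):
    """Translate one value of a property whose range is a non-string entity.

    `value` is either a plain string (surface text) or a dict holding the
    entity's own properties, optionally including 'url'.
    Returns a list of (subject, predicate, object) triples.
    """
    if isinstance(value, str):
        # Surface value is a string: bnode in object position,
        # typed by the range and labelled with the surface string.
        node = new_bnode()
        return [
            (subject, prop, node),
            (node, RDF_TYPE, range_type),
            (node, RDFS_LABEL, f'"{value}"'),
        ]

    # Surface value is an entity: use its url if given, else a bnode.
    url = value.get("url")
    node = f"<{url}>" if url else new_bnode()
    triples = [(subject, prop, node), (node, RDF_TYPE, range_type)]
    for key, val in value.items():
        if key == "url":
            continue
        # "name" is mapped to rdfs:label; other keys pass through as-is.
        predicate = RDFS_LABEL if key == "name" else f"schema:{key}"
        triples.append((node, predicate, f'"{val}"'))
    return triples
```

With this sketch, a bare string and an entity without a url both end
up as a typed, labelled bnode, so a consumer sees the same shape either
way.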

If there are some properties that *legitimately* can either be strings
or (non-string) entities then we have ambiguity. I haven't reviewed
schema.org to see whether this happens, but the cases where the value
of some property is *truly* a literal (not *defined* as a literal while
actually carrying an entity label or encoding or whatnot) are, I
suspect, easy to identify, and the bold claim would be that the
legitimate literals would rarely be acceptable as property values
alongside entities.

One cares about this also because schema.org is incomplete, so there
are many properties whose ranges are currently declared as text, but
for which one can expect that there will eventually be entity
definitions. E.g. on CreativeWork: awards, currently Text, "Awards won
by this person or for this creative work." It's not that much of a
stretch to think someone will want to define an award entity that has
properties that relate it to the granting organization, the date
awarded, etc.

The bottom line is that with some (dare I say it ;-) ) relatively
simple ontological analysis of the schema.org schemas, and with the
translator from html to rdf written to take account of that analysis
(by using bnode objects for properties whose objects are not sensibly
literals), we don't need to accept the broken every-range-is-a-string
approach.
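
As a rough sketch of how the translator could consult such an
analysis: the set GENUINE_LITERALS below stands in for its outcome, and
its single member is only a placeholder, not a reviewed result. The
function name object_for_text and the tuple representation are
likewise made up for illustration.

```python
import itertools

# Counter used to mint fresh blank node labels (hypothetical helper).
_node_ids = itertools.count(1)

# Placeholder for the result of the ontological analysis: properties
# whose values are truly literals. Membership here is illustrative only.
GENUINE_LITERALS = {"schema:description"}


def object_for_text(prop, text):
    """Return (object_term, extra_triples) for a text value of `prop`.

    Genuine literals stay literals. Everything else declared as text is
    given a bnode object that carries the text as its rdfs:label.
    """
    if prop in GENUINE_LITERALS:
        return f'"{text}"', []
    node = f"_:n{next(_node_ids)}"
    return node, [(node, "rdfs:label", f'"{text}"')]
```

Under this scheme a property such as awards, declared as text today,
already gets a node in the object position, so a later award entity
definition would not change the shape of the data.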

-Alan

P.S. I've heard those who want a "simple" rdfs translation, so you
don't have to remind me. That doesn't strike me as a particularly good
idea atm, so I'll keep discussing the alternative, thank you.

Received on Wednesday, 15 June 2011 01:25:57 UTC