Re: Schema.org in RDF ...

On 15 Jun 2011, at 02:25, Alan Ruttenberg wrote:
> There is no reason to assume that parsing schema.org formatted syntax
> should be a trivial transformation to triples.

There is a W3C specification in Last Call status that defines how the syntax used by schema.org should be transformed to triples [1].

Best,
Richard

[1] http://www.w3.org/TR/microdata/#rdf


> I would handle the
> string thing at parsing time. First note that every schema.org thing
> can have a url property. That's what the uri ref would be made from
> for the (non string) resource. We need to handle several cases then,
> since the (syntactic) object of an object property assertion (a uri
> ref) might or might not reasonably be constructed from the surface
> value (yes when it is an entity with a url property specified). So in
> the case where the range of a property is a non-string entity:
> 
> If the surface value is an entity, and there is a url property
> specified then it is used to construct the uri-ref for the object
> position. If there is no url property then you use a bnode in the
> object position.
> One infers (or simply asserts in translation) that the entity is of
> type specified by the range.
> One asserts the rest of the entity properties (I would map "name" to
> rdfs:label, personally).
> 
> If the surface value is a string, then a bnode is used in the object position.
> One infers (or simply asserts in translation) that the entity is of
> type specified by the range.
> One asserts in the translation that the rdfs:label of the entity is
> the surface value string.
> 
> If there are some properties that *legitimately* can either be strings
> or (non-string) entities then we have ambiguity. I haven't reviewed
> schema.org to see whether this happens, but the cases where the value
> of some property is *truly* a literal (not *defined* as a literal but
> actually carrying an entity label or encoding or whatnot) are, I
> suspect, easy to identify, and the bold claim would be that the
> legitimate literals would rarely be acceptable as property values
> alongside entities.
> 
> One cares about this also because schema.org is incomplete, so there
> are many properties whose ranges are currently declared as text, but
> for which one can expect that there will eventually be entity
> definitions. E.g. on CreativeWork: awards, currently: Text, "Awards won
> by this person or for this creative work." It's not that much of a
> stretch to think someone will want to define an award entity that has
> properties that relate it to the granting organization, the date
> awarded, etc.
> 
> The bottom line is that with some (dare I say it ;-) ) relatively
> simple ontological analysis of the schema.org schemas, and with the
> translator from html to rdf written to take account of that analysis
> (by using bnode objects for properties whose objects are not sensibly
> literals), we don't need to accept the broken every-range-is-a-string.
> 
> -Alan
> 
> ps. I've heard those who want a "simple" rdfs translation, so you
> don't have to remind me. That doesn't strike me as a particularly good
> idea atm, so I'll keep discussing the alternative, thank you.
> 
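
For concreteness, the translation rules quoted above could be sketched roughly as follows. This is only an illustration, not part of the Microdata draft and not Alan's proposal as written: the function name translate_value, the dict representation of a parsed entity, and the string encoding of URIs, bnodes and literals are all assumptions made for the example. It covers only properties whose range is a non-string entity, and it assumes the properties of a nested entity are plain strings.

```python
from itertools import count

RDF_TYPE = "rdf:type"
RDFS_LABEL = "rdfs:label"

# Hypothetical blank-node label generator: _:b1, _:b2, ...
_bnode_ids = count(1)


def new_bnode():
    return f"_:b{next(_bnode_ids)}"


def translate_value(subject, prop, value, range_type):
    """Return triples for one property whose range is a non-string entity.

    `value` is the surface value: either a dict of entity properties
    (assumed to hold plain strings) or a bare string.
    """
    triples = []
    if isinstance(value, dict):
        # Entity: use the url property for the object if present, else a bnode.
        url = value.get("url")
        obj = f"<{url}>" if url else new_bnode()
        triples.append((subject, prop, obj))
        # Assert the type given by the property's range.
        triples.append((obj, RDF_TYPE, range_type))
        # Assert the remaining properties, mapping "name" to rdfs:label.
        for key, val in value.items():
            if key == "url":
                continue
            pred = RDFS_LABEL if key == "name" else key
            triples.append((obj, pred, f'"{val}"'))
    else:
        # Bare string: bnode object, typed by the range, labelled with the string.
        obj = new_bnode()
        triples.append((subject, prop, obj))
        triples.append((obj, RDF_TYPE, range_type))
        triples.append((obj, RDFS_LABEL, f'"{value}"'))
    return triples
```

Calling translate_value("<http://example.org/book>", "schema:author", "Alice", "schema:Person") would give a bnode typed as schema:Person and labelled "Alice", rather than a plain string in the object position. The identifiers in that call are made up for the example.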

Received on Wednesday, 15 June 2011 10:16:46 UTC