Re: prevalence of schema.org/Book

Hi Jason,

On Mon, Jan 28, 2013 at 9:40 PM, Jason Ronallo <jronallo@gmail.com> wrote:
> Niklas,
>
> On Mon, Jan 28, 2013 at 3:25 PM, Niklas Lindström <lindstream@gmail.com> wrote:
>> On Fri, Jan 25, 2013 at 9:25 PM, Young,Jeff (OR) <jyoung@oclc.org> wrote:
>>> I believe that extension tokens added to Schema.org classes are
>>> interpreted as subclasses rather than as properties. It's possible they
>>> could deduce that these are intended to be properties in that class
>>> domain from the usage as such, but I haven't seen them consider the
>>> possibility. I suspect it will be GIGO, but maybe I missed it. Their
>>> advice is to have properties extend other properties like
>>> http://schema.org/creator/architect. Unlike schema:Thing, though, there
>>> is no top property.
>>
>> Yes, if they are interpreted at all, that would be it. I would
>> strongly advise against using this "extension" mechanism for anything
>> intended to persist and to be used by others, as opposed to
>> syntax-centric things and plain SEO experiments. There is no guarantee
>> that IRIs minted like that refer to the same concept (only to the
>> common aspect shared by all classes/properties that (arguably) share
>> the same camel-case label in English). Nor that they will ever be
>> dereferenceable.
>
> Yes, the extension mechanism of Schema.org is of very limited use. If
> we find that others use it, then we can see if those are types or
> properties that ought to be included in Schema.org proper. For folks
> that don't know to or can't discover a type or property that works for
> them, then I could see folks using it. Even if they're using RDFa it
> can be confusing to find an appropriate type and property and then use
> it correctly. The extension mechanism is something that I would use
> though.

Yes, schema.org may eventually incorporate common extensions verbatim
(using the slash notation) and publish official definitions of them.
That could certainly be useful. But it could also lead towards an
ever-growing, hierarchical giant vocabulary, which schema.org doesn't
seem to aim for (it is rather a skeleton of common terms). Instead,
you can already use multiple properties on the same value, in both
RDFa and microdata (as full IRIs/URLs in both; in RDFa also shortened
using prefixes), to cater for different consumers that understand
different properties (or just some inlined super-property, such as one
from schema.org). That way you can use a property defined by
schema.org, which you can probably expect search engines to pick up
(e.g. for faceted search), *and* any other property with a defined
meaning, from any vocabulary (including your own).
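
To make the multiple-properties idea concrete, here is a minimal
sketch in Python that renders one value tagged with two property IRIs
at once. The helper name multi_property_span is made up for this
example, and the pairing of http://schema.org/name with
http://purl.org/dc/terms/title is only an illustrative choice, not a
recommendation from either vocabulary.

```python
from html import escape

# Illustrative pairing only: one schema.org property plus one from
# another vocabulary (Dublin Core terms).
SCHEMA_NAME = "http://schema.org/name"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def multi_property_span(attribute, property_iris, text):
    """Render a <span> whose `attribute` lists several property IRIs.

    Hypothetical helper. Both RDFa's `property` and microdata's
    `itemprop` take a space-separated list of values, and both accept
    absolute IRIs/URLs, so the same value can carry several properties.
    """
    if attribute not in ("property", "itemprop"):
        raise ValueError("expected 'property' (RDFa) or 'itemprop' (microdata)")
    joined = " ".join(property_iris)
    return '<span {}="{}">{}</span>'.format(
        attribute, escape(joined, quote=True), escape(text)
    )


rdfa = multi_property_span("property", [SCHEMA_NAME, DCTERMS_TITLE], "Example Title")
microdata = multi_property_span("itemprop", [SCHEMA_NAME, DCTERMS_TITLE], "Example Title")
```

A consumer that only knows schema.org can pick up the first IRI, while
one that knows the other vocabulary can pick up the second. In real
markup the span would sit inside an element that establishes the item
(typeof/resource in RDFa, itemscope/itemtype in microdata).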

Which approach is more likely to lead to convergence among both
publishers and consumers (such as service providers like search
engines) is an open question. Currently, the slash extension is
basically a syntax trick IMHO (it implies a super-property relation,
but offers no stable way of defining specific meaning until the term
is perhaps endorsed), whereas multiple meaningful vocabularies do
exist, and various consumers do understand various ones (which
admittedly is an untidy state of affairs, with variable correlation).

>> I suspect there is something strange with the property IRIs in that
>> data. Jason, did those IRIs come from the source? Schema.org
>> properties have IRIs of the form <http://schema.org/{term}>, i.e. not
>> concatenated on a type. (As Jeff also mentioned; see [1] for details.)
>
> If you're referring to IRIs like http://schema.org/Book/name from my
> post, then that comes directly from the Web Data Commons data. If this
> is incorrect, which it appears to be, then it should be taken up with
> them. I was just taking the data as I was given it and spitting it
> back out. Here's a typical NQuad from the Web Data Commons corpus:
> _:nodeca28f5cf7b05162b4036f77a176718 <http://schema.org/Book/isbn>
> "978-3-902406-06-4"@en
> <http://www.seifertverlag.at/en/programme/2003_autumn/detail_pharao.php>
>   .

That's interesting, and may be troublesome. I see that the source page
uses microdata (apart from RDFa in the head for OGP). Since the
interpretation of microdata as RDF used to be a moving target, with
various options for constructing the property IRI (there still are
[1]), I suspect that this is at least part of the cause. It deserves
to be investigated.
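
As a rough illustration of how the choice of construction changes the
resulting IRI, here is a small Python sketch. Both functions are
simplifications invented for this example; they are neither the exact
algorithms described in [1] nor a claim about what any particular
extractor does.

```python
def vocabulary_iri(item_type, name):
    """Resolve `name` against the namespace of the item type.

    Assumed simplification: the namespace is the type IRI up to and
    including its last '#' or '/'.
    """
    if "#" in item_type:
        base = item_type.rsplit("#", 1)[0] + "#"
    else:
        base = item_type.rsplit("/", 1)[0] + "/"
    return base + name


def type_appended_iri(item_type, name):
    """Append `name` to the full type IRI (hypothetical strategy)."""
    return item_type.rstrip("/") + "/" + name


book = "http://schema.org/Book"
expected_form = vocabulary_iri(book, "isbn")     # <http://schema.org/{term}> form
observed_form = type_appended_iri(book, "isbn")  # form seen in the quoted NQuad
```

The first function yields http://schema.org/isbn, the second
http://schema.org/Book/isbn, which is the shape that shows up in the
quoted data.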

Cheers,
Niklas

[1]: http://www.w3.org/TR/microdata-rdf/#property-uri-generation


> Jason

Received on Monday, 28 January 2013 23:50:00 UTC