Re: Generic Property-Value Proposal for Schema.org

From: Mike Bergman <mike@mkbergman.com>
Date: Sat, 03 May 2014 10:35:33 -0500
Message-ID: <53650CC5.2040901@mkbergman.com>
To: Kingsley Idehen <kidehen@openlinksw.com>, public-vocabs@w3.org

+1, and +1 to Holger's prior comments.

See further below:

On 5/3/2014 8:50 AM, Kingsley Idehen wrote:
> On 5/2/14 9:20 PM, martin.hepp@ebusiness-unibw.org wrote:
>> But before people are again afraid of reinventing RDF inside
>> schema.org: This is really an edge case, and in real-scale scenarios,
>> the transformation of the raw PropertyValue data into triples or
>> equivalent structures will involve more advanced processing -
>> heuristics, machine learning, ...
> You make a bold assumption for the Web populace, as a whole. Basically,
> you are assuming that users, developers, designers etc.. will never
> understand the semantics expressed in an RDF triple. You are also
> assuming that all data consumers will be resigned to their own
> heuristics for basic structured data processing and relations semantics
> comprehension.

As best as I can tell, Martin's proposal is for a syntactic mechanism to 
address a gap in microdata. The proposed solution could and would result 
in a proliferation of new structured data that is, potentially, 
undefined, unscoped and undecipherable as to meaning: in short, lacking 
any semantics. Such schema-less structured data would, I believe, belie 
the purpose of schema.org.

GS1's proposal to make its considerable and coherent schema for GPC 
available as open, linked data would provide one option for well-defined 
semantics (in the product realm) when one needs to refer to *both* goods 
and their associated properties. Other standards with well-defined data 
dictionaries may choose to do the same.

(I am not arguing for GPC per se, but rather for reliance on one or more
data dictionaries and vocabularies that have already been subject to the
scrutiny of time and the market. That gaps in structured data coverage
presently exist in schema.org does not necessarily justify a
Katy-bar-the-door, all-comers-welcome extension mechanism. I think
with a bit of patience we will see additional, well-considered
vocabularies become available for reference by schema.org.)

Having referenceable semantics and well-defined entities and properties
provides a clear alternative to Martin's proposal. I would personally
like to see established schemas and established data dictionaries
percolate into possible additions to schema.org, rather than premature 
adoption of extension mechanisms that leave data consumers scratching 
their heads as to what the structured data means.
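
To illustrate the contrast, here is a minimal sketch of what a data
consumer can and cannot do with a property name. It assumes a
hypothetical data dictionary keyed by made-up example.org IRIs; none of
these names, units or the interpret() helper come from GPC, schema.org
or Martin's proposal.

```python
# Hypothetical data dictionary: property IRI -> definition.
# The IRIs, labels and units are illustrative placeholders only.
DATA_DICTIONARY = {
    "http://example.org/vocab/screenSize": {"label": "screen size", "unit": "inch"},
    "http://example.org/vocab/netWeight": {"label": "net weight", "unit": "kg"},
}


def interpret(property_id, value):
    """Return a readable statement if the property is defined, else None."""
    definition = DATA_DICTIONARY.get(property_id)
    if definition is None:
        # No referenceable semantics: the consumer can only guess.
        return None
    return f"{definition['label']} = {value} {definition['unit']}"


# A free-text name carries no definition the consumer can look up.
print(interpret("size", "15"))
# A name that references a dictionary entry can be interpreted.
print(interpret("http://example.org/vocab/screenSize", "15"))
```

The first call has nothing to resolve against, which is the situation I
fear for consumers of undefined property-value pairs; the second works
only because the name points at an agreed definition.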

We (Structured Dynamics and Aleksander Pohl) have nearly completed our 
efforts of mapping the expanded classes and properties of both 
schema.org and the DBpedia ontology to UMBEL (itself a lightweight 
subset of Cyc). It has been a surprisingly difficult exercise.

With growth, schema.org and the DBpedia ontology are both becoming
more incoherent. Peter Patel-Schneider has separately pointed out some
of the internal inconsistencies in these vocabularies. I can confirm
that the incoherence is now much worse than when we did our last mappings
about three years ago, when both schema.org and the DBpedia ontology had
about half of their present scope. I don't believe that the search engines
would want to see less coherence in their knowledge graphs going forward.

I think it is clear that the major search engines that sponsor 
schema.org are being quite careful in what structured data surfaces into 
their search results. Spam needs to be guarded against; assignments need 
to be vetted as accurate; and, fundamentally, the schema itself setting 
the structure of the knowledge graphs needs to make sense (be coherent).

The real contributors to schema.org are the owners of goods, services
and content, who want consumers to find their offerings. The
operative question should not be how we can find ways to get all local
structured data into schema.org, but rather how we can do so in a way that
actually gets picked up and used by the search engines. I personally
cannot see any circumstance where undefined properties and connections
achieve this aim.


> As indicated by my +1 to Holger's comments, why don't you simply confine
> this heuristic to Microdata since this is the syntax with the
> limitation? Good documentation and examples will enable those that have
> to work with Microdata (no matter what) go with this suggestion, if it
> suits their needs.


Michael K. Bergman
CEO  Structured Dynamics LLC
Received on Saturday, 3 May 2014 15:36:10 UTC
