Re: Generic Property-Value Proposal for Schema.org

From: <martin.hepp@ebusiness-unibw.org>
Date: Thu, 1 May 2014 01:26:22 +0200
Cc: W3C Web Schemas Task Force <public-vocabs@w3.org>
Message-Id: <E41401B4-9317-4958-BDB4-07E02919B2C6@ebusiness-unibw.org>
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Peter:
I think we simply have a very different understanding of what Web vocabularies should be like, and at which point in Web-scale information interchange the standardization of data semantics should take place. The working assumption of your camp is that data on the Web should be ready for, or close to, naive consumption by relatively simple computational operations. But this is an untested claim from the Semantic Web community.

>> As a side remark:
>> 
>> I have spent the last ten years building product ontologies in OWL DL that extend GoodRelations with classes and properties, in total more than 40 such ontologies, see http://wiki.goodrelations-vocabulary.org/Vocabularies, with 40,000 classes and maybe 20,000 properties. They are perfect for a data consumer, and they are used in applications. However, we have not been able to convince site owners at scale to use such vocabularies for marking up their content. The main reason is that they have a very, very hard time lifting and cleansing their data to that level of formality.
> 
> Then let's stick to scraping web pages.

First, I do not think it is up to you to decide.

Second, I think it is pretty obvious that having explicit data structures of the form

{
	name: "Input Voltage",
	min: 110,
	max: 250,
	unitCode: "VLT"
}

is indeed simpler to process for a computational device than

<div>Input Voltage: 110-250 V</div>

in particular if you have thousands of data instances that conform to the first structure. By the same argument, you could forbid dictionaries in Python because they do not require a URI for the properties.
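
To make the comparison concrete, here is a minimal Python sketch of what a consumer has to do in each case. It is only an illustration under stated assumptions: the names accepts, parse_text and TEXT_PATTERN are hypothetical, and the regular expression is tuned to the single example string above rather than to real-world markup.

```python
import re

# Structured form: the values arrive typed and labeled, no guessing needed.
structured = {"name": "Input Voltage", "min": 110, "max": 250, "unitCode": "VLT"}


def accepts(spec, volts):
    """Return True if `volts` lies within the range given by the record."""
    return spec["min"] <= volts <= spec["max"]


# Free-text form: the consumer must guess the layout to recover the same
# fields. This pattern is a hypothetical one that only covers strings
# shaped like "<name>: <min>-<max> <unit>".
TEXT_PATTERN = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<min>\d+)\s*-\s*(?P<max>\d+)\s*(?P<unit>\S+)\s*$"
)


def parse_text(text):
    """Try to recover name, min, max and unit from free text, else None."""
    m = TEXT_PATTERN.match(text)
    if m is None:
        return None
    return {
        "name": m.group("name").strip(),
        "min": int(m.group("min")),
        "max": int(m.group("max")),
        "unit": m.group("unit"),
    }


parsed = parse_text("Input Voltage: 110-250 V")
variant = parse_text("Input Voltage: 110 to 250 Volts")
```

The structured record can be queried directly with accepts(structured, 230). The text variant works only as long as every site writes the range exactly the same way; the second call, with a slightly different wording, already falls through and yields None.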

eClass alone precisely defines 16,000 properties for product types, and those are widely in use. Do you suggest adding all 16,000 to http://schema.org/Product?

And yes, in theory one could make an OWL DL vocabulary out of all of this, see [1] and [2]. The problem is, among many other problems, that "ontologizing" copyrighted standards is non-trivial, which is why we cannot host the OWL versions of eClass, UNSPSC, CPV, ProfiClass, etc. that our tool [2] generates, and that owners of data are typically not able to map their product feature data to those standards.

Sorry for being so frank, but I am constantly annoyed by people who complain every time schema.org and related developments do not follow their Semantic Web assumptions and predictions.

Martin

[1] Products and Services Ontologies: A Methodology for Deriving OWL Ontologies from Industrial Categorization Standards, in: Int'l Journal on Semantic Web & Information Systems (IJSWIS), Vol. 2, No. 1, pp. 72-99, January-March 2006.
[2] http://wiki.goodrelations-vocabulary.org/Tools/PCS2OWL
Received on Wednesday, 30 April 2014 23:26:55 UTC
