Re: Generic Property-Value Proposal for Schema.org

On 04/30/2014 01:43 PM, martin.hepp@ebusiness-unibw.org wrote:
> Peter:
> On 29 Apr 2014, at 15:47, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>
>> There appears to be quite a lot here. As far as I can tell, the essence is to have a special property whose values are some sort of structure that represents some sort of pair of some sort of relationship and some sort of value.
> Yes. It is about providing a mechanism that allows site owners to expose core meta-data for their content, even if they cannot lift their data to a higher degree of formality.
>
>> The fly in this ointment is in all the "some sort"s above.
> This is a design feature, not a bug, same as ambiguity in human languages is often a feature, not a bug. We allow sites to speak in data even if they cannot speak Oxford English.

I firmly believe that this *is* a bug.  I don't see any significant advantage 
of this proposal over allowing the attachment of RDB-style tables to 
entities.  Consumers will have to handle a wide variety of "columns" with 
little or no commonality between information coming from different sources.
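
To make the concern concrete, here is a minimal sketch.  The field names
follow the four items listed in the proposal discussion (name, value, unit,
hint of a standard); the class name, the two records, and the comparison
function are hypothetical illustrations, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PropertyValue:
    """One generic property-value pair as a consumer might receive it."""
    name: str
    value: str
    unit: Optional[str] = None
    standard: Optional[str] = None  # hint of a standard defining the property


# Two hypothetical sites describing the same product feature.
site_a = PropertyValue(name="Weight", value="1.2", unit="kg")
site_b = PropertyValue(name="Gewicht (netto)", value="1200", unit="g")


def naive_same_property(a: PropertyValue, b: PropertyValue) -> bool:
    """Compare property names literally, ignoring case and outer whitespace."""
    return a.name.strip().lower() == b.name.strip().lower()


print(naive_same_property(site_a, site_b))  # False
```

Both records plausibly describe the same feature, but nothing in the data
says so; the consumer has to work that out per source.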

Sure, if you have considerable resources, you may be able to make sense of the 
heterogeneity, but I thought that the idea behind schema.org was to put some 
homogeneity on information, i.e., precisely to move away from the difficult 
aspects of human languages.
>
>> How are consumers of this information supposed to treat it? For example, what happens when there are multiple values, or the value doesn't fit within the min and max, or there are any number of situations that do not fit within the simple cases?
> They will have to post-process this "proto-data" and apply a lot of heuristics, machine learning, NLP to lift the raw data to the data they use for the final purpose. This is the very nature of processing data from Web markup at scale, see my post on "proto-data", http://lists.w3.org/Archives/Public/public-vocabs/2013Oct/0293.html.
>
> But if Web sites are able to expose the core meta-data for such data, like
>
> - the name of the property
> - the value
> - the unit
> - some hint of a standard that defines this property
>
> this is already a huge improvement over the state of the art.


I just don't see the advantage here.  Maybe there will be commonalities, but 
then surely the way forward is to put these commonalities into schema.org.

>
>> There are several examples on the proposal page (look at intervals and ranges) that don't fit within the simple cases, showing how easy it is to slip outside the simple cases.
>>
> With mark-up at Web scale, there is no black-and-white view of what is inside and outside the intended cases.

Umm. I said "simple", not "intended".  The point here is that if even the 
early examples slip into cases where the data values include non-formal 
aspects, then the consumer processing is going to be very messy and error-prone.
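
A rough sketch of what I mean, assuming values arrive as plain strings.  The
sample strings and the function name are made up for illustration and are
not taken from the proposal page.

```python
from typing import Optional


def parse_numeric(value: str) -> Optional[float]:
    """Return the value as a float, or None if it is not a plain number."""
    try:
        return float(value)
    except ValueError:
        return None


# Hypothetical values: one simple case, then a range and a hedged quantity.
samples = ["42", "10-20", "approx. 5"]
parsed = [parse_numeric(s) for s in samples]
print(parsed)  # [42.0, None, None]
```

Only the first value is usable as-is.  Everything else needs source-specific
handling, and each such heuristic is another place to get things wrong.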
>
> As a side remark:
>
> I have spent the last ten years building product ontologies in OWL DL that extend GoodRelations by classes and properties, in total more than 40 such ontologies, see http://wiki.goodrelations-vocabulary.org/Vocabularies, with 40,000 classes and maybe 20,000 properties. They are perfect for a data consumer, and they are used in applications. However, we have not been able to convince site-owners at scale to use such vocabularies for marking up their content. The main reason for that is that they have a very, very hard time lifting and cleansing their data to that level of formality.

Then let's stick to scraping web pages.

>
> Martin


peter

>
>> peter
>>
>>
>> On 04/29/2014 02:42 AM, martin.hepp@ebusiness-unibw.org wrote:
>>> Dear all:
>>>
>>> I have just finalized a proposal on how to add support for generic property-value pairs to schema.org. This serves three purposes:
>>>
>>> 1. It will allow exposing product feature information from thousands of product detail pages from retailers and manufacturers.
>>> 2. It will simplify the development of future extensions for specific types of products and services, because we no longer need to standardize and define all relevant properties in schema.org and can instead defer the interpretation to the client.
>>> 3. It will serve as a clean, generic extension mechanism for properties in schema.org.
>>>
>>> The proposal with all examples is here:
>>>
>>>      https://www.w3.org/wiki/WebSchemas/PropertyValuePairs
>>>
>>> Your feedback will be very welcome.
>>>
>>> Best wishes / Mit freundlichen Grüßen
>>>
>>> Martin Hepp
>>> -----------------------------------
>>> martin hepp  http://www.heppnetz.de
>>> mhepp@computer.org          @mfhepp
>>>
>>>
>>>
>>>
>>>
>>

Received on Wednesday, 30 April 2014 23:03:55 UTC