Re: Generic Property-Value Proposal for Schema.org

From: Jason Douglas <jasondouglas@google.com>
Date: Wed, 30 Apr 2014 23:28:25 +0000
Message-ID: <CAEiKvUAFjuAnDL68svdJg423UCG2V6UNpT2c-mrPPrZz1dYHkA@mail.gmail.com>
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, "martin.hepp@ebusiness-unibw.org" <martin.hepp@ebusiness-unibw.org>
Cc: W3C Web Schemas Task Force <public-vocabs@w3.org>
If this is just for product specs, then why not propose the denormalization
at that constrained level rather than as a global concept?

I believe the sports working group was considering something similar for
"sports statistic."

-jason

On Wed Apr 30 2014 at 4:04:35 PM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:

>
> On 04/30/2014 01:43 PM, martin.hepp@ebusiness-unibw.org wrote:
> > Peter:
> > On 29 Apr 2014, at 15:47, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> >
> >> There appears to be quite a lot here. As far as I can tell, the
> >> essence is to have a special property whose values are some sort of
> >> structure that represents some sort of pair of some sort of
> >> relationship and some sort of value.
> > Yes. It is about providing a mechanism that allows site owners to expose
> > core meta-data for their content, even if they cannot lift their data to
> > a higher degree of formality.
> >
> >> The fly in this ointment is in all the "some sort"s above.
> > This is a design feature, not a bug, same as ambiguity in human
> > languages is often a feature, not a bug. We allow sites to speak in data
> > even if they cannot speak Oxford English.
>
> I firmly believe that this *is* a bug.  I don't see any significant
> advantage of this proposal over allowing the attachment of RDB-style
> tables to entities.  Consumers will have to handle a wide variety of
> "columns" with little or no commonality between information coming from
> different sources.
>
> Sure, if you have considerable resources, you may be able to make sense
> of the heterogeneity, but I thought that the idea behind schema.org was
> to put some homogeneity on information, i.e., precisely to move away
> from the difficult aspects of human languages.
> >
> >> How are consumers of this information supposed to treat it? For
> >> example, what happens when there are multiple values, or the value
> >> doesn't fit within the min and max, or there are any number of
> >> situations that do not fit within the simple cases?
> > They will have to post-process this "proto-data" and apply a lot of
> > heuristics, machine learning, NLP to lift the raw data to the data they
> > use for the final purpose. This is the very nature of processing data
> > from Web markup at scale, see my post on "proto-data",
> > http://lists.w3.org/Archives/Public/public-vocabs/2013Oct/0293.html.
> >
> > But if Web sites are able to expose the core meta-data for such data,
> > like
> >
> > - the name of the property
> > - the value
> > - the unit
> > - some hint of a standard that defines this property
> >
> > this is already a huge improvement over the state of the art.
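
As a rough illustration of the four items listed above, a consumer might model one such pair as follows. This is only a minimal Python sketch: the class name `PropertyValue`, its field names, and the helper `numeric_value` are assumptions made for illustration and are not taken from the proposal page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PropertyValue:
    """One generic property-value pair as a site might expose it (hypothetical shape)."""
    name: str                       # the name of the property, as published
    value: str                      # the raw value, kept as text
    unit: Optional[str] = None      # the unit, if the site states one
    standard: Optional[str] = None  # hint of a standard that defines the property


def numeric_value(pv: PropertyValue) -> Optional[float]:
    """Return the value as a float, or None when it is not a plain number."""
    try:
        return float(pv.value)
    except ValueError:
        return None


# Made-up example data: one clean value and one that needs further processing.
weight = PropertyValue(name="Weight", value="1.5", unit="kg")
fuzzy = PropertyValue(name="Screen size", value="about 15 inches")

print(numeric_value(weight))  # 1.5
print(numeric_value(fuzzy))   # None
```

The second example hints at the disagreement in this thread: values that are not plain numbers fall through to whatever heuristics the consumer is able to apply.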
>
>
> I just don't see the advantage here.  Maybe there will be commonalities,
> but then surely the way forward is to put these commonalities into
> schema.org.
>
> >
> >> There are several examples on the proposal page (look at intervals
> >> and ranges) that don't fit within the simple cases, showing how easy
> >> it is to slip outside the simple cases.
> >>
> > With mark-up at Web scale, there is no black-and-white view of what is
> > inside and outside the intended cases.
>
> Umm. I said "simple", not "intended".  The point here is that if even the
> early examples slip into cases where the data values include non-formal
> aspects, then the consumer processing is going to be very messy and
> error-prone.
> >
> > As a side remark:
> >
> > I have spent the last ten years building product ontologies in OWL DL
> > that extend GoodRelations by classes and properties, in total more than
> > 40 such ontologies, see
> > http://wiki.goodrelations-vocabulary.org/Vocabularies, with 40,000
> > classes and maybe 20,000 properties. They are perfect for a data
> > consumer, and they are used in applications. However, we have not been
> > able to convince site-owners at scale to use such vocabularies for
> > marking up their content. The main reason for that is that they have a
> > very, very hard time lifting and cleansing their data to that level of
> > formality.
>
> Then let's stick to scraping web pages.
>
> >
> > Martin
>
>
> peter
>
> >
> >> peter
> >>
> >>
> >> On 04/29/2014 02:42 AM, martin.hepp@ebusiness-unibw.org wrote:
> >>> Dear all:
> >>>
> >>> I have just finalized a proposal on how to add support for generic
> >>> property-value pairs to schema.org. This serves three purposes:
> >>>
> >>> 1. It will allow exposing product feature information from thousands
> >>>    of product detail pages from retailers and manufacturers.
> >>> 2. It will simplify the development of future extensions for specific
> >>>    types of products and services, because we no longer need to
> >>>    standardize and define all relevant properties in schema.org and
> >>>    can instead defer the interpretation to the client.
> >>> 3. It will serve as a clean, generic extension mechanism for
> >>>    properties in schema.org
> >>>
> >>> The proposal with all examples is here:
> >>>
> >>>      https://www.w3.org/wiki/WebSchemas/PropertyValuePairs
> >>>
> >>> Your feedback will be very welcome.
> >>>
> >>> Best wishes / Mit freundlichen Grüßen
> >>>
> >>> Martin Hepp
> >>> -----------------------------------
> >>> martin hepp  http://www.heppnetz.de
> >>> mhepp@computer.org          @mfhepp
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
>
>
>
Received on Wednesday, 30 April 2014 23:28:57 UTC