Re: Generic Property-Value Proposal for Schema.org

Martin,

Can you give some examples of how this style of data could be used by a
search engine or aggregator to drive interesting features? It seems like
it's pushing too much work to the consumer side. Every different
website/producer will come up with their own different terminology for the
same attributes, which sort of defeats the purpose of a common vocabulary.
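
To make the concern concrete, here is a minimal Python sketch of what a
consumer might end up doing with such markup. Everything in it is an
assumption for illustration only: the record layout (plain dicts with
name/value/unitText keys), the SYNONYMS table, the normalize() helper, and
the sample data from two imaginary sites. None of it comes from the proposal.

```python
# Hypothetical consumer-side normalization of schema:PropertyValue pairs.
# The synonym table and all site data below are invented for illustration.

SYNONYMS = {
    "approx. weight": "weight",
    "weight": "weight",
    "gewicht": "weight",
    "interface": "interface",
    "connectivity": "interface",
}


def normalize(pairs):
    """Map free-text property names to canonical keys.

    Returns (known, unknown): known is a dict keyed by canonical name,
    unknown is a list of the pairs the consumer could not interpret.
    """
    known, unknown = {}, []
    for pair in pairs:
        key = SYNONYMS.get(pair["name"].strip().lower())
        if key is None:
            unknown.append(pair)
        else:
            known[key] = (pair["value"], pair.get("unitText"))
    return known, unknown


# Two imaginary producers describing the same kind of product.
site_a = [
    {"name": "Approx. Weight", "value": "450", "unitText": "gram"},
    {"name": "Interface", "value": "USB"},
]
site_b = [
    {"name": "Gewicht", "value": "0.45", "unitText": "kg"},
    {"name": "Connectivity", "value": "USB"},
    {"name": "Colour", "value": "black"},
]

known_a, unknown_a = normalize(site_a)
known_b, unknown_b = normalize(site_b)
```

Even in this toy case the consumer has to maintain the synonym table, still
ends up with weights in two different units, and cannot interpret "Colour"
at all until someone adds it to the table.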

Thanks,
Justin

On Wednesday, April 30, 2014, martin.hepp@ebusiness-unibw.org <
martin.hepp@ebusiness-unibw.org> wrote:

> Dear Francois-Paul:
> On 30 Apr 2014, at 09:14, Francois-Paul Servant <
> francoispaulservant@gmail.com> wrote:
>
> > Dear Martin,
> >
> > some remarks regarding your proposal.
> >
> > Regarding the motivations:
> > - I agree that there is a strong motivation for such a proposal, and you
> name it in your second design principle: "No Lifting and Cleansing Barrier:
> Do not force site owners to lift or cleanse existing data."
> > You may have very precise data describing your products in a table that
> you could very well publish as it is, but it is difficult to map
> columns and cells to external vocabularies (if such vocabularies exist). It
> should be possible to lift the data later.
>
> Great, thanks! I think automotive is a really nice example - we typically
> have lots of relevant car features, but it will be very tiring to define a
> global standard for all marketing-relevant features (and their
> authoritative translations etc.).
>
> >
> > - I'm less convinced by the argument "generic extension mechanism for
> properties at the level of schema.org". As you note, using external
> properties is a problem in microdata. But it is not the case in RDFa or in
> JSON-LD: RDF, by itself, provides a generic pattern for exposing
> characteristics for entities. I don't think that it is a big effort for a
> site owner to mint a URI for an additional property.
> >
> That is perfectly fine from my perspective. In fact, this is just a
> by-product of the proposal and I wanted to disclose that properly.
> However, note that e.g. for smaller sites, it is indeed a -- at least
> perceived -- problem to mint a URI for an additional property. Even big
> automotive players had external support for defining their OWL
> vocabularies;-)
>
> Think of hotels for instance - if they define room features, they will
> often not be able to use an existing URI nor define their own.
>
> We are in agreement that
>
> 1. in non-microdata syntax, external properties are in principle no
> problem and
> 2. in general, properly defined properties with a URI will be better, if
> available.
>
> The proposal is about filling that gap.
>
>
> > Regarding the proposal itself: in order to avoid having to define many
> properties in schema.org, you propose an alternative, simplified way to
> write s p o triples when describing a resource s, using one and only one
> property, schema:additionalProperty, whose range is schema:PropertyValue.
> Basically, PropertyValue is a pair (property,value). You describe a
> PropertyValue using a few properties: schema:name, schema:value,
> schema:unitText, etc.
> >
> > I would keep and make explicit the (property,value) pair structure,
> using two dedicated properties (say): schema:property and schema:object,
> both with domain PropertyValue
> > Why? to make it possible to easily lift data published using
> schema:additionalProperty, in bulk.
>
> If I understand you correctly, you are proposing to create individual
> nodes for the property name part and for the value part. I have looked at
> the proposal, but
>
> - I see no gain in using a dedicated property node for the property name.
> If you already have a URI for the property, simply use propertyID with the
> URI of the property and omit the schema:name. That is as simple as your
> proposal.
>
> - If you already have a URI for the value (e.g. for a qualitative value),
> you can use schema:value directly with that URI.
>
> I will show that in your examples:
>
> >
> > Let's take some of your examples to explain it:
> >
> > <div itemscope itemtype="http://schema.org/Product">
> >       <img itemprop="image" src="camera123.jpg" />
> >       <span itemprop="name">Digital Camera 123</span>
> >       <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
> >               <span itemprop="name">Approx. Weight</span>
> >               <span itemprop="value">450</span>
> >               <span itemprop="unitText">gram</span>
> >       </div>
> >       <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
> >               <span itemprop="name">Interface</span>:
> >               <span itemprop="value">USB</span>
> >       </div>
> > </div>
> >
> > that is, in Turtle (for readability):
> >
> > [     a schema:Product;
> >       schema:image x:camera123.jpg;
> >       schema:name "Digital Camera 123";
> >       schema:additionalProperty [
> >               a schema:PropertyValue;
> >               schema:name "Approx. Weight";
> >               schema:value "450";
> >               schema:unitText "gram"
> >       ];
> >       schema:additionalProperty [
> >               a schema:PropertyValue;
> >               schema:name "Interface";
> >               schema:value "USB";
> >       ]
> > ]
>
>
> Yes
>
> >
> > I suggest writing instead:
> >
> > [     a schema:Product;
> >       schema:image x:camera123.jpg;
> >       schema:name "Digital Camera 123";
> >       schema:additionalProperty [
> >               a schema:PropertyValue;
> >               schema:property [
> >                       schema:name "Approx. Weight"
> >               ];
> >               schema:object [
> >                       schema:value "450";
> >                       schema:unitText "gram"
> >               ]
> >       ];
> >       schema:additionalProperty [
> >               a schema:PropertyValue;
> >               schema:property [
> >                       schema:name "Interface"
> >               ];
> >               schema:object [
> >                       schema:value "USB";
> >               ]
> >       ]
> > ]
> >
> > Not really different, not more difficult to produce, arguably more blank
> nodes.
>
> In Microdata, it would be more difficult to produce. Also, we would need
> (or should then at least have) a type for these subnodes.
>
> Your proposal in Microdata would look as follows:
>
> <div itemscope itemtype="http://schema.org/Product">
>         <img itemprop="image" src="camera123.jpg" />
>         <span itemprop="name">Digital Camera 123</span>
>         <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>                 <div itemprop="property" itemscope itemtype="http://schema.org/Property">
>                         <span itemprop="name">Approx. Weight</span>
>                 </div>
>                 <div itemprop="object" itemscope itemtype="http://schema.org/StructuredValue">
>                         <span itemprop="value">450</span>
>                         <span itemprop="unitText">gram</span>
>                 </div>
>         </div>
>         <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>                 <div itemprop="property" itemscope itemtype="http://schema.org/Property">
>                         <span itemprop="name">Interface</span>:
>                 </div>
>                 <div itemprop="object" itemscope itemtype="http://schema.org/QuantitativeValue">
>                         <span itemprop="value">USB</span>
>                 </div>
>         </div>
> </div>
>
> That is 21 lines, compared with 13 lines for the initial proposal:
>
> <div itemscope itemtype="http://schema.org/Product">
>         <img itemprop="image" src="camera123.jpg" />
>         <span itemprop="name">Digital Camera 123</span>
>         <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>                 <span itemprop="name">Approx. Weight</span>
>                 <span itemprop="value">450</span>
>                 <span itemprop="unitText">gram</span>
>         </div>
>         <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>                 <span itemprop="name">Interface</span>:
>                 <span itemprop="value">USB</span>
>         </div>
> </div>
>
> It is doable to modify the proposal, but from a Web markup perspective, I
> am not convinced. My main concern is not so much the additional code as
> such, but the experience that each additional level of nesting makes RDFa
> and Microdata coding more error-prone and intellectually more challenging.
>
> Imagine doing this in a non-trivial table in RDFa or Microdata. It will be
> very painful.
>
>
> > The point is that in many cases, you have URIs for the values, or you
> can easily mint them from your own codification. And you can therefore
> easily produce, say:
> >
> > [     a schema:Product;
> >       schema:image x:camera123.jpg;
> >       schema:name "Digital Camera 123";
> >       schema:additionalProperty [
> >               a schema:PropertyValue;
> >               schema:property foo:approxWeight;
> >               schema:object [
> >                       schema:value "450";
> >                       schema:unitText "gram"
> >               ]
> >       ];
> >       schema:additionalProperty [
> >               a schema:PropertyValue;
> >               schema:property foo:interface;
> >               schema:object foo:USB
> >       ]
> > ]
> > foo:approxWeight schema:name "Approx. Weight".
> > foo:interface schema:name "Interface".
> > foo:USB schema:value "USB".
> >
> > The advantage here is that this data can be later improved, for instance
> stating:
> >
> > foo:approxWeight rdfs:subPropertyOf schema:weight.
> > foo:USB owl:sameAs dbpedia:USB.
> >
> > this can be done without any impact on the source systems, on the actual
> production of the data, or on data that are already published: you can
> write the statements above once and lift all corresponding records at once.
>
> I think we should separate the issue of consuming this data in RDF worlds
> from the perspective of mark-up. My assumption of consuming such data in
> RDF worlds is that with SPARQL CONSTRUCT rules (and a few heuristics),
> RDF-based consumers will transform the property-value pairs into local
> schemas in RDFS or OWL or map the data to existing vocabularies (like
> http://purl.org/vso/ns).
>
> As long as the nodes are blank nodes, you cannot add a name later on
> anyway, so SPARQL CONSTRUCT works as well.
>
> It may not be obvious, but we disagree only on one small point: whether
> future lifting and cleansing should happen on the original node (often a
> BNode), or in a copy of that data in the target data structure.
>
> Note also that in pure RDF worlds, including RDFa, there is no strong need
> to use the new pattern. You can always use proper RDF or OWL properties.
> The only downside is that search engines may skip such additional
> properties.
>
> If you are referring to externally defined URIs for the value or property,
> you can directly use those:
>
> <div itemscope itemtype="http://schema.org/Car">
>   <img itemprop="image" src="station_waggon123.jpg" />
>   <span itemprop="name">Station Waggon 123</span>
>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>           <span itemprop="name">Gearbox Type</span>:
>           <link itemprop="value" href="http://purl.org/vvo/ns#GearboxDSG" />VW DSG
>           <link itemprop="propertyID" href="http://purl.org/vvo/ns#gearbox" />
>   </div>
> </div>
>
> In RDFa and JSON-LD, you could of course directly use the equivalent of
>
> s vvo:gearbox vvo:GearboxDSG .
>
> But even in this borderline case I think that my proposal has advantages,
> since a search engine can partly process the meta-data without fully
> understanding the external vocabulary.
>
>
> >
> > A question we would then ask is the question of rules that can be linked
> to the use of schema:additionalProperty. Is it equivalent to state:
> > s schema:additionalProperty [
> >       schema:property p;
> >       schema:object o
> > ]
> >
> > and s p o?
> >
> In my proposal: Formally, no. But a client would likely consolidate this.
>
> However, I would like to keep the discussion of the exact processing of
> such data out of this thread, for eventually, the sponsors of schema.org
> will have to decide whether and how they will use such mark-up.
>
>
> > Also note that in many cases, you actually don't care about the
> property. An example describing cars:
> > [     a vso:Vehicle;
> >       schema:additionalProperty [
> >               schema:object [ schema:name "Sunroof" ]
> >       ],[
> >               schema:object dbpedia:Diesel
> >       ]
> > ]
> >
> > but we probably would prefer to write something like:
> > [     a vso:Vehicle;
> >       schema:feature [ schema:name "Sunroof"],
> >       schema:feature dbpedia:Diesel
> > ]
> >
>
> I think that Sunroof: Yes and fuel type: Diesel would be better and not
> more difficult to produce:
>
> <div itemscope itemtype="http://schema.org/Car">
>   <img itemprop="image" src="station_waggon123.jpg" />
>   <span itemprop="name">Station Waggon 123</span>
>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>           <span itemprop="name">Sunroof</span>
>           <meta itemprop="value" content="True">
>   </div>
>   <div itemprop="additionalProperty" itemscope itemtype="http://schema.org/PropertyValue">
>           <span itemprop="name">Fuel type</span>:
>           <link itemprop="value" href="http://dbpedia.org/resource/Diesel" />Diesel
>   </div>
> </div>
>
>
> >
> > (note BTW that your use of schema:name for the PropertyValue is a bit
> incorrect, as you do not use it to label the PropertyValue pair, but the
> property. A schema:name for the second of the examples should probably be
> "Interface: USB" - but Ok, that's not important)
> That is a separate issue to discuss. I thought about schema:propertyName,
> but then again, it is in most cases redundant, and I see little harm in
> overloading schema:name here. I have added it to the list of issues.
>
>
>
> > Best Regards,
> >
> > fps
> >
>
> Thanks for your substantial feedback!
>
> Best
>
> Martin
>
> > On 29 Apr 2014, at 11:42, martin.hepp@ebusiness-unibw.org wrote:
> >
> > Dear all:
> >
> > I have just finalized a proposal on how to add support for generic
> property-value pairs to schema.org. This serves three purposes:
> >
> > 1. It will allow exposing product feature information from thousands of
> product detail pages from retailers and manufacturers.
> > 2. It will simplify the development of future extensions for specific
> types of products and services, because we no longer need to standardize
> and define all relevant properties in schema.org and can instead defer
> the interpretation to the client.
> > 3. It will serve as a clean, generic extension mechanism for properties
> in schema.org
> >
> > The proposal with all examples is here:
> >
> >   https://www.w3.org/wiki/WebSchemas/PropertyValuePairs
> >
> > Your feedback will be very welcome.
> >
> > Best wishes / Mit freundlichen Grüßen
> >
> > Martin Hepp
> > -----------------------------------
> > martin hepp  http://www.heppnetz.de
> > mhepp@computer.org          @mfhepp
> >

Received on Wednesday, 30 April 2014 21:56:25 UTC