Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages

From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Date: Wed, 6 Oct 2010 19:49:10 +0200
Cc: public-lod@w3.org, semantic-web@w3.org
Message-Id: <21747522-3AD2-47D1-BB3A-FC672ADE4F3C@ebusiness-unibw.org>
To: Michael F Uschold <uschold@gmail.com>

Hi Michael,

> Michael,
>
> I had a look at some of the examples. Noteworthy is the apparent
> lack of any product ontology.  Martin's example is for a
> camera with housing. An obvious way to model this is as a bundle
> with two things: one of type Video Camera and one of type
> UnderWaterHousing.  There is nothing of this sort. Rather, this and
> perhaps all 900,000 items are of type: Product.  In other words,
> there is no semantics at all for the products, no types, no
> features, no constraints, nothing.
>
> Have I missed something?
Yes, two things:

1. It is a dangerous misconception to expect the original data  
publisher to do all the data cleansing and linking. Providers of  
dataspaces or complementing data services can add the missing pieces  
or cleanse the raw data from the LOD space.

2. Part of the product semantics can be originally exposed in textual  
properties and tokenized or extracted by someone else.

Take this data:

foo:myproduct a gr:ProductOrServicesSomeInstancesPlaceholder ;
	rdfs:label "Digital Camera"@en .

This is not perfect, but it's already much more accessible to SPARQL  
queries, and the armada of NLP techniques can be used to add the triple

	foo:myproduct a ceo:DigitalCamera .

in some other RDF graph anywhere on the Web.
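
As a rough illustration of that enrichment step, here is a minimal Python sketch in which a third party derives the extra type triple from the label. Plain keyword matching stands in for real NLP, triples are plain tuples rather than objects from an RDF library, and the LABEL_TO_CLASS table and the infer_types function are hypothetical names made up for this example.

```python
# Hypothetical keyword table: lowercase label phrase -> class to assert.
LABEL_TO_CLASS = {
    "digital camera": "ceo:DigitalCamera",
}


def infer_types(triples):
    """Return new (subject, 'a', class) triples derived from rdfs:label values."""
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate != "rdfs:label":
            continue
        label = obj.lower()
        for phrase, cls in LABEL_TO_CLASS.items():
            if phrase in label:
                inferred.add((subject, "a", cls))
    return inferred


# The shop's raw data, as in the Turtle snippet above.
raw = {
    ("foo:myproduct", "a", "gr:ProductOrServicesSomeInstancesPlaceholder"),
    ("foo:myproduct", "rdfs:label", "Digital Camera"),
}

# The derived triples can live in a separate graph, published by someone else.
extra = infer_types(raw)
```

The point of the sketch is only that the publisher's data stays untouched; whoever runs the extraction keeps the derived triples in a graph of their own.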


>
> If this is true, the question is why.  Possibilities include:
> 	 Expedience:  It is conceptually trivial to convert the catalog to  
> RDFa this way.
It is too expensive to expect data owners to lift their existing data  
to academic expectations. You must empower them to preserve as much  
data semantics and data structure as they can provide ad hoc. Lifting  
and augmenting the data can be added later.

If you expect all shops in the world to classify their products  
according to Cyc or eClassOWL, they will not be able to publish any  
data.

> 	 First things first: it was just a first step, more semantics is  
> on the way...
In the long run, there will be an incentive to add more semantics to  
articulate your value proposition more clearly.

> 	 Lack of perceived value: Does it cost too much for what value  
> there may be?
See above - this way, publishing the data can be done easily. Adding  
Cyc or eClassOWL classification will cost a lot but not bring new  
business for the moment.

> I wonder what the value is for this first step.
Improved rendering in Yahoo plus visibility in many of the evolving  
eCommerce applications based on GoodRelations.

> I wonder whether there are plans for adding semantics to the  
> products themselves.
>
I don't know, but as I said, it need not be the retailers that add
the product master data.

Much more realistic is a scenario in which
1. shops will typically just expose *offer* data and
2. manufacturers or data intermediaries will provide fine-grained  
product *feature* data.

Overstock uses a minimal subset of GoodRelations, sufficient for SEO,  
which will become more powerful when linked to other data.

In an ideal world, they would also immediately provide  
gr:hasMakeAndModel links to the URI of the respective camera model  
data (gr:ProductOrServiceModel) and/or narrow down the semantics of  
the product placeholder node from  
gr:ProductOrServicesSomeInstancesPlaceholder to the intersection of e.g.

   gr:ProductOrServicesSomeInstancesPlaceholder

and

   http://www.ebusiness-unibw.org/ontologies/consumerelectronics/v1#DigitalCamera


Example:

PREFIX o: <http://www.overstock.com/Electronics/Bell-and-Howell-DV550UW-12MP-Digital-Video-Camera-with-Underwater-Housing/4450313/product.html#>

o:product  a gr:ProductOrServicesSomeInstancesPlaceholder,  
ceo:DigitalCamera ;
		gr:hasMakeAndModel foo:DV550UW12MP.

foo:DV550UW12MP would be the make and model master data, defined  
somewhere else, e.g. on the manufacturer's page:

foo:DV550UW12MP a gr:ProductOrServiceModel, ceo:DigitalCamera ;
	ceo:weight ..... .


But even shallow structured offer data can be very useful, as long as  
there are strong identifiers attached. If overstock.com used UPC / EAN  
codes (gr:hasEAN_UCC-13) or manufacturer's part numbers (gr:hasMPN),  
which they unfortunately don't, it would be very easy to link the  
products to matching datasheets:

# Add gr:hasMakeAndModel links between models and products on the basis
# of identical EAN_UCC codes
PREFIX gr: <http://purl.org/goodrelations/v1#>

CONSTRUCT {?product gr:hasMakeAndModel ?model}
WHERE
{
   ?model a gr:ProductOrServiceModel.
   {
     {?product a gr:ProductOrServicesSomeInstancesPlaceholder.}
     UNION
     {?product a gr:ActualProductOrServiceInstance.}
   }
   ?model gr:hasEAN_UCC-13 ?ean.
   ?product gr:hasEAN_UCC-13 ?ean.
   OPTIONAL {?product gr:hasMakeAndModel ?model2}
   # ?model2 is unbound for products without an existing link; without the
   # !bound() test the comparison would raise an error and drop those rows
   FILTER (?ean != "" && (!bound(?model2) || ?model != ?model2))
}
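
For readers who prefer to see the join outside SPARQL, the intended effect of this rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: URIs and EAN codes are plain strings held in dictionaries, the function name link_by_ean is hypothetical, and the EAN value is a made-up placeholder, not a real product code.

```python
def link_by_ean(model_eans, product_eans, existing_links):
    """Return new (product, model) pairs whose EAN codes match.

    model_eans / product_eans: dict mapping a URI to its EAN string.
    existing_links: set of (product, model) pairs that are already present.
    """
    # Index the models by EAN, ignoring empty codes.
    models_by_ean = {}
    for model, ean in model_eans.items():
        if ean:
            models_by_ean.setdefault(ean, set()).add(model)

    new_links = set()
    for product, ean in product_eans.items():
        for model in models_by_ean.get(ean, ()):
            if (product, model) not in existing_links:
                new_links.add((product, model))
    return new_links


# Placeholder data; the EAN below is invented for the example.
models = {"foo:DV550UW12MP": "0123456789012"}
products = {"o:product": "0123456789012", "o:other": ""}

links = link_by_ean(models, products, set())
```

Each returned pair corresponds to one gr:hasMakeAndModel triple that the CONSTRUCT rule is meant to add.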

Then, you can trigger the default GoodRelations axioms for adding
model features to products:

# Products inherit all product features from their product models
# unless they are defined for the products individually
PREFIX gr:   <http://purl.org/goodrelations/v1#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

CONSTRUCT {?product ?property ?valueModel.}
WHERE
{
  {
    {?product a gr:ActualProductOrServiceInstance.}
    UNION
    {?product a gr:ProductOrServicesSomeInstancesPlaceholder.}
  }
  ?model a gr:ProductOrServiceModel.
  ?product gr:hasMakeAndModel ?model.
  ?model ?property ?valueModel.
  {
    {?property rdfs:subPropertyOf gr:qualitativeProductOrServiceProperty.}
    UNION
    {?property rdfs:subPropertyOf gr:quantitativeProductOrServiceProperty.}
    UNION
    {?property rdfs:subPropertyOf gr:datatypeProductOrServiceProperty.}
  }
  OPTIONAL {?product ?property ?valueProduct.}
  FILTER (!bound(?valueProduct))
}
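
The inheritance rule can also be read as a simple dictionary merge. The Python sketch below is a hedged illustration, not the GoodRelations implementation: it assumes the model's property dictionary already contains only product feature properties (the subproperty test of the query is left out), the function name inherit_features is hypothetical, and ceo:color and all values are placeholders invented for the example.

```python
def inherit_features(product_props, model_props, has_model):
    """Copy model features to products that do not define them themselves.

    product_props / model_props: dict mapping a URI to {property: value}.
    has_model: dict mapping a product URI to its model URI.
    Returns {product: {property: value}} with the inherited features only.
    """
    inherited = {}
    for product, model in has_model.items():
        own = product_props.get(product, {})
        for prop, value in model_props.get(model, {}).items():
            if prop not in own:  # values stated for the product take precedence
                inherited.setdefault(product, {})[prop] = value
    return inherited


# Placeholder data: property names and values are illustrative only.
model_props = {
    "foo:DV550UW12MP": {"ceo:weight": "model-weight", "ceo:color": "model-color"}
}
product_props = {"o:product": {"ceo:color": "shop-color"}}
has_model = {"o:product": "foo:DV550UW12MP"}

result = inherit_features(product_props, model_props, has_model)
```

Here the product keeps its own ceo:color and only picks up ceo:weight from the model, which mirrors the !bound(?valueProduct) filter in the query.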


And SCHWUPP! ;-) you have very rich information about every single  
product from initially shallow shop data.

Martin

PS: The GoodRelations proprietary axioms are at

http://www.ebusiness-unibw.org/wiki/GoodRelationsOptionalAxiomsAndLinks


On 06.10.2010, at 19:15, Michael F Uschold wrote:

>
> On Wed, Oct 6, 2010 at 5:39 AM, Martin Hepp <martin.hepp@ebusiness-unibw.org> wrote:
> Dear all:
>
> I am happy to announce that overstock.com, one of the major US  
> online retailers, has just added GoodRelations rich meta-data in  
> RDFa to ALL ca. 900,000 item pages.
>
> Example:
>  http://www.overstock.com/Electronics/Bell-and-Howell-DV550UW-12MP-Digital-Video-Camera-with-Underwater-Housing/4450313/product.html
>
> Sitemap:
>  http://www.overstock.com/googlemap.xml
>
> There is still a minor bug in the markup (regarding the position of  
> the rdf:type gr:UnitPriceSpecification statement), but I will notify  
> them immediately; the bug will also not break typical GoodRelations  
> queries.
>
> Best wishes
> Martin
>
> --------------------------------------------------------
> martin hepp
> e-business & web science research group
> universitaet der bundeswehr muenchen
>
> e-mail:  hepp@ebusiness-unibw.org
> phone:   +49-(0)89-6004-4217
> fax:     +49-(0)89-6004-4620
> www:     http://www.unibw.de/ebusiness/ (group)
>         http://www.heppnetz.de/ (personal)
> skype:   mfhepp
> twitter: mfhepp
>
> Check out GoodRelations for E-Commerce on the Web of Linked Data!
> =================================================================
> * Project Main Page: http://purl.org/goodrelations/
> * Quickstart Guide for Developers: http://bit.ly/quickstart4gr
> * Vocabulary Reference: http://purl.org/goodrelations/v1
> * Developer's Wiki: http://www.ebusiness-unibw.org/wiki/GoodRelations
> * Examples: http://bit.ly/cookbook4gr
> * Presentations: http://bit.ly/grtalks
> * Videos: http://bit.ly/grvideos
>
>
> -- 
> Michael Uschold, PhD
>    LinkedIn: http://tr.im/limfu
>    Skype: UscholdM
Received on Wednesday, 6 October 2010 17:49:51 UTC