Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 07 Oct 2010 12:15:09 -0400
Message-ID: <4CADF20D.4000109@openlinksw.com>
To: Paul Houle <ontology2@gmail.com>
CC: Martin Hepp <martin.hepp@ebusiness-unibw.org>, Michael F Uschold <uschold@gmail.com>, public-lod@w3.org, semantic-web@w3.org

On 10/7/10 11:14 AM, Paul Houle wrote:
> On Wed, Oct 6, 2010 at 1:49 PM, Martin Hepp
> <martin.hepp@ebusiness-unibw.org> wrote:
>
>     It is too expensive to expect data owners to lift their existing
>     data to academic expectations. You must empower them to preserve
>     as much data semantics and data structure as they can provide ad
>     hoc. Lifting and augmenting the data can be added later.
>
>      Don't get the idea that "academic expectations" are better than 
> "commercial expectations",  they're just different.
>      The whole point of Ontology2 is to commercialize information 
> extraction with a philosophy very much like what these folks are doing:
> http://rtw.ml.cmu.edu/papers/carlson-aaai10.pdf
>       Now in some ways they've got something way more advanced than 
> what I've got:  however,  they say that their ontology is populated 
> "with 242,453 new facts with estimated precision of 74%."
>       For me,  I can't get away with an estimated precision of 74%,  
> I'd look like a total fool publishing data that dirty on the web,  
> unless I can find some way to conceal the dirt.  Talking with people 
> who are interested in semantic technology for e-commerce,  I find a 
> common desire is to not only reduce the cost of human labor but to 
> also build systems that attain superhuman accuracy in describing and 
> categorizing products (at least better accuracy than the people who 
> are doing this job today.)
>       [Note also that the rate of fact extraction these guys are doing 
> isn't so hot either... You can get 10^7-10^8 facts out of 
> dbpedia+freebase covering a similar domain]
>       From a commercial viewpoint,  imperfect data is an opportunity.

Yes, one that could enable folks like you to create "superhuman killer 
users" courtesy of the distinguishing accuracy of your particular 
Linked Data Space :-)

Your insignia (i.e., your data space URIs) is the key to controlling how 
your value works its way through the value chain (one that is inherently 
long-tailed).


>   If I didn't have other projects ahead of it in the queue,  I'd 
> seriously be thinking about building a shopping aggregator that cleans 
> up GoodRelations and other data,  reconciles product identities,  
> categorizes products,  creates good product descriptions,  and makes 
> something that improves on current affiliate marketing and comparison 
> shopping systems.

Yes!! These are the opportunities that a Linked Open Commerce Data Space 
[1] opens up.

>       Note that the beauty of an ontology is in the eyes of a user. 
> One user might want to have a broad but vague ontology of "products",  
> they are happy to say that a digital camera is a :DigitalCamera.  
> Other people might want to just cover the photography domain,  but do 
> it in great detail -- describing not only the differences between digital 
> cameras manufactured today but also lenses,  and even covering,  in 
> great detail,  vintage cameras that you might find on eBay.
>       You can't say that one of these ontologies is better than the 
> other.  The best thing is to have all of these ontologies available 
> [populated with data!] and to pick and choose the ones that fit 
> your needs.

Amen!!

Links:

1. http://linkedopencommerce.com -- Linked Open Commerce Data Space


-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Thursday, 7 October 2010 16:28:12 UTC
