- From: Paul Houle <ontology2@gmail.com>
- Date: Thu, 7 Oct 2010 11:14:19 -0400
- To: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- Cc: Michael F Uschold <uschold@gmail.com>, public-lod@w3.org, semantic-web@w3.org
- Message-ID: <AANLkTinSKnT1ajqGcrWKRyK5ayEb=YZtNupDDrsMhiUh@mail.gmail.com>
On Wed, Oct 6, 2010 at 1:49 PM, Martin Hepp <martin.hepp@ebusiness-unibw.org> wrote:

> It is too expensive to expect data owners to lift their existing data to
> academic expectations. You must empower them to preserve as much data
> semantics and data structure as they can provide ad hoc. Lifting and
> augmenting the data can be added later.

Don't get the idea that "academic expectations" are better than "commercial expectations"; they're just different.

The whole point of Ontology2 is to commercialize information extraction with a philosophy very much like what these folks are doing:

http://rtw.ml.cmu.edu/papers/carlson-aaai10.pdf

Now, in some ways they've got something way more advanced than what I've got. However, they say that their ontology is populated "with 242,453 new facts with estimated precision of 74%."

For me, I can't get away with an estimated precision of 74%. I'd look like a total fool publishing data that dirty on the web, unless I could find some way to conceal the dirt. Talking with people who are interested in semantic technology for e-commerce, I find a common desire is not only to reduce the cost of human labor but also to build systems that attain superhuman accuracy in describing and categorizing products (or at least better accuracy than the people who are doing this job today).

[Note also that the rate of fact extraction these guys are achieving isn't so hot either... You can get 10^7-10^8 facts out of DBpedia+Freebase covering a similar domain.]

From a commercial viewpoint, imperfect data is an opportunity. If I didn't have other projects ahead of it in the queue, I'd seriously be thinking about building a shopping aggregator that cleans up GoodRelations and other data, reconciles product identities, categorizes products, creates good product descriptions, and makes something that improves on current affiliate marketing and comparison shopping systems.

Note that the beauty of an ontology is in the eyes of a user. One user might want a broad but vague ontology of "products"; they are happy to say that a digital camera is a :DigitalCamera. Other people might want to cover just the photography domain, but do it in great detail -- describing not only the differences between digital cameras manufactured today but also lenses, and even covering, in great detail, vintage cameras that you might find on eBay.

You can't say that one of these ontologies is better than the other. The best thing is to have all of these ontologies available [populated with data!] and to pick and choose the ones that fit your needs.
Received on Thursday, 7 October 2010 15:14:51 UTC