- From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
- Date: Tue, 7 Jan 2014 19:41:45 +0100
- To: Barbara Starr <BarbaraStarr2009@gmail.com>, Olivier Austina <olivier.austina@gmail.com>
- Cc: SchemaDot Org <public-vocabs@w3.org>
In short: There are no freely available, large-scale e-commerce datasets, mostly because:

1. Building one is a huge effort.
2. The data is dynamic, so you have to crawl constantly.
3. There are IPR issues with releasing the results of such crawls.

My group, for instance, has several huge crawls for internal research purposes, but we cannot simply put them online without the risk of being sued, e.g. for copyright on product texts or similar. At the very least, a watertight legal clearance is beyond our abilities.

The http://webdatacommons.org/ crawl is an attempt, but since it relies on http://commoncrawl.org/, it misses the largest part of the data, because the crawl does not go deep enough into e-commerce sites.

So if you want to use such data for research purposes or your startup idea, you will have to crawl and consolidate the data on your own.

Best
Martin

On Jan 1, 2014, at 9:38 PM, Barbara Starr wrote:

> Or even here, if you are comfortable with SPARQL, etc.: http://linkedopencommerce.com/
>
> Hope that helps :)
>
> On Jan 1, 2014, at 9:18 AM, Olivier Austina <olivier.austina@gmail.com> wrote:
>
>> Hi,
>> I am looking for an e-commerce product description dataset. I am not targeting a specific product or a specific ontology for the dataset. It can be from schema.org, such as GoodRelations, or others. Any dataset is welcome. Thank you.
>>
>> Regards
>> Olivier
>>
>

--------------------------------------------------------
martin hepp
e-business & web science research group
universitaet der bundeswehr muenchen

e-mail:  hepp@ebusiness-unibw.org
phone:   +49-(0)89-6004-4217
fax:     +49-(0)89-6004-4620
www:     http://www.unibw.de/ebusiness/ (group)
         http://www.heppnetz.de/ (personal)
skype:   mfhepp
twitter: mfhepp

Check out GoodRelations for E-Commerce on the Web of Linked Data!
=================================================================
* Project Main Page: http://purl.org/goodrelations/
Received on Tuesday, 7 January 2014 18:42:08 UTC