> This year, the Billion Triple Challenge data set consists of 2 billion
> triples. The dataset was crawled during May/June 2011 using a random sample
> of URIs from the BTC 2010 dataset as seed URIs. Lots of thanks to Andreas
> Harth for all his effort put into crawling the web to compile this dataset,
> and to the Karlsruher Institut für Technologie which provided the necessary
> hardware for this labour-intensive task.
>
On a related note,
while nothing can beat a custom job, obviously,
I'd like to remind those who don't have such mighty
time/money/resources that any amount of data one might want is freely
available from the Sindice repositories for things like
this (0 to 20+ billion triples, LOD or non-LOD, microformats, RDFa,
custom filtered, etc.).
See the TREC 2011 competition
http://data.sindice.com/trec2011/download.html (1TB+ of data!), or the
recent W3C data analysis which is leading to a new recommendation
(http://www.w3.org/2010/02/rdfa/profile/data/), etc.
Just trying to help.
Congrats, of course, to the Semantic Web Challenge folks for their
great work on this long-standing initiative!
Gio