LOD Cloud Cache Stats from Kingsley Idehen on 2011-04-02 (semantic-web@w3.org from April 2011)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 02 Apr 2011 17:55:14 -0400
To: "public-lod@w3.org" <public-lod@w3.org>, Virtuoso Users <virtuoso-users@lists.sourceforge.net>, "semantic-web@w3.org" <semantic-web@w3.org>, lotico-list@googlegroups.com
Message-ID: <4D979B42.8030806@openlinksw.com>

All,

I've knocked up a Google spreadsheet that contains stats about our 21
Billion Triples+ LOD cloud cache.

On the issue of Triple Counts, you can't make sense of Data if you can't
count it. We can't depend on SPARQL-FED for distributed queries, and we
absolutely cannot depend on a Web crawl via follow-your-nose pattern
when seeking insights or answers to queries across massive volumes of data.
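
As a rough illustration of what counting triples against a single SPARQL endpoint can look like, here is a minimal Python sketch. It is not one of the queries in the spreadsheet. The endpoint URL, the helper names, and the sample response are hypothetical, and the query assumes the endpoint supports SPARQL 1.1 aggregate syntax.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the SPARQL endpoint you actually query.
ENDPOINT = "http://example.org/sparql"

# Aggregate query (SPARQL 1.1 syntax): triple count per named graph,
# largest graphs first.
COUNT_QUERY = """
SELECT ?g (COUNT(*) AS ?triples)
WHERE { GRAPH ?g { ?s ?p ?o } }
GROUP BY ?g
ORDER BY DESC(?triples)
"""


def build_request_url(endpoint, query):
    """Encode the query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


def parse_counts(results_json):
    """Map graph IRI -> triple count from a SPARQL JSON results document."""
    doc = json.loads(results_json)
    return {
        row["g"]["value"]: int(row["triples"]["value"])
        for row in doc["results"]["bindings"]
    }


# Made-up response in the standard SPARQL JSON results layout,
# used only to exercise the parser offline. The numbers are not real stats.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["g", "triples"]},
    "results": {"bindings": [
        {"g": {"type": "uri", "value": "http://example.org/graph/a"},
         "triples": {"type": "literal", "value": "1200"}},
        {"g": {"type": "uri", "value": "http://example.org/graph/b"},
         "triples": {"type": "literal", "value": "300"}},
    ]},
})

if __name__ == "__main__":
    print(build_request_url(ENDPOINT, COUNT_QUERY))
    print(parse_counts(SAMPLE_RESPONSE))
```

To run it against a live endpoint, the URL would be fetched with an Accept header of application/sparql-results+json and the response body passed to parse_counts.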

The whole BigData game is a huge opportunity for Linked Data and
Semantics to finally shine. By shine I mean: show what was previously
impossible.

Exhibit #1 -- how do we find the proverbial needle in a haystack via
ad-hoc queries at Web Scale?

Exhibit #2 -- how do we leverage faceted exploration and navigation of
massive data sets at Web Scale?

Exhibit #3 -- how do we perform ad-hoc declarative queries (of the Join
and Aggregate variety) that used to be confined to a local Oracle, SQL
Server, DB2, Informix, MySQL, etc., at Web Scale, especially if the Web
is now a Global Linked Data Space?

I've issued a challenge to all BigData players to show me a public
endpoint that allows me to perform any of the tasks above. Thus far, the
silence has been predictably deafening :-)

Links:

1.
https://spreadsheets.google.com/ccc?key=0AihbIyhlsQSxdHViMFdIYWZxWE85enNkRHJwZXV4cXc&hl=en
-- LOD Cloud Cache SPARQL stats queries and results

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

Received on Saturday, 2 April 2011 21:55:49 UTC