- From: glenn mcdonald <glenn@furia.com>
- Date: Tue, 5 Apr 2011 20:26:01 -0400
- To: lotico-list@googlegroups.com
- Cc: Kingsley Idehen <kidehen@openlinksw.com>, "public-lod@w3.org" <public-lod@w3.org>, "semantic-web@w3.org" <semantic-web@w3.org>
- Message-ID: <BANLkTimU=NUB5VqcLkWi2my1KCCivKPbxA@mail.gmail.com>
> On the issue of Triple Counts, you can't make sense of Data if you can't
> count it.

And your public instance *can't* count it, since all your non-trivial queries time out.

Also, the point of Jeff Jonas' thing about counting is not producing *some* number, but producing *correct* numbers. How many unique real-world things are represented by those billions of triples? You have no idea. This is not a failing of Virtuoso or SPARQL, but it's a terminal failing of dbpedia as a data set. And until you can explain how anybody would create a "Global Linked Data Space" that would actually *make sense* to query, it doesn't matter much whether you or anybody else can query it.

> Exhibit #1 -- how do we Find the proverbial needle in a haystack via ad-hoc
> queries at Web Scale?

Not sure what you mean by "exhibit" here. Your queries time out, so unless the needle happens to be in the first page of the haystack, you're not going to find it.

> Exhibit #2 -- how do we leverage faceted exploration and navigation of
> massive data sets at Web Scale?

I thought I knew what "faceted exploration" meant, but your "facet" example has nothing I recognize as a facet, so I'm not sure what your claim is here.

> Exhibit #3 -- how do we perform ad-hoc declarative queries (Join and
> Aggregates variety) that used to be confined to a local Oracle, SQL Server,
> DB2, Informix, MySQL etc.., at Web Scales esp. if the Web is now a Global
> Linked Data Space?

Again, it sounds like your effective answer is "we don't". At least not if we actually care about the results, and we want them in some reasonable amount of time. I'm actually fine with this answer, but I think you're claiming you have a different answer.

> I've issued a challenge to all BigData players to show me a public endpoint
> that allows me to perform any of the tasks above. Thus far, the silence has
> been predictably deafening :-)

I'm not sure which "BigData players" you're superciliously calling out here (and certainly me and my project aren't among them), but I suspect the "silence" is due to your challenge being both hard to follow and wildly irrelevant to their concerns. They're not concerned with public endpoints, they have very limited interest in ad-hoc-ness, they certainly don't care about the sprawling mess of dbpedia, and they can't tolerate queries that run for many seconds and still only deliver partial results. You're not engaged in the same enterprise. Or, more precisely, your tech and their tech may inhabit the same category in some sense, but your public demos and their private enterprise systems do not.

> Yes and No. As with all of these matters, utility lies in the eyes and fingers
> of the data beholder.

Seems like this pattern keeps reliably repeating: you post some dbpedia-based demo that, to you, demonstrates some quality of Virtuoso or some supposed virtue of Linked Data as a concept. Then somebody actually bothers to look at the details of what you posted, and points out some glaring lameness about it. Then you blame that lameness on somebody else ("That's a question for the team at RPI :-)"), and simultaneously insist on the subjectivity of all quality assessments.

I don't buy it. If you're going to include Hendler's 6.4 billion CSV-cell triples in your 21-billion brag, then you have to stand up for them and explain why they're valuable. If you're going to keep holding up dbpedia as an example, you need to start showing some actual uses of it. Show us a human use-case that it's actually good for, which "union of all Attributes associated with Entities that are associated with the pattern 'New York'" very much is not.

glenn
Received on Thursday, 7 April 2011 01:30:40 UTC