- From: Giovanni Tummarello <giovanni.tummarello@deri.org>
- Date: Wed, 21 Nov 2012 15:49:20 +0000
- To: Leo Sauermann <leo.sauermann@gnowsis.com>
- Cc: Paola Di Maio <paoladimaio10@googlemail.com>, semantic-web at W3C <semantic-web@w3c.org>
First of all, thanks for the affection Leo :) it really comes through

> First
> my BIASED OPINION on the discussion:
> money spent on a technology is a good benchmark,
> and revenue created/costs saved.
> Example: This has worked for the cloud computing industry so far arguing
> large datacenters VS your server in your attic
> Trying to evaluate enterprise or large-scale technology as semantic web
> is on "sociological grounds" is hard to do scientifically correct and
> the resulting report may also be useful only for a very limited audience.

I guess "how willing are enterprises to spend money on this" at large is indeed a kind of societal test: "is it providing ROI for the enterprises in our society".. a great thing to measure as part of the larger effort, if possible, by the consortium.

> I don't know the background and havent looked at the project, but this
> one here caught my eye:
> - "complexly structured data management systems" - is as generic as it
> can get, nobody will be able to do this in quality or reuse the results
> - "linked data benchmark council" - anyone doing linked data
> commercially of for public needs (data.gov.uk, data.gv,
> data.wien.gv.at, ...) will read a report on that topic

We truly need a better term than "linked data", or a shared initiative to define it as the right term, getting it clear of old definitions and arbitrary, unbacked "principles".

Data that RDF is good at is the data that is "complexly structured".. hard to deny. It is Knowledge Representation.

>>* HOWEVER by demanding that the ability to solve a real world PROBLEM
>>is benchmarked vs benchmarking something starting from triples and the
>>assumption that it has to be a graph the usefullness of the outcome
>>would be greaty enhanced.
>
> Gio, that argument is lost somewhere on the way, I don't get it. Your
> sentence makes no sense, it does not parse somehow. What did you really
> mean?
>

Apologies.
I mean that one way to get these benchmarks to be much more useful and reach a wider audience is to look not at the finger but at the task.

To say "I have a DB of 100 fake triples, how fast can I do these many joins" is looking at the finger.

To say "I have a real world dataset (freebase, chembl, whatever), I have to know how many entities of type X have this and that property" is looking at the actual problem to solve.

One can get the real world dataset (that will almost never be in RDF natively), port it to RDF and do a SPARQL query (cost = cost of porting to RDF; all the queries will run, probably slow)

OR

one can create a schema, or one can convert it to JSON and load it into Mongo (cost = cost of porting to Mongo, but certain queries might not really run, others might be faster).

I would love to see benchmarks that assume that there are many ways to handle structured information and show that RDF is indeed the best in some cases (while in others it is not).

> I have experienced many CIOs and CTOs benchmarking the technology I
> offer them by
> * cost reduction
> * time savings
> * opportunity costs
> * cost of alternative solutions
> * time of implementation
> * TCO
> * ROI
>
> So benchmarking something for the ability to benchmark it against
> something that could solve triple benchmark problems is not something a
> decision maker who is confronted with the choice of "to LOD or not" will
> want to think about.

I am not sure I understand you here. All those things that you mentioned fall into the category of broader and more real world tests than simply measuring edge counts and queries/s. So I guess you'd agree that seeing these aspects measured would also be very, very useful in driving adoption in industry.

.. BUT.. :) it's not our project, or at least we're not part of it, so it really really really boils down to that consortium's decisions.

Gio
Received on Wednesday, 21 November 2012 15:50:20 UTC