Re: Socio technical/Qualitative metrics for LD Benchmarks from Leo Sauermann on 2012-11-21 (semantic-web@w3.org from November 2012)

From: Leo Sauermann <leo.sauermann@gnowsis.com>
Date: Wed, 21 Nov 2012 14:39:56 +0100
To: Giovanni Tummarello <giovanni.tummarello@deri.org>
CC: paoladimaio10@googlemail.com, semantic-web at W3C <semantic-web@w3c.org>
Message-ID: <50ACD9AC.4020803@gnowsis.com>
Hi Gio, Paola,

hehe, I see academic discussion here....
so I am happy to flame in.

First
my BIASED OPINION on the discussion:
money spent on a technology is a good benchmark,
and so is revenue created/costs saved.
Example: This has worked so far for the cloud computing industry, arguing
large datacenters VS the server in your attic.
Trying to evaluate an enterprise or large-scale technology such as the
semantic web on "sociological grounds" is hard to do in a scientifically
correct way, and the resulting report may be useful only for a very
limited audience.


2nd:
Credibility comes from doing something with quality and in a way that is
useful for others to build on.
I don't know the background and haven't looked at the project, but this
one here caught my eye:
- "complexly structured data management systems" - is as generic as it
can get, nobody will be able to do this with quality or reuse the results
- "linked data benchmark council" - anyone doing linked data
commercially or for public needs (data.gov.uk, data.gv,
data.wien.gv.at, ...) will read a report on that topic

3rd:
> * if the project has a technical nature it's unlikely you'll be
> successful if you speak about sociological benchmarking.

+1

> * HOWEVER by demanding that the ability to solve a real world PROBLEM
> is benchmarked vs benchmarking something starting from triples and the
> assumption that it has to be a graph the usefulness of the outcome
> would be greatly enhanced.

Gio, that argument is lost somewhere on the way, I don't get it. Your
sentence makes no sense, it does not parse somehow. What did you really
mean?

I have experienced many CIOs and CTOs benchmarking the technology I
offer them by
* cost reduction
* time savings
* opportunity costs
* cost of alternative solutions
* time of implementation
* TCO
* ROI
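
To make the last two items concrete, here is a minimal sketch of how such a
comparison could be computed. Everything in it is an assumption for
illustration: the Option class, the tco and roi helpers, and all figures are
made up and do not come from any real project or customer.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate solution; every figure used below is a made-up placeholder."""
    name: str
    upfront_cost: float         # licences, hardware, implementation work
    yearly_running_cost: float  # hosting, maintenance, support
    yearly_savings: float       # cost reduction and time savings, in money


def tco(option: Option, years: int) -> float:
    """Total cost of ownership over the given number of years."""
    return option.upfront_cost + option.yearly_running_cost * years


def roi(option: Option, years: int) -> float:
    """Return on investment: net gain divided by total cost."""
    total_cost = tco(option, years)
    return (option.yearly_savings * years - total_cost) / total_cost


# Hypothetical numbers, only there to show the arithmetic.
lod = Option("linked data stack", 100_000.0, 20_000.0, 80_000.0)
rdbms = Option("existing RDBMS", 40_000.0, 30_000.0, 60_000.0)

for opt in (lod, rdbms):
    print(f"{opt.name}: TCO={tco(opt, 3):.0f} ROI={roi(opt, 3):.2f}")
```

The point is only that a decision maker compares alternatives on the same
money and time axes, whatever storage model sits underneath.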

So benchmarking something for the ability to benchmark it against
something that could solve triple benchmark problems is not something a
decision maker who is confronted with the choice of "to LOD or not" will
want to think about.

BUT
(and everything before the but does not count, as we know)
this is a scientific list, so it is safe to discuss the possibilities
here :-)

love and Kisses to Ireland to Gio,
Leo

It was Giovanni Tummarello who said at the right time 21.11.2012 11:04
the following words:
> Paola,
>
> am I right in understanding that you advocate some form of measuring
> the actual viability and usefulness of LOD-based solutions or systems?
> E.g. in the way you would get by interviewing senior enterprise IT
> people etc.? "Why not do it with the normal RDBMS you have? What
> are the true/real costs/savings associated with it?"
>
> if so i think this is admirable, and particularly extremely useful
> however it might be outside the scope of that EU project if they want
> to create a technical benchmark across the RDF triplestore vendors and
> graph database vendors.
>
> Unfortunate in my opinion that the project is called "linked data"
> benchmark council as opposed to "complexly structured data management
> systems" or something more neutral, credible and without the
> connotation for the web retrieval aspects.
>
> A middle ground could be to ask that the group not only benchmarks
> graph solutions e.g. RDF but also relational and nosql systems that
> can answer comparable queries given a minimum effort to be determined.
>
> At least 3 categories could be emerging:
>
> * queries which all systems could answer e.g. Mongo, even Solr
> * queries which only graph and RDFdbs can answer
> * queries which only graph systems can answer (e.g. minimum path)
>
> Note that some could say "not possible/fair, because we have RDF to
> begin with and we would have to turn a graph into a set of entity
> descriptions to load it, for example, into MongoDB". This is FALSE: we
> never have RDF to begin with, it's always a relational structure of
> some sort that is then turned into a graph. So a fair benchmark that
> shows USEFULNESS in the end could easily start from having a lot of
> entities and relationships and then move from there, either in the RDF
> direction or towards standard/NoSQL technologies.
>
> To wrap up;
>
> * if the project has a technical nature it's unlikely you'll be
> successful if you speak about sociological benchmarking.
> * HOWEVER by demanding that the ability to solve a real world PROBLEM
> is benchmarked vs benchmarking something starting from triples and the
> assumption that it has to be a graph the usefulness of the outcome
> would be greatly enhanced.
>
> e.g. "I have the descriptions of GENES from an original XML or RDBMS
> database", vs. "I have these many triples" (which is false, you never
> have triples to begin with): how do I get these answers?
>
> good luck
> Gio
>
> On Tue, Nov 20, 2012 at 2:29 PM, Paola Di Maio <paola.dimaio@gmail.com> wrote:
>> d so many socio-technical dimensions crop up in the many presentations. It
>> would be important to develop a Benchmark (or set of benchmarks) capable of
>> capturing and measuring them. I suggested that:


-- 
Dr. Leo Sauermann
CEO and Founder

mail:   leo.sauermann@gnowsis.com
mobile: +436991GNOWSIS

Try:    http://www.getrefinder.com/
Follow: http://twitter.com/Refinder
Like:   http://www.facebook.com/Refinder
Learn:  http://www.getrefinder.com/about/blog
Received on Wednesday, 21 November 2012 13:40:27 UTC