Re: Socio technical/Qualitative metrics for LD Benchmarks from Giovanni Tummarello on 2012-11-21 (semantic-web@w3.org from November 2012)

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Wed, 21 Nov 2012 10:04:04 +0000
To: paoladimaio10@googlemail.com
Cc: semantic-web at W3C <semantic-web@w3c.org>
Message-ID: <CAHHRs7hQTw1dS+dFiTTXjbSoLZwVdpfD=uyWkNSPxLSms4W2nA@mail.gmail.com>

Paola,

Am I right in understanding that you advocate some form of measuring
the actual viability and usefulness of LOD-based solutions or systems?
E.g. in the way you would get by interviewing senior enterprise IT
people etc.: "why not do it with the normal RDBMS you have? What
are the true/real costs/savings associated with it?"

If so, I think this is admirable and extremely useful. However, it
might be outside the scope of that EU project if they want
to create a technical benchmark across the RDF triplestore vendors and
graph database vendors.

It is unfortunate, in my opinion, that the project is called "linked data"
benchmark council as opposed to "complexly structured data management
systems" or something more neutral and credible, without the
connotation of the web retrieval aspects.

A middle ground could be to ask that the group benchmark not only
graph solutions, e.g. RDF, but also relational and NoSQL systems that
can answer comparable queries given a minimum effort to be determined.

At least 3 categories could emerge:

* queries which all systems could answer, e.g. Mongo, even Solr
* queries which only graph and RDF DBs can answer
* queries which only graph systems can answer (e.g. minimum path)
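
As a rough illustration of the third category, here is a minimal sketch of a
minimum path query written by hand in Python. Everything in it is an
assumption made for illustration: the function name shortest_path, the
(source, target) edge format and the sample node names are hypothetical and
do not come from any benchmark. It treats edges as undirected and counts hops.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return one minimum-hop path from start to goal, or None if unreachable."""
    # Build an undirected adjacency map from (source, target) pairs.
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    # Breadth-first search: the first time the goal is dequeued,
    # the path recorded for it has the fewest possible hops.
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical sample data, invented for this sketch.
edges = [
    ("geneA", "proteinX"),
    ("proteinX", "pathwayP"),
    ("pathwayP", "diseaseD"),
    ("geneA", "diseaseD"),
]
path = shortest_path(edges, "geneA", "diseaseD")
```

A graph system would typically offer this traversal as a built-in, whereas
with a document or search store the loop above has to live in application
code, which is the kind of effort difference the categories try to capture.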

Note that some could say "not possible/fair, because we have RDF to
begin with and we would have to turn a graph into a set of entity
descriptions to load it, for example, in MongoDB". This is FALSE: we
never have RDF to begin with, it's always a relational structure of
some sort that is then turned into a graph. So a fair benchmark that
shows USEFULNESS in the end could easily start from having a lot of
entities and relationships and then move from there, either in the RDF
direction or towards standard/NoSQL technologies.
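
To make the "start from entities and relationships" idea concrete, here is a
minimal sketch, assuming a toy source dataset held in plain Python
structures. The identifiers, attribute names and the two helper functions
(to_triples, to_documents) are all hypothetical and only meant to show that
the same starting point can be loaded in either direction.

```python
# Hypothetical source data: entities with attributes, plus relationships
# expressed as (subject id, relationship name, object id).
entities = {
    "gene:1": {"type": "Gene", "name": "geneA"},
    "disease:7": {"type": "Disease", "name": "diseaseD"},
}
relationships = [("gene:1", "associatedWith", "disease:7")]


def to_triples(entities, relationships):
    """Flatten entities and relationships into (subject, predicate, object) tuples."""
    triples = []
    for entity_id, attributes in entities.items():
        for key, value in attributes.items():
            triples.append((entity_id, key, value))
    triples.extend(relationships)
    return triples


def to_documents(entities, relationships):
    """Build one document per entity, embedding outgoing relationships as id lists."""
    documents = {
        entity_id: {"_id": entity_id, **attributes}
        for entity_id, attributes in entities.items()
    }
    for subject, predicate, obj in relationships:
        # Create a stub document if a relationship mentions an unknown subject.
        document = documents.setdefault(subject, {"_id": subject})
        document.setdefault(predicate, []).append(obj)
    return list(documents.values())


triples = to_triples(entities, relationships)
documents = to_documents(entities, relationships)
```

Neither output is the "native" form of the data; both are derived from the
same entity and relationship description, so a benchmark could time the
conversion and loading effort for each target as well as the queries.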

To wrap up:

* If the project has a technical nature, it's unlikely you'll be
successful if you speak about sociological benchmarking.
* HOWEVER, by demanding that the ability to solve a real-world PROBLEM
is benchmarked, vs benchmarking something starting from triples and the
assumption that it has to be a graph, the usefulness of the outcome
would be greatly enhanced.

E.g. "I have the descriptions of GENES from an original XML or RDBMS
database, how do I get these answers?" vs "I have this many triples"
(which is false: you never have triples to begin with).

good luck
Gio


On Tue, Nov 20, 2012 at 2:29 PM, Paola Di Maio <paola.dimaio@gmail.com> wrote:
> d so many socio-technical dimensions crop up in the many presentations. It
> would important to develop a Benchmark (or set of benchmarks) capable of
> capturing and measuring them. I suggested that:

Received on Wednesday, 21 November 2012 10:04:55 UTC