
Re: Algorithm evaluation on the complete LOD cloud?

From: Gannon Dick <gannon_dick@yahoo.com>
Date: Thu, 23 Apr 2015 10:14:01 -0700
Message-ID: <1429809241.60373.YahooMailBasic@web122902.mail.ne1.yahoo.com>
To: "public-lod@w3.org" <public-lod@w3.org>, SW-forum Web <semantic-web@w3.org>, Laurens Rietveld <laurens.rietveld@vu.nl>
Hi Laurens,

Ignore the hecklers, I know what you mean.

Look at the two "solutions" to the German Tank Problem: http://en.wikipedia.org/wiki/German_tank_problem

"The analyses illustrate the difference between frequentist inference and Bayesian inference.
Estimating the population maximum based on a single sample yields divergent results, while the estimation based on multiple samples is an instructive practical estimation question whose answer is simple but not obvious."
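
For concreteness, here is a minimal sketch of the two point estimates that article describes, assuming the serial numbers are distinct integers sampled without replacement from 1..N. The function names are my own, not from the article or any library. The formulas are the standard ones: the minimum-variance unbiased estimator m + m/k - 1, and the posterior mean (m - 1)(k - 1)/(k - 2) under an improper uniform prior, where m is the largest serial seen and k is the sample size.

```python
def frequentist_estimate(serials):
    """Minimum-variance unbiased estimate of N: m + m/k - 1."""
    k = len(serials)
    if k < 1:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    return m + m / k - 1


def bayesian_posterior_mean(serials):
    """Posterior mean of N under an improper uniform prior.

    Equals (m - 1)(k - 1)/(k - 2); the mean is only finite for k > 2.
    """
    k = len(serials)
    if k < 3:
        raise ValueError("posterior mean requires at least 3 observations")
    m = max(serials)
    return (m - 1) * (k - 1) / (k - 2)


if __name__ == "__main__":
    observed = [19, 40, 42, 60]  # illustrative sample, maximum 60
    print(frequentist_estimate(observed))
    print(bayesian_posterior_mean(observed))
```

With four observed serials whose maximum is 60, the two functions return 74.0 and 88.5, so the two schools disagree even on a toy sample.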

A complete LOD Cloud has "frequentist inference" labels; the LOD Cloud the hecklers want to build adds "Bayesian inference" (aka spam or spinning or semantic) labels. So what's the right answer? The right answer is that the Bayesian inference folks want you to speak <predicate>German</predicate> like them, and the frequentist inference folks just want to count Tanks correctly.

The frequentist-istas are boring, with just a single answer (transformation), and they insist on spewing normative information all over the Universe. No wonder semantic hipsters mock them. Newton, Einstein, Fermi, Dirac, Feynman ... all losers ... not smart enough to make up their own labels for things. Chaos and Informative data sets FOREVER!



On Thu, 4/23/15, Laurens Rietveld <laurens.rietveld@vu.nl> wrote:

 Subject: Algorithm evaluation on the complete LOD cloud?
 To: "public-lod@w3.org" <public-lod@w3.org>, "SW-forum Web" <semantic-web@w3.org>
 Date: Thursday, April 23, 2015, 6:21 AM
 Hi all,

 I'm doing some research on evaluating algorithms on the complete LOD cloud
 (via http://lodlaundromat.org), and am looking for existing papers and
 algorithms to

 The criteria for such an algorithm are:
 - It should be open source
 - Domain independent
 - No dependency on third data sources, such as query logs or a gold standard
 - No particular hardware dependencies (e.g. a cluster)
 - The algorithm should take a dataset as input, and produce results as output

 Many thanks in advance for any suggestions

 Best, Laurens

 VU University Amsterdam
 Faculty of Exact Sciences
 Department of Computer Science
 De Boelelaan 1081 A
 1081 HV
 address: De Boelelaan 1081
 Science Building Room
Received on Thursday, 23 April 2015 17:14:28 UTC
