- From: Gannon Dick <gannon_dick@yahoo.com>
- Date: Fri, 24 Apr 2015 09:23:07 -0700
- To: Paul Houle <ontology2@gmail.com>
- Cc: "public-lod@w3.org" <public-lod@w3.org>, SW-forum Web <semantic-web@w3.org>, Laurens Rietveld <laurens.rietveld@vu.nl>
@Gannon here. Apologies Paul, my sarcasm went a bit over the top.

If only new "creation" of list labels (data definitions) is considered, then there is only one choice of structure for "any large collection of *well* organized RDF data."

<rdf:list>
  <rdf:first>Sum partial fractions e.g. a Ground State</rdf:first>
  <rdf:rest>re-normalization group fraction</rdf:rest>
  <rdf:rest>re-normalization group fraction</rdf:rest>
  <rdf:rest>re-normalization group fraction</rdf:rest>
  ...
  <rdf:nil />
</rdf:list>

Semantic data does not need ground state change (Bayesian inference) to be useful. Inflation as homage to the "Open World Assumption" does much harm to insight. There is no need to subject the dynamics to continuous compounding (change of Radix in LOG Space), because it is already there.

--Gannon

--------------------------------------------
On Fri, 4/24/15, Paul Houle <ontology2@gmail.com> wrote:

Subject: Re: Algorithm evaluation on the complete LOD cloud?
To: "Gannon Dick" <gannon_dick@yahoo.com>
Cc: "public-lod@w3.org" <public-lod@w3.org>, "SW-forum Web" <semantic-web@w3.org>, "Laurens Rietveld" <laurens.rietveld@vu.nl>
Date: Friday, April 24, 2015, 9:39 AM

Here is my take.

The "Complete LOD cloud" is a stand-in for "any large collection of poorly organized RDF data." If you believe that RDF is a good model for representing other sorts of data, you could imagine that some big organization like Citibank or the U.S. Military has a large number of different divisions that have all sorts of data of various quality. In fact, if I look at all the files I have on my SOHO network, you could say the same is true for individuals and small biz too.

Then the right question to ask is "What methods would one use to characterize such a data set with little prior knowledge?"

That is a carefully chosen phrase.
@Gannon rails against frequentism, and there are a number of ways to reach a similar conclusion, such as

* the "grounding problem" in classical semantics
* the fact that any useful or interesting semantic system has to do something or other that is competitive with some way of doing something that is better in some way (i.e. if you don't know where you are going you are going to wind up nowhere)

Also I find the "no special hardware requirements" thing to be strange, probably because it ought to be defined in terms of "I have a machine with these specific specifications". For instance, if you had a machine with 32GB of RAM (which is pretty affordable if you don't pay OEM prices) you could load a billion triples into a triple store. If your machine is a hand-me-down laptop from a salesman who couldn't sell that has just 4GB of RAM, you are in a very different situation.

On Thu, Apr 23, 2015 at 1:14 PM, Gannon Dick <gannon_dick@yahoo.com> wrote:

Hi Laurens,

Ignore the hecklers, I know what you mean. Look at the two "solutions" to the German Tank Problem:
http://en.wikipedia.org/wiki/German_tank_problem

"The analyses illustrate the difference between frequentist inference and Bayesian inference. Estimating the population maximum based on a single sample yields divergent results, while the estimation based on multiple samples is an instructive practical estimation question whose answer is simple but not obvious."

A complete LOD Cloud has "frequentist inference" labels; the LOD Cloud the hecklers want to build adds "Bayesian inference" (aka spam or spinning or semantic) labels.

So what's the right answer?

The right answer is that the Bayesian inference folks want you to speak <predicate>German</predicate> like them and frequentist inference folks just want to count Tanks correctly. The frequentist-istas are boring, with just a single answer (transformation) and they insist on spewing normative information all over the Universe. No wonder semantic hipsters mock them.
Newton, Einstein, Fermi, Dirac, Feynman ... all losers ... not smart enough to make up their own labels for things.

Chaos and Informative data sets FOREVER!

--Gannon

--------------------------------------------
On Thu, 4/23/15, Laurens Rietveld <laurens.rietveld@vu.nl> wrote:

Subject: Algorithm evaluation on the complete LOD cloud?
To: "public-lod@w3.org" <public-lod@w3.org>, "SW-forum Web" <semantic-web@w3.org>
Date: Thursday, April 23, 2015, 6:21 AM

Hi all,

I'm doing some research on evaluating algorithms on the complete LOD cloud (via http://lodlaundromat.org), and am looking for existing papers and algorithms to evaluate.

The criteria for such an algorithm are:

* It should be open source
* Domain independent
* No dependency on third data sources, such as query logs or a gold standard
* No particular hardware dependencies (e.g. a cluster)
* The algorithm should take a dataset as input, and produce results as output

Many thanks in advance for any suggestions

Best, Laurens

--
VU University Amsterdam
Faculty of Exact Sciences
Department of Computer Science
De Boelelaan 1081 A
1081 HV Amsterdam
The Netherlands
www.laurensrietveld.nl
laurens.rietveld@vu.nl

Visiting address:
De Boelelaan 1081
Science Building Room T312

--
Paul Houle

Applying Schemas for Natural Language Processing, Distributed Systems, Classification and Text Mining and Data Lakes

(607) 539 6254
paul.houle on Skype
ontology2@gmail.com
https://legalentityidentifier.info/lei/lookup
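
For readers who want to see the two "solutions" to the German Tank Problem mentioned in this thread side by side, here is a minimal Python sketch. It is not taken from any of the messages; the function names and the sample serial numbers are illustrative assumptions. It uses the standard textbook formulas: the frequentist minimum-variance unbiased estimate m + m/k - 1, and the Bayesian posterior mean (m - 1)(k - 1)/(k - 2) under an improper uniform prior, where m is the largest observed serial number and k is the number of observations.

```python
def frequentist_estimate(serials):
    """Minimum-variance unbiased estimate of the population maximum N.

    Formula: m + m/k - 1, with m = largest serial seen, k = sample size.
    """
    k = len(serials)
    if k < 1:
        raise ValueError("need at least one observation")
    m = max(serials)
    return m + m / k - 1


def bayesian_posterior_mean(serials):
    """Posterior mean of N, assuming an improper uniform prior on N.

    Formula: (m - 1)(k - 1)/(k - 2). The mean is only finite for k >= 3,
    so smaller samples are rejected rather than given a made-up answer.
    """
    k = len(serials)
    if k < 3:
        raise ValueError("posterior mean is undefined for fewer than 3 observations")
    m = max(serials)
    return (m - 1) * (k - 1) / (k - 2)


if __name__ == "__main__":
    observed = [19, 40, 42, 60]  # hypothetical serial numbers, k = 4, m = 60
    print("frequentist:", frequentist_estimate(observed))
    print("bayesian mean:", bayesian_posterior_mean(observed))
```

With a single observation the frequentist formula still returns a number (2m - 1), while the Bayesian posterior mean under this prior does not exist, which is one way to read the "divergent results" remark in the quoted Wikipedia passage.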
Received on Friday, 24 April 2015 16:23:38 UTC