- From: Jon Awbrey <jawbrey@oakland.edu>
- Date: Thu, 25 Jan 2001 09:24:01 -0500
- To: Joshi Mukul Madhukar <mukul@cse.iitb.ernet.in>
- CC: machine-learning@egroups.com, Arisbe <arisbe@stderr.org>, RDF Logic <www-rdf-logic@w3.org>, SemioCom <semiocom@listbot.com>
Joshi, Mukul Madhukar wrote:
>
> Hi,
>
> What is the intuition or reason behind using the log term in Entropy
> used in decision trees? From a quick thought ... it gives a mapping
> to a value between 0 and 1. Any deep thought??
> Thanks
>
> ~ Mukul
>
> Seek simplicity and Distrust it.
>
> --------------------------------------------------------------
> Mukul Madhukar Joshi.
> MTech Student,
> Computer Science Department.
> Room No. 69, Hostel No. 1
> Indian Institute Of Technology, Powai.
> --------------------------------------------------------------

¤~~~~~~~~~¤~~~~~~~~~¤~~~~~~~~~¤~~~~~~~~~¤~~~~~~~~~¤

Mukul,

Taking logarithms is merely for the convenience of converting (what is probably the more natural) multiplicative measure of diversity or of variety to an additive measure, and it is the taking of averages over the appropriate denominator that gets the range of the entropy measure back to the interval [0, 1].

The history of our ideas about information in relationship to notions of entropy or uncertainty is really quite fascinating. Most people are unaware that C.S. Peirce was lecturing on the subject that he called the "Theory of Information" at Harvard as early as 1865. He initially distilled his earliest theory from a matrix (raw material) of purely logical considerations, if you count semiotic (the theory of signs) under the heading of logic, and he frequently employed a multiplicative measure of "multiplicity" as the simplest way to quantify uncertainty with respect to a multitude or a variety of choices.
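As a rough illustration of the multiplicative-versus-additive point, here is a minimal Python sketch. The names `entropy` and `multiplicity` and the coin/die example are assumptions made for illustration, not anything from the original thread. It treats entropy as the average of log(1/p) and recovers an "effective number of choices" by exponentiating, so that independent choices multiply on one scale and add on the other.

```python
import math

def entropy(probs, base=2.0):
    """Additive measure: the average of log(1/p) over the distribution."""
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

def multiplicity(probs, base=2.0):
    """Multiplicative counterpart: the effective number of choices."""
    return base ** entropy(probs, base)

# Two independent choices (hypothetical example): a fair coin and a fair die.
coin = [0.5, 0.5]
die = [1.0 / 6.0] * 6
joint = [p * q for p in coin for q in die]

# Multiplicities multiply (about 2 * 6 = 12) while entropies add.
print(multiplicity(coin), multiplicity(die), multiplicity(joint))
print(entropy(coin) + entropy(die), entropy(joint))

# Taking the log to the base "number of options" rescales to [0, 1].
print(entropy(die, base=6.0))
```

With two classes and base 2 the entropy already lies in [0, 1]; with more classes, dividing by the log of the number of classes (equivalently, changing the base) is one way to get back to that range.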
Many of these multiplicities were generated or represented by counting functions, say, of the form {f : X -> Y}, and the number of functions in this type of "function space" is given by |Y|^|X|, where |S| = Card(S) = "Cardinality of S". It is also quite common in simple problem settings to work over the same basis |Y| for extended periods of time, taking the various and sundry pieces of "information" that arise as affording "constraints" on how many options one has to consider. So it was rather natural to detach the exponent, namely, the proportionate fraction of |X| that characterized the set of possibilities that one currently had to worry over in deciding the answer to a question or the action to realize. This is tantamount to taking logarithms to the base |Y|.

I hope this explanation is not too simple to distrust entirely!

As it happens, I was just discussing this very same subject in one of my other e-fora, so I will forward you a copy of how I put it there, under a "FYSMI" (Funny You Should Mention It!) subject line cover.

Thanks For The Very Interesting Question!
May You Have Many Happy Gedankencounters!
Looking Forward Tuit,

Jon Awbrey

¤~~~~~~~~~¤~~~~~~~~~¤~~~~~~~~~¤~~~~~~~~~¤~~~~~~~~~¤
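The function-space count in the message above can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: the helper `count_functions`, the sets `X` and `Y`, and the idea of modelling a "constraint" as a known value of f at one point are all hypothetical choices, not taken from the message.

```python
import itertools
import math

def count_functions(X, Y, fixed=None):
    """Count functions f: X -> Y that agree with the known values in `fixed`."""
    fixed = fixed or {}
    count = 0
    # Each tuple of values, one per element of X, is one candidate function.
    for values in itertools.product(Y, repeat=len(X)):
        f = dict(zip(X, values))
        if all(f[x] == y for x, y in fixed.items()):
            count += 1
    return count

X = ["a", "b", "c", "d"]   # hypothetical domain, |X| = 4
Y = [0, 1, 2]              # hypothetical codomain, |Y| = 3

total = count_functions(X, Y)                     # |Y|^|X| = 3^4
remaining = count_functions(X, Y, fixed={"a": 0}) # one value known: 3^3

# The log to the base |Y| "detaches the exponent": roughly 4, then roughly 3.
print(total, round(math.log(total, len(Y)), 6))
print(remaining, round(math.log(remaining, len(Y)), 6))
```

In this toy setting each piece of information of the form "f(x) = y" removes one unit from the exponent, which is the additive bookkeeping that the logarithm makes visible.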
Received on Thursday, 25 January 2001 09:23:57 UTC