 August 22nd, 2011, 01:51 AM #1 Newbie   Joined: Aug 2011 Posts: 2 Thanks: 0 Calculating information gain (decision trees)

Hi guys! I have a problem calculating the information gain for my decision tree. During the lectures we saw a rather easy example, but I can't arrive at the correct solution and it's frustrating me! The decision tree deals with whether or not to play outside, depending on the weather conditions, and the example goes as follows:

Attribute Outlook (sunny): yes, yes, yes, no, no

Put another way: info[2,3] = entropy(2/5, 3/5) = -(2/5)log(2/5) - (3/5)log(3/5) = 0.971 bits

However, when I type -(2/5)log(2/5) - (3/5)log(3/5) on my calculator, I never get 0.971 bits. I know I'm doing something wrong, but I can't seem to remember the rules for logarithms. Can someone tell me how to get the correct answer? Thanks!
 August 22nd, 2011, 12:52 PM #2 Global Moderator     Joined: Nov 2006 From: UTC -5 Posts: 16,046 Thanks: 938 Math Focus: Number theory, computational mathematics, combinatorics, FOM, symbolic logic, TCS, algorithms Re: Calculating information gain (decision trees) You're probably using the wrong logarithm base. Base 2 gives bits; base e gives nats; etc. To convert a logarithm from any other base b to base 2, divide it by the logarithm of 2 taken in that same base: log2(x) = log_b(x) / log_b(2).
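As a worked version of that conversion, here is a sketch that assumes the calculator's log key is base 10 (an assumption; the thread does not say which base the calculator uses). It needs the amsmath package for align*, \tfrac and \text.

```latex
\begin{align*}
\log_2 x &= \frac{\log_{10} x}{\log_{10} 2} \\
\mathrm{entropy}\left(\tfrac{2}{5}, \tfrac{3}{5}\right)
  &= -\tfrac{2}{5}\log_2\tfrac{2}{5} - \tfrac{3}{5}\log_2\tfrac{3}{5} \\
  &= \frac{-\tfrac{2}{5}\log_{10}\tfrac{2}{5} - \tfrac{3}{5}\log_{10}\tfrac{3}{5}}{\log_{10} 2} \\
  &\approx \frac{0.2923}{0.3010} \approx 0.971 \text{ bits}
\end{align*}
```

Under that assumption, the uncorrected base-10 result would be about 0.292; with a natural-log key it would be about 0.673 (nats). Neither matches the 0.971 bits from the lecture until it is divided by the log of 2 in the same base.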
 August 22nd, 2011, 01:49 PM #3 Newbie   Joined: Aug 2011 Posts: 2 Thanks: 0 Re: Calculating information gain (decision trees) OK, thanks! I typed the following on my calculator and now it works: -(2/5)*(log(2/5)/log(2))-(3/5)*(log(3/5)/log(2)) = 0.971 Thanks again!
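The same calculator expression can be checked in code. The following is a minimal Python sketch, not something from the thread: the function name `entropy_bits` is hypothetical, and base-10 `math.log10` stands in for the calculator's log key (an assumption). Dividing by `math.log10(2)` performs the change of base described above.

```python
import math


def entropy_bits(counts):
    """Entropy in bits of a list of class counts, e.g. [2, 3].

    Hypothetical helper for illustration. It uses a base-10 log and
    divides by log10(2), mirroring the calculator expression in the post.
    """
    total = sum(counts)
    result = 0.0
    for count in counts:
        if count == 0:
            continue  # by convention, 0 * log(0) contributes nothing
        p = count / total
        # change of base: log2(p) = log10(p) / log10(2)
        result -= p * math.log10(p) / math.log10(2)
    return result


print(round(entropy_bits([2, 3]), 3))  # 0.971
```

Calling `math.log2(p)` directly would give the same result without the explicit division; the longer form is kept here only to match what was typed on the calculator.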
 August 22nd, 2011, 04:44 PM #4 Global Moderator Re: Calculating information gain (decision trees) No problem. I had answered the question earlier, actually, but it was lost in the server move.

