
January 25th, 2017, 06:47 PM  #1 
Senior Member Joined: Oct 2013 From: New York, USA Posts: 647 Thanks: 85  How Close To The Normal Distribution Is My Data?
I have 351 numbers. About half are on each side of the mean: 169 are higher than the mean and 182 are lower. Here is how my data compares to the normal distribution in terms of what percent of the numbers are within 0.5, 1, and 2 standard deviations of the mean:

Within 0.5 standard deviations: exactly 1/3 of my numbers, 38.3% for the normal distribution
Within 1 standard deviation: 63.5% of my numbers, 68.3% for the normal distribution
Within 2 standard deviations: 97.2% of my numbers, 95.4% for the normal distribution

The farthest any of my numbers is from the mean is 2.5334 standard deviations above it, so all of my numbers are within 3 standard deviations of the mean. It's obvious that my numbers are less likely than the normal distribution to be within 1 standard deviation of the mean and more likely than the normal distribution to be between 1 and 2 standard deviations away from the mean.

The highest 35 numbers (about 10 percent of 351) are an average of 1.728 standard deviations above the mean. The lowest 35 numbers are an average of 1.598 standard deviations below the mean. Without looking at all the numbers, would you say the numbers are close to being normally distributed?
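The reference percentages in this post can be reproduced from the standard normal CDF. The following is a minimal sketch using only Python's standard library; the helper names `share_within` and `mean_of_top_fraction` are made up for illustration, and the observed shares are copied from the post rather than computed from the (unposted) data.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def mean_of_top_fraction(p):
    """Expected z-score of the top fraction p of a standard normal distribution.

    Uses E[Z | Z > c] = pdf(c) / p, where c is the (1 - p) quantile.
    """
    cutoff = std_normal.inv_cdf(1 - p)
    return std_normal.pdf(cutoff) / p


# Observed shares as quoted in the post (351 values).
observed = {0.5: 1 / 3, 1.0: 0.635, 2.0: 0.972}

for k, obs in observed.items():
    print(f"within {k} SD: observed {obs:.1%}, normal {share_within(k):.1%}")

# Benchmark for the "highest 35 of 351" figure quoted in the post.
print(f"normal mean z of top 35/351: {mean_of_top_fraction(35 / 351):.3f}")
```

For an exactly normal population the top 10 percent averages roughly 1.75 standard deviations above the mean, which gives a yardstick for the 1.728 and 1.598 figures above.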
January 26th, 2017, 06:32 AM  #2 
Senior Member Joined: Dec 2012 From: Hong Kong Posts: 853 Thanks: 311 Math Focus: Stochastic processes, statistical inference, data mining, computational linguistics 
From your description, it sure sounds close, but you should do a chi-squared test for goodness-of-fit (or some other goodness-of-fit test) to make sure, since you didn't really provide a lot of info...

January 27th, 2017, 10:58 AM  #3 
Senior Member Joined: Oct 2013 From: New York, USA Posts: 647 Thanks: 85 
I have no idea how to do a chi-square test. https://en.wikipedia.org/wiki/Pearso...isquared_test says "the expected (theoretical) frequency of type i, asserted by the null hypothesis that the fraction of type i in the population is p_{i}" (the symbols might not look right). That sounds like it refers to whole numbers, like comparing the observed and expected number of heads from 20 coins, but that's not what I'm working with.

January 28th, 2017, 06:44 PM  #4  
Senior Member Joined: Dec 2012 From: Hong Kong Posts: 853 Thanks: 311 Math Focus: Stochastic processes, statistical inference, data mining, computational linguistics  I would just copy code from the Internet and do it in R. Joking aside...
 
January 28th, 2017, 07:02 PM  #5 
Senior Member Joined: Oct 2013 From: New York, USA Posts: 647 Thanks: 85 
Is there a way of taking a count of numbers, a mean, and a standard deviation, and having a website generate what all the numbers would be if they were normally distributed?

January 28th, 2017, 07:09 PM  #6 
Senior Member Joined: Dec 2012 From: Hong Kong Posts: 853 Thanks: 311 Math Focus: Stochastic processes, statistical inference, data mining, computational linguistics  The website wouldn't know what the bins you want are... C'mon, you can generate what you want in R or Excel 
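One way to read "what all the numbers would be" is as evenly spaced quantiles of the normal distribution. The sketch below is written in Python rather than R or Excel, uses only the standard library, and assumes a hypothetical mean of 100 and standard deviation of 15, since the thread never states the real ones. The function name `ideal_normal_values` is also made up for illustration.

```python
from statistics import NormalDist


def ideal_normal_values(n, mean, sd):
    """Return n values at the quantiles (i - 0.5) / n of Normal(mean, sd)."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


# Hypothetical mean and standard deviation, for illustration only.
values = ideal_normal_values(351, mean=100.0, sd=15.0)

# Smallest value, middle value (equal to the mean), and largest value.
print(values[0], values[175], values[-1])
```

Sorting the real data and comparing it value by value with this list is the idea behind a normal quantile (Q-Q) plot. For random draws instead of idealized quantiles, `random.gauss(mean, sd)` from the standard library could be called n times.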
January 29th, 2017, 07:11 AM  #7 
Senior Member Joined: Oct 2013 From: New York, USA Posts: 647 Thanks: 85  
January 30th, 2017, 04:41 AM  #8  
Senior Member Joined: Dec 2012 From: Hong Kong Posts: 853 Thanks: 311 Math Focus: Stochastic processes, statistical inference, data mining, computational linguistics  I'm sure you have; you just probably didn't know they were called bins. For example, when you create a histogram, you have intervals like (.5, 5.5], (5.5, 10.5], etc., and those are called bins.
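Bins are also what make a chi-squared goodness-of-fit test applicable to continuous data: count how many values land in each bin, then compare those counts with the counts a normal distribution would predict. The following is a minimal Python sketch under stated assumptions. The sample is synthetic because the thread does not list the 351 numbers, the bin edges are an arbitrary choice, and the function name `chi_squared_normal_fit` is made up for illustration.

```python
import random
from bisect import bisect_left
from statistics import NormalDist, mean, stdev


def chi_squared_normal_fit(data, z_edges):
    """Pearson chi-squared statistic comparing binned data with a fitted normal.

    z_edges are the interior bin boundaries in standard-deviation units.
    Bins are half-open like (a, b]; the outermost bins are unbounded.
    """
    m, s = mean(data), stdev(data)
    std_normal = NormalDist()

    # Observed counts: find the bin each standardized value falls into.
    observed = [0] * (len(z_edges) + 1)
    for x in data:
        observed[bisect_left(z_edges, (x - m) / s)] += 1

    # Expected counts: normal probability of each bin times the sample size.
    cdf = [0.0] + [std_normal.cdf(z) for z in z_edges] + [1.0]
    expected = [len(data) * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, observed, expected


# Synthetic stand-in for the 351 numbers (an assumption, not the real data).
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(351)]
edges = [-2, -1, -0.5, 0, 0.5, 1, 2]  # seven edges give eight bins

stat, obs, exp = chi_squared_normal_fit(sample, edges)
print(f"chi-squared = {stat:.2f} with {len(obs)} bins")
```

A large statistic suggests a poor fit. A common convention takes the degrees of freedom as the number of bins minus 1, minus the 2 parameters estimated from the data (mean and standard deviation), which gives 5 here; the 5% critical value for 5 degrees of freedom is about 11.07. If SciPy is available, `scipy.stats.chisquare` can turn observed and expected counts into a p-value.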
 
