My Math Forum  

March 12th, 2017, 01:59 AM   #1
Member
 
Joined: May 2015
From: Australia

Posts: 42
Thanks: 5

Why is my Standard Deviation larger than the mean?

Hello,

I have an assignment asking me to calculate the mean and standard deviation for Weekly housing cost.

All of the data came from 306 students who filled in the survey.

I was given raw data in Excel.
I used Excel to calculate the mean by typing '=average(' into the formula bar and selecting the appropriate category.
I also used Excel to calculate the standard deviation by typing '=stdev(' into the formula bar and selecting the appropriate category.
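For readers without Excel, the same two calculations can be sketched in Python. This is only an illustration: the `costs` list is made up and is NOT the survey data. `statistics.stdev` divides by n - 1, which is the same convention as Excel's `=STDEV(`.

```python
import statistics

# Hypothetical weekly housing costs in dollars (illustrative only, NOT the survey data).
costs = [0, 0, 0, 50, 100, 150, 400]

mean = statistics.mean(costs)   # counterpart of Excel's =AVERAGE(...)
sd = statistics.stdev(costs)    # counterpart of Excel's =STDEV(...): sample SD, divides by n - 1

print(f"mean = {mean:.2f}, sample SD = {sd:.2f}")
# prints: mean = 100.00, sample SD = 144.34
```

Even in this small made-up list, where every value is zero or positive, the SD comes out larger than the mean.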

I attached a notepad file with the raw data (I couldn't attach the Excel file with the raw data, but if you'd like to double-check my calculations, you can copy and paste the data into Excel).

I don't know why the SD is larger than the mean. It doesn't make sense.
For example, for the 'homecostwk' category, the mean is 84 dollars, and the SD is 175 dollars

That means the data spread is from -91 dollars to 259 dollars

How is that possible when the raw data values aren't even negative? Did I do the calculations wrong?

I had to do the same calculations for other categories too, such as food, entertainment and mobile phone weekly costs. The SDs for those categories were also larger than the means.
Attached Files
File Type: txt homecostwk.txt (1.1 KB, 11 views)
pianist is offline  
 
March 12th, 2017, 06:27 PM   #2
SDK
Senior Member
 
Joined: Sep 2016
From: USA

Posts: 114
Thanks: 45

Math Focus: Dynamical systems, analytic function theory, numerics
There is no fixed relationship between the standard deviation and the mean. There is no reason to expect the standard deviation to be smaller than the mean in general. Can you explain why you think it should be smaller?
SDK is offline  
March 12th, 2017, 06:37 PM   #3

Thanks so much for your reply.

I'm going to try to explain what I mean.

When I posted this question, I thought that it was odd that the SD was larger than the mean. However, now that I think about it, it is okay for the SD to be larger than the mean.

My concern is that the SD is too large for the mean. For example, regarding the weekly home costs, the mean is 84 dollars. The standard deviation is 175 dollars.

From my understanding, the SD makes the data spread from 84 - 175 = -91 dollars to 84 + 175 = 259 dollars.

This doesn't make sense to me: how can one standard deviation below the mean be a negative value?

It doesn't make sense because none of the raw data values are negative, AND how could a value be negative when the data is about weekly home costs?

Maybe I did the calculations incorrectly?
pianist is offline  
March 12th, 2017, 09:45 PM   #4
Senior Member
 
Joined: May 2016
From: USA

Posts: 684
Thanks: 284

Without seeing either the data or your spreadsheet, almost any answer is going to be a guess. If the number of data elements is small and you used the sample formula rather than the population formula for standard deviation, your computation of the standard deviation may be in error.
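For reference, the two formulas mentioned above can be written side by side. Excel's `STDEV` uses the sample form (divide by n - 1); the population form (divide by n) is what `STDEVP` / `STDEV.P` computes. The symbols s, σ and x̄ are notation chosen here, not taken from the thread.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\frac{s}{\sigma} = \sqrt{\frac{n}{n-1}} .
```

With n = 306 the ratio is sqrt(306/305), roughly 1.0016, so the choice of formula changes the result by well under 1% and cannot by itself account for an SD about twice the mean.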

Let's assume, however, that you made no error in your computation. Your concern would be valid only if negative deviations from the mean were approximately as numerous as positive deviations. But you may have a situation where there are many more negative deviations than positive deviations, in which case the average positive deviation must have a much larger absolute value than the average negative deviation. In other words, the data are skewed. In a sense, the standard deviation is a kind of average of the deviations around the mean, but the absolute values of the average positive deviation and average negative deviation may differ considerably from the standard deviation. If the data are heavily skewed, the standard deviation cannot be used in the way you are using it because it will overestimate deviations in one direction and underestimate them in the other direction.

Socioeconomic data are frequently skewed; thus you need to test your data for that.
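One rough way to check for skew is sketched below, under the assumption of a small made-up list (NOT the survey data). It compares the mean with the median, counts deviations on each side of the mean, and computes a moment-based skewness coefficient. The helper `skewness` is a hypothetical name written for this illustration.

```python
import statistics

# Hypothetical weekly costs in dollars (illustrative only, NOT the survey data).
costs = [0, 0, 0, 50, 100, 150, 400]

mean = statistics.mean(costs)
median = statistics.median(costs)

# Split the deviations from the mean by sign.
neg = [x - mean for x in costs if x < mean]
pos = [x - mean for x in costs if x > mean]
avg_neg = sum(neg) / len(neg)
avg_pos = sum(pos) / len(pos)

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5, using divide-by-n moments."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

skew = skewness(costs)

print(f"mean = {mean}, median = {median}")
print(f"{len(neg)} values below the mean, average deviation {avg_neg}")
print(f"{len(pos)} values above the mean, average deviation {avg_pos}")
print(f"skewness = {skew:.2f}")
```

In this made-up list, four values sit below the mean with an average deviation of -87.5, while only two sit above it with an average deviation of +175. A median well below the mean and a clearly positive skewness coefficient both point to a right-skewed distribution.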

Last edited by JeffM1; March 12th, 2017 at 09:52 PM.
JeffM1 is offline  
March 12th, 2017, 09:57 PM   #5

Thanks for your reply.

In my assignment I was instructed to use Microsoft Excel to calculate the mean AND standard deviation. This means I used the formula "=average( )" and "=stdev( )" to work out the values.

If it helps to get a better sense of the data I used, I have attached a notepad file containing all 306 raw data values.

The data could be skewed since the lowest data value is 0 dollars and the highest data value is 1154 dollars.

The mean is 84 dollars and SD is 175 dollars.

Does this explain why the SD below the mean is a negative value? I'm still confused about why the SD below the mean is negative.
Attached Files
File Type: txt homecostwk.txt (1.1 KB, 1 views)
pianist is offline  
March 13th, 2017, 04:32 PM   #6
Global Moderator
 
Joined: May 2007

Posts: 6,258
Thanks: 508

The distribution is very asymmetric. The standard deviation does not take that into account, so the extreme high values tend to make it bigger.
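A small sketch of that effect, assuming a made-up list of typical values. Only the 1154 figure is taken from the thread (the reported maximum); everything else is hypothetical.

```python
import statistics

# Hypothetical weekly costs in dollars; only the 1154 figure comes from the thread.
typical = [60, 70, 80, 90, 100]
with_extreme = typical + [1154]

mean_typ = statistics.mean(typical)
sd_typ = statistics.stdev(typical)        # sample SD (n - 1), like Excel's STDEV
mean_ext = statistics.mean(with_extreme)
sd_ext = statistics.stdev(with_extreme)

print(f"typical only:    mean = {mean_typ:.1f}, SD = {sd_typ:.1f}")
print(f"with 1154 added: mean = {mean_ext:.1f}, SD = {sd_ext:.1f}")
```

Because deviations are squared before they are averaged, the single extreme value pulls the SD up far more than it pulls the mean up. In this example the SD goes from well below the mean to well above it.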
mathman is offline  
March 13th, 2017, 04:47 PM   #7

Thanks for your reply.

Does that mean there are lots of outliers in the data?

I still don't see how to interpret the fact that one standard deviation below the mean is a negative number.

To give you a better idea of what I have to do, I had to create a summary table of the expense categories of last year's students. I couldn't attach a clear image of the table, so I had to create a link for it:

https://www.dropbox.com/s/numj773zua...table.png?dl=0


Then I have to compare my own expenses (I'm basically a frugal university student =)) to the student cohort.

The criterion is that I have to make 'an appropriate comparison'. For example, I could say my weekly income is below the mean. However, I don't know how to compare myself to the standard deviation. How could I utilise the standard deviation in my comparison?

Any suggestions would be greatly appreciated.

Last edited by pianist; March 13th, 2017 at 04:51 PM.
pianist is offline  
March 14th, 2017, 06:25 AM   #8

Wow. You are making a number of assumptions that are either false or misleading.

First, the standard deviation is NEVER negative. It is always non-negative.

Second, if the standard deviation exceeds the mean, it does not necessarily mean that any value in the population is negative. There may or may not be negative values in the population. Negative values are guaranteed only when the mean is zero or negative and the standard deviation is positive. (What may be confusing you is that in z-scores, the mean is zero.)

Third, if the mean is far from zero and the standard deviation exceeds the mean, it MAY indicate that the mean is not a helpful measure of central tendency and that either the median or mode provides a more meaningful summary.
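The zero-mean case in the second point can be spelled out in a few lines. The notation (x̄ for the mean, s for the standard deviation, n values x_1, ..., x_n) is chosen here for illustration.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 0
\;\Longrightarrow\; \sum_{i=1}^{n} x_i = 0,
\qquad
s > 0 \;\Longrightarrow\; x_j \neq 0 \text{ for some } j .

\text{If } x_j > 0:\quad \sum_{i \neq j} x_i = -x_j < 0
\;\Longrightarrow\; x_k < 0 \text{ for some } k \neq j .
```

If x_j is itself negative, there is nothing more to show. No such argument applies when the mean is positive: a value of mean minus SD below zero says nothing about whether any data value is negative.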
JeffM1 is offline  
March 14th, 2017, 06:45 AM   #9

Thanks for your reply.

For my task, I have to compare myself to the student cohort in the table below:

https://www.dropbox.com/s/numj773zua...table.png?dl=0

For example, in my comparison, I could say that my weekly income is 244 - 146 = 98 dollars below the mean. The table also gives the standard deviation of the student cohort. How could I use the SD of 322 dollars for weekly income in a comparison with my own income?

The standard deviation is provided in the table, so I don't think I should ignore it. However, I don't think I can compare a standard deviation to my individual data value of weekly income. Or is it possible to compare?
pianist is offline  
March 14th, 2017, 10:18 AM   #10

Basically, you cannot. Income cannot be negative. So what that standard deviation tells you is that some percentage of the students have incomes well in excess of the mean. In this case, the mean is not particularly representative. If the table gave you the median, it would be far more informative.

The implication here is that there are other tables that may break the information down in different ways. For example, there may be a different table that shows the data for the students not employed. You would probably find that the mean and the standard deviation were smaller, perhaps considerably smaller.

You need to understand three things about the standard deviation.

First, it is an average of the deviations, so small means small deviations on average and large means large deviations on average. In that sense, it just tells you whether the mean is informative or not. In this example, the mean of the entire population is not very informative.

Second, it is a very common way to average the deviations around the mean. So you need to understand what it is telling you about the mean.

Third, many data sets are approximately "normally distributed." If so, the standard deviation lets you know what percentages of the population deviate by how much from the mean. In other words, the standard deviation gives you a lot of information for normally distributed data.
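The third point can be sketched with Python's `statistics.NormalDist` (available from Python 3.8). The figures 244, 322 and 146 are the ones quoted earlier in the thread. The helper `share_within` and the idea of checking how much of a fitted normal curve would fall below zero are additions made for illustration, not something the table provides.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")

# Figures quoted in the thread: cohort mean 244, SD 322, poster's weekly income 146.
z = (146 - 244) / 322
print(f"z-score of an income of 146: {z:.2f}")

# If weekly income really were normal with this mean and SD, this share of
# students would have a negative income, which is impossible.
p_negative = NormalDist(mu=244, sigma=322).cdf(0)
print(f"share below zero under a normal model: {p_negative:.1%}")
```

The z-score says the income of 146 dollars lies about 0.3 standard deviations below the mean. The last figure shows why the normal percentages should not be trusted for this cohort: a normal curve with that mean and SD would place more than a fifth of students below zero.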
JeffM1 is offline  
Copyright © 2017 My Math Forum. All rights reserved.