**mathaholic** **Member**
- From: Earth
- Registered: 2012-11-29
- Posts: 3,010

http://www.mathsisfun.com/data/standard-deviation.html

If you want a source, see the link above.

So, the standard deviation is the square root of the variance.

Example:

Bobbym works at Pizza Hut. He wants to calculate the standard deviation of his weekly earnings.

Here is how much Bobbym earned this week:

Monday - $1,090

Tuesday - $918

Wednesday - $992

Thursday - $1,101

Friday - $884

Saturday - $982

Sunday - $1,000

So, we need to get the variance first.

To do that, we need to get the **mean**.

1 ) 1090 + 918 + 992 + 1101 + 884 + 982 + 1000

2 ) 2008 + 2093 + 1866 + 1000

3 ) 4101 + 2866

4 ) 6967

5 ) 6967 / 7 ≈ 995 (rounded down from about 995.29)

Now, let's subtract the mean from each number.

We will use the rounded mean from step 5, so the final answer will be an estimate.

1090 - 995 = 95

918 - 995 = -77

992 - 995 = -3

1101 - 995 = 106

884 - 995 = -111

982 - 995 = -13

1000 - 995 = 5

Now, let us square them:

95² = 9025

(-77)² = 5929

(-3)² = 9

106² = 11236

(-111)² = 12321

(-13)² = 169

5² = 25

Now, the average of the numbers above:

9025 + 5929 + 9 + 11236 + 12321 + 169 + 25

14954 + 11245 + 12490 + 194

26199 + 12684

38883 / 7 = 5554

5554 is the variance.

Now, for the standard deviation:

Square root of 5554 ≈ 74

So, 74 is the standard deviation.
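
For anyone who wants to check these steps without rounding along the way, here is a minimal Python sketch. The variable names are illustrative, and it assumes the seven daily figures are the whole data set, so the sum of squared deviations is divided by n = 7.

```python
import math

# Daily earnings from the example above, Monday to Sunday.
earnings = [1090, 918, 992, 1101, 884, 982, 1000]

n = len(earnings)
mean = sum(earnings) / n                    # kept unrounded (6967 / 7)

# Squared deviation of each day's earnings from the mean.
squared_devs = [(x - mean) ** 2 for x in earnings]

variance = sum(squared_devs) / n            # divide by n, as in the post
std_dev = math.sqrt(variance)

print(f"mean               = {mean:.2f}")
print(f"variance           = {variance:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```

Keeping the mean unrounded gives a standard deviation of roughly 74.4, so the hand calculation lands close to it.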

Here, you may discuss the topic or give more examples.

Mathaholic | 17th most active poster | 681st most recent person to join MIFF | Person | rrr's classmate

**bobbym** **Administrator**
- From: Bumpkinland
- Registered: 2009-04-12
- Posts: 96,571

I would still be working there if the salary was that good. They fired me when I told them their pizza tasted like paper with some old ketchup on it.

I am getting 80 or so for the standard deviation.

**In mathematics, you don't understand things. You just get used to them.**

**If it ain't broke, fix it until it is.**

**mathaholic** **Member**

Are you just pretending to be fired?

**bobbym** **Administrator**

Nope. I was usually fired before I could quit.

**mathaholic** **Member**

Oh. Why 80? Are you getting the actual standard deviation?

**bobbym** **Administrator**

I used a program to get it and rounded it.

**mathaholic** **Member**

Oh. Mathematica again?

**bobbym** **Administrator**

If you want I can do the calculation by hand.

**mathaholic** **Member**

Just go ahead. Use Mathematica to round off the standard deviation.

**bobbym** **Administrator**

Hi;

Without the senseless rounding I am getting

80.32582458486246

**Mrwhy** **Member**
- Registered: 2012-07-02
- Posts: 52

Do you realise that if one of those daily numbers was a misprint, its error is exaggerated by taking its square?

Mean absolute deviation is better.

Indeed in experimental science (all data!) mean of the cube root of the deviation gives a more reliable answer as it gives more weight to the numbers whose deviation is smallest (more carefully measured?)
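
A minimal Python sketch of the mean absolute deviation mentioned above, computed for the seven daily figures from the first post. The function name is illustrative, and the deviations are taken from the mean (taking them from the median is another common convention).

```python
# Daily earnings from the first post, Monday to Sunday.
earnings = [1090, 918, 992, 1101, 884, 982, 1000]


def mean_absolute_deviation(values):
    """Average of the absolute distances of the values from their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


mad = mean_absolute_deviation(earnings)
print(f"mean absolute deviation = {mad:.2f}")
```

Because the deviations are not squared, a single outlying value contributes in proportion to its distance from the mean rather than to the square of that distance.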

**bob bundy** **Moderator**
- Registered: 2010-06-20
- Posts: 7,052

hi julianthemath and bobbym

See below for my calculations. There are 7 values so once the sum of the squared deviations has been determined this should be divided by 7.

So I agree with julianthemath.

The value of 80 comes from sum/6, which is used to get an unbiased estimate of a population variance (and from it an sd) when all you have is a sample of size n.

julianthemath wrote:

He wants to calculate the standard deviation of his weekly earnings.

Maybe he should have said ".......of his earnings for one week", but, from what bobbym has told us, he didn't work there any longer than that because he got fired!

So, in this case, we know the whole population, => dividing by 7 is appropriate.

http://www.mathsisfun.com/data/standard … mulas.html

Bob

You cannot teach a man anything; you can only help him find it within himself..........Galileo Galilei

**bobbym** **Administrator**

Hi All;

Yes, we could discuss what sd is appropriate until they rehire me but what I was after was this:

9025+5929+9+11236+12321+169+25 = 38714

14954+11245+12490+194 = 38883

The little fellow seems to have found a way to pair off 7 numbers into 4 distinct pairs? Now to mention that in post #1276 seems picayune but in post #2 it was okay.

**bob bundy** **Moderator**

hi bobbym,

Arhh, I see what you mean (pun not intended). He makes lots of approximations and adds in 169 twice. He was lucky to get 74 at the end.
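
A quick Python check of the two totals quoted above. The list names are illustrative; the second list copies the four pair sums as they appear in the first post.

```python
# The seven squared deviations from the first post.
squares = [9025, 5929, 9, 11236, 12321, 169, 25]
direct_total = sum(squares)

# The four "pairs" as written in the first post; note that 169 is used twice.
pairs = [9025 + 5929, 9 + 11236, 12321 + 169, 169 + 25]
paired_total = sum(pairs)

print(direct_total)                 # total of the seven squares
print(paired_total)                 # total of the four pair sums
print(paired_total - direct_total)  # the surplus from counting 169 twice
```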

But that doesn't change my opinion on what to divide by.

At

http://www.mathsisfun.com/data/standard-deviation.html

we are encouraged to think there are two formulas for variance.

At

http://www.mathsisfun.com/data/standard … mulas.html

the reason for this is explained more fully.

See also

http://en.wikipedia.org/wiki/Variance

As I understand it, there is only one formula for calculating variance and that involves dividing by n. (There is a simplification that is useful if you are doing it without the help of an electronic device that stores all the values.)

I think the formula with (n-1) is only for the purpose of getting an unbiased estimate of a population variance if all you know is the sample variance. This arises because, with a sample (which by its nature has fewer values than the whole population) there is less variation in the values, resulting in a variance that is expected to be lower than the true population variance by a factor (n-1)/n.

To remove this bias, if you want to estimate the population variance, you take the sample variance, s^2, and multiply it by an unbiasing factor of n/(n-1). Since the last step in calculating s^2 would have been to divide by n, you might as well save yourself the effort and cancel out the ns altogether and just divide by (n-1).

Most statistical packages will have both options (divide by n and divide by n-1) and the user has to know which one to use when.
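
As one concrete case, Python's standard `statistics` module has both options: `pvariance` and `pstdev` divide by n, while `variance` and `stdev` divide by n-1. A short sketch using the figures from the first post (the variable names are illustrative):

```python
import statistics

# Daily earnings from the first post, Monday to Sunday.
earnings = [1090, 918, 992, 1101, 884, 982, 1000]
n = len(earnings)

pop_var = statistics.pvariance(earnings)   # divide by n
est_var = statistics.variance(earnings)    # divide by n - 1

pop_sd = statistics.pstdev(earnings)       # square root of pop_var
est_sd = statistics.stdev(earnings)        # square root of est_var

print(f"divide by n:     variance = {pop_var:.2f}, sd = {pop_sd:.2f}")
print(f"divide by n - 1: variance = {est_var:.2f}, sd = {est_sd:.2f}")

# The two variances differ exactly by the factor n / (n - 1).
print(f"ratio = {est_var / pop_var:.4f}, n / (n - 1) = {n / (n - 1):.4f}")
```

The first line reproduces the value of about 74 and the second the value of about 80 discussed in this thread.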

If julianthemath wanted to estimate the population variance he would have to start with a random sample. Choosing seven days in order is hardly random. I conclude he wanted the variance of just those values => divide by 7.

Bob

**bobbym** **Administrator**

Hi Bob;

Okay, we will divide by 7.

**mathaholic** **Member**

**bob bundy** **Moderator**

hi julianthemath

jtm wrote:

I've been waiting for a response from you.

Wolfram says this is an area that is often confused. I'm not confused. So I'm happy to have a go at explaining this if you want.

Just reply, "Yes, please".

Bob

**mathaholic** **Member**

Yup.

bob bundy wrote:

I've been waiting for a response from you.

Wolfram says this is an area that is often confused. I'm not confused. So I'm happy to have a go at explaining this if you want.

Just reply, "Yes, please".

Bob

jtm wrote:

bobbym wrote:

Okay, we will divide by 7.

**bob bundy** **Moderator**

OK, but it will have to be later. I'm part way through setting up an arch in my garden and only came in for a coffee break. I'll have a go this evening for you. (My time now is about 11am, BST.)

Bob

**bob bundy** **Moderator**

Variance formulas.

Sorry, this has turned out to be quite a long post. Hope you manage to stay awake to the end.

The definition formula is

variance = ∑(x − mean)² / n

That's the one you used.

It is possible to do some algebra on that to get an alternative version that gives the same result:

variance = ∑x² / n − (∑x / n)²

This formula is handy if you are having to calculate the variance on paper as you don't have to work out the mean before you start doing the sum of squares total, so it saves time. It is also used by calculators that have statistical functions as it only requires three memories to do the calculations: one for the running count, n; one for the running total of x; and one for the running total of x squared. When you have entered all the data (which the calculator doesn't have to remember) there are built in functions to calculate the mean and variance.
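
Here is a minimal Python sketch of the three running totals described above, assuming the alternative version is the usual "mean of the squares minus the square of the mean". The class and method names are hypothetical, not taken from any calculator or library.

```python
class RunningStats:
    """Keeps only three totals, as a simple statistical calculator might."""

    def __init__(self):
        self.n = 0        # running count
        self.sum_x = 0    # running total of x
        self.sum_x2 = 0   # running total of x squared

    def add(self, x):
        # The value itself is not stored, only the three totals are updated.
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def mean(self):
        return self.sum_x / self.n

    def variance(self):
        # Mean of the squares minus the square of the mean (divide-by-n form).
        return self.sum_x2 / self.n - self.mean() ** 2


rs = RunningStats()
for value in [1090, 918, 992, 1101, 884, 982, 1000]:
    rs.add(value)

print(f"n = {rs.n}, mean = {rs.mean():.2f}, variance = {rs.variance():.2f}")
```

One caution about this form: in floating-point arithmetic it subtracts two large, nearly equal numbers, so it can lose precision when the mean is large compared with the spread.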

Now if you have a set of data and just want to get the variance, use either of the above.

But what happens if you just take a sample of values, calculate the sample mean and variance as usual; but now want to estimate the 'population' statistics?

For example, you have sampled the weight of bags of sugar coming off a production line, and now want to say what the mean and variance are for the whole production. Can you use the sample statistics?

The answer is YES and NO.

What do I mean by that? Well, imagine you keep taking samples and computing the mean and variance for each sample. The sample means will be clustered around the population mean (this can be proved by something called 'expectation algebra' but it is a complicated proof so I'd rather not go into it.) So taking any particular sample mean won't give you the true population mean but it is said to be an unbiased estimator for it. By which I mean, there's no systematic bias in taking one value; it may be too high; it may be too low; but on average the errors cancel out. So you may use the sample mean as an estimate for the population mean.

However, the same is not true for the sample variance. If you repeat the sampling many times and compute the variance each time, you again get a set of results that are clustered about a fixed value; but that value is not the population variance. The value you get will tend to be too low. I like to think of it like this: you've only taken a few values from the population so there's less of a spread in the results than if you took the whole population. This leads to a bias if you take the sample variance as the population variance. But the bias is by a predictable amount!

Expectation algebra shows that the mean of the sample variances (let's call the sample variance s^2) is given by this formula:

mean of s^2 = ((n-1)/n) × population variance

As you can see this leads to a variance that is too low, but only by a tiny amount when n is very large. So if you take a big sample you could use the sample variance for the population variance and it probably wouldn't matter; but if n is small, it would because you'd be using a variance that is too small.
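
The size of the bias can be checked exactly on a small case. The sketch below is only an illustration under stated assumptions: the "population" is a made-up one (the six faces of a die), samples of size n = 3 are drawn with replacement, and every possible sample is listed rather than simulated. The function name is illustrative.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]   # assumed toy population: faces of a die
n = 3                             # assumed sample size


def variance_div_n(values):
    """Variance using the definition formula (divide by the number of values)."""
    k = len(values)
    m = Fraction(sum(values), k)
    return sum((x - m) ** 2 for x in values) / k


pop_var = variance_div_n(population)

# Every ordered sample of size n drawn with replacement (6**3 = 216 of them).
samples = list(product(population, repeat=n))
mean_sample_var = sum(variance_div_n(s) for s in samples) / len(samples)

print("population variance      :", pop_var)
print("mean of sample variances :", mean_sample_var)
print("ratio                    :", mean_sample_var / pop_var)  # (n - 1) / n
```

Using exact fractions avoids any rounding, so the ratio comes out as exactly (n-1)/n, which is 2/3 for n = 3.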

But you can easily unbias it. If you multiply the sample variance by n/(n-1) you unbias it by just the right factor. So you could calculate the sample variance, and then adjust it by this multiplier. But, as the last step in calculating a variance is to divide by n, you can save some steps and divide by (n-1) straight away:

estimate of population variance = ∑(x − mean)² / (n-1)

This formula is often called (incorrectly) the sample variance formula. It isn't. It is the formula for estimating the population variance from a set of sample data. Calculators and math packages will probably have it labelled as s^2 but, hopefully, you can see this is not quite correct.

So, back to your original post. You said

Bobbym works at Pizza Hut. He wants to calculate the standard deviation of his weekly earnings.

Here is how much Bobbym earned this week :

Now it is debatable that what you meant was " he wants to use this sample to calculate the standard deviation for all of his earnings" in which case you would divide the sum of the squared deviations from the mean by 6. But it isn't what you said. In any case, the estimator formula is only valid if you take a random sample across all his earnings. Taking values from just one week isn't random because sales may have been poor at that time of the year leading to poor earnings. Or maybe this was an early week in his employment when he was keen and hard working, before he became cynical and disgruntled and ended up getting the sack. There is lots of potential for introducing bias if you just take 7 days, one after the other.

My Conclusion: you were right to divide by 7.

But, recommendation: Don't round off early in the calculation; maintain all the figures until the end and then round off. You were lucky to get 74 after all that rounding and one calculation error.

Bob