2,2,2,2,2, 8,8,8,8,8 then the average is 5,
which really isn't representative of the data at all. If you had another data set of 4,4,4,4,4, 6,6,6,6,6 then the average is also 5, but the two sets have a very different feel to them, so the average on its own is a very crude measure.
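As a quick check of the two data sets above, here is a minimal Python sketch. The variable names are just made up for illustration.

```python
from statistics import mean

# The two example data sets from the text (names are illustrative only).
set_a = [2, 2, 2, 2, 2, 8, 8, 8, 8, 8]
set_b = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]

mean_a = mean(set_a)
mean_b = mean(set_b)

# Both averages come out the same, even though set_a is far more spread out.
print(mean_a, mean_b)
```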
I found that a good way to see how wide the spread of the data is, is to look at the interquartile range. You put the readings in order, then break the sequence into 4 equal-sized runs. See the notes below on range, median and interquartile range, given by my dad and solved for each example:
Range
The range of a set of numbers is the difference between the largest and the smallest number.
Example:
Calculate the range of the following numbers:
204, 210, 215, 220, 225, 234, 238, 240
The range
= the largest number – the smallest number
= 240 – 204
= 36
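The same range calculation can be sketched in a couple of lines of Python (the variable names are my own):

```python
# Numbers from the range example above.
numbers = [204, 210, 215, 220, 225, 234, 238, 240]

# Range = largest number minus smallest number.
data_range = max(numbers) - min(numbers)
print(data_range)  # 36
```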
Quartiles
- Q2 (the middle quartile) is the median.
- Q1 (the lower quartile) is the median of the numbers to the left of, or below Q2.
- Q3 (the upper quartile) is the median of the numbers to the right of, or above Q2.
Example:
12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32
Find the lower, middle and upper quartiles of the data above.
Since the data is already in ascending order, identify the median.
12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32
22 is the median, therefore, Q2 = 22
The median of the numbers to the left of Q2: 12, 14, 16, 18, 20
16 is the median, therefore, Q1 = 16
The median of the numbers to the right of Q2: 24, 26, 28, 30, 32
28 is the median, therefore, Q3 = 28
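The quartile steps above could be sketched in Python roughly like this. The function name `quartiles` is my own, and it assumes the "median of each half" method from these notes, with the middle value left out of both halves when the count is odd. Other tools may use slightly different quartile conventions and give slightly different answers on other data.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method from the notes.

    Assumes at least two values. With an odd count, the middle value
    is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # numbers to the left of Q2
    if n % 2:
        upper = ordered[half + 1:]  # odd count: skip the middle value
    else:
        upper = ordered[half:]      # even count: plain upper half
    return median(lower), median(ordered), median(upper)

data = [12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 16 22 28
```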
Interquartile Range
The interquartile range of a distribution is the difference between the upper and lower quartiles.
That is, interquartile range = Q3 – Q1
Therefore using the example above, the interquartile range is:
Interquartile range = Q3 – Q1
Since,
Q3 = 28
Q1 = 16
Interquartile range
= 28 – 16
= 12
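To tie this back to the two data sets at the top of the post, here is a small sketch that works out the interquartile range for each. The helper name `iqr` is my own, and it assumes the same median-of-halves quartile method as the notes.

```python
from statistics import median

def iqr(values):
    """Interquartile range (Q3 - Q1) via the median-of-halves method.

    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # lower half, gives Q1
    upper = ordered[-half:]   # upper half, gives Q3 (middle value skipped if count is odd)
    return median(upper) - median(lower)

example = [12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
set_a = [2, 2, 2, 2, 2, 8, 8, 8, 8, 8]
set_b = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]

print(iqr(example))  # 12
print(iqr(set_a))    # 6
print(iqr(set_b))    # 2
```

Both of the opening data sets have an average of 5, but the interquartile range picks up the difference in spread.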
(The following is taken from the web, broken down into easier to manage facts for myself)
The Standard Deviation measures how tight or how spread out the data is. This is very useful: a tight SD gives more confidence in what your results really mean, so look at doing SDs for your data sets. There is a simple standard formula for it: the Standard Deviation is the square root of the Variance. What is the Variance?
Variance
The Variance is defined as:
The average of the squared differences from the Mean.
To calculate the variance follow these steps:
- Work out the Mean (the simple average of the numbers)
- Then for each number: subtract the Mean and square the result (the squared difference).
- Then work out the average of those squared differences.
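The three steps above can be sketched in Python like this. The function name `variance` is my own. It divides by the number of values, as in the definition above (this is often called the population variance; a sample variance would divide by one less than the count instead).

```python
def variance(values):
    """Average of the squared differences from the mean (divides by N)."""
    # Step 1: work out the mean.
    mean = sum(values) / len(values)
    # Step 2: for each number, subtract the mean and square the result.
    squared_diffs = [(x - mean) ** 2 for x in values]
    # Step 3: average those squared differences.
    return sum(squared_diffs) / len(squared_diffs)

# The two data sets from the start of the post: same mean, different spread.
set_a = [2, 2, 2, 2, 2, 8, 8, 8, 8, 8]
set_b = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]

print(variance(set_a))  # 9.0
print(variance(set_b))  # 1.0
```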
Example
You and your friends have just measured the heights of your dogs (in millimeters):
The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and 300mm.
Find out the Mean, the Variance, and the Standard Deviation.
Your first step is to find the Mean:
Answer:
Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394
So the mean (average) height is 394 mm.
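Now, we calculate each dog's difference from the Mean:
600 – 394 = 206
470 – 394 = 76
170 – 394 = –224
430 – 394 = 36
300 – 394 = –94
To calculate the Variance, take each difference, square it, and then average the result:
Variance
= (206² + 76² + (–224)² + 36² + (–94)²) / 5
= (42,436 + 5,776 + 50,176 + 1,296 + 8,836) / 5
= 108,520 / 5
= 21,704
So, the Variance is 21,704.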
And the Standard Deviation is just the square root of Variance, so:
Standard Deviation: σ = √21,704 = 147.32... = 147 (to the nearest mm)
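The dog example can be double-checked with Python's built-in `statistics` module. This is only a sketch: `pvariance` and `pstdev` are the "population" versions, which divide by the number of values just like the steps above, and the variable names are my own.

```python
from statistics import mean, pstdev, pvariance

# Shoulder heights of the dogs, in millimetres.
heights_mm = [600, 470, 170, 430, 300]

avg = mean(heights_mm)            # the Mean
var = pvariance(heights_mm)       # average of the squared differences from the Mean
std_dev = pstdev(heights_mm)      # square root of the Variance

print(avg, var, round(std_dev))   # 394 21704 147
```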
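And the good thing about the Standard Deviation is that it is useful. Now we can show which heights are within one Standard Deviation (147mm) of the Mean, that is, between 394 – 147 = 247mm and 394 + 147 = 541mm. The 470mm, 430mm and 300mm dogs fall inside that band; the 600mm and 170mm dogs fall outside it.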
So, using the Standard Deviation we have a "standard" way of knowing what is normal, and what is extra large or extra small.
Rottweilers are tall dogs. And Dachshunds are a bit short.
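As a rough sketch of the "within one Standard Deviation" idea, the snippet below sorts the dog heights into those inside and outside the band from Mean – SD to Mean + SD. The variable names are my own, and it uses the population standard deviation (`pstdev`) to match the calculation above.

```python
from statistics import mean, pstdev

# Shoulder heights of the dogs, in millimetres.
heights_mm = [600, 470, 170, 430, 300]

centre = mean(heights_mm)
spread = pstdev(heights_mm)

# The band one Standard Deviation either side of the Mean.
low, high = centre - spread, centre + spread

within = [h for h in heights_mm if low <= h <= high]
outside = [h for h in heights_mm if not (low <= h <= high)]

print(within)   # [470, 430, 300]
print(outside)  # [600, 170]
```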