While knowing the mean value for a set of data may give us some information about the set itself, many different sets can have the same mean value. To determine how the sets differ, we need more information. Another way of examining single-variable data is to look at how the data is spread out, or dispersed, about the mean. We will discuss four ways of examining the dispersion of data. The smaller the values from these methods, the more consistent the data.
1. Range: The simplest of our methods for measuring dispersion is range. Range is the difference between the largest value and the smallest value in the data set. While simple to compute, the range is often unreliable as a measure of dispersion, since it is based on only two values in the set.
A range of 50 tells us very little about how the values are dispersed.
Are the values all clustered to one end with the low value (12) or the high value (62) being an outlier?
Or are the values more evenly dispersed among the range?
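The point about range can be sketched in a few lines of Python. The two data sets below are hypothetical, chosen only so that both have a low value of 12 and a high value of 62; the helper name data_range is likewise just an illustration.

```python
# Two hypothetical data sets, both with smallest value 12 and largest value 62.
clustered = [12, 55, 57, 58, 60, 61, 62]  # the low value 12 is an outlier
spread = [12, 20, 29, 37, 45, 54, 62]     # values spread across the range


def data_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


print(data_range(clustered))  # 50
print(data_range(spread))     # 50
```

Both sets report a range of 50, even though the values are distributed very differently, which is why range alone says little about dispersion.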
-----------------------------------------------------------------------------------------------------------
Before discussing our next methods, let's establish some vocabulary:
Population form: The population form is used when the data being analyzed includes the entire set of possible data. When using this form, divide by n, the number of values in the data set.
Example: All people living in the US.
Sample form: The sample form is used when the data is a random sample taken from the entire set of data. When using this form, divide by n - 1. (It can be shown that dividing by n - 1 makes s² for the sample a better estimate of σ² for the population from which the sample was taken.)
Example: Sam, Pete and Claire, who live in the US.
The population form should be used unless you know a random sample is being analyzed.
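As a rough illustration of the two forms, the sketch below assumes the statistic being computed is the variance (the mean of the squared differences from the mean), which is what s² and σ² conventionally denote. The function name variance_of and the data values are hypothetical.

```python
def variance_of(values, sample=False):
    """Variance of values: divide by n (population) or n - 1 (sample)."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared differences between each value and the mean.
    squared_diffs = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return squared_diffs / divisor


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values; the mean is 5

print(variance_of(data))               # population form: 32 / 8 = 4.0
print(variance_of(data, sample=True))  # sample form: 32 / 7, about 4.57
```

The sample form always gives the slightly larger result, and the gap shrinks as n grows.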
-----------------------------------------------------------------------------------------------------------
2. Mean Absolute Deviation (MAD):
The mean absolute deviation is the mean (average) of the absolute values of the differences between the individual values in the data set and the mean. The method measures the average distance between the values in the data set and the mean.
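A minimal sketch of this computation in Python follows. The data set is hypothetical and the function name mean_absolute_deviation is an illustrative choice.

```python
def mean_absolute_deviation(values):
    """Average of the absolute differences between each value and the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


data = [3, 6, 6, 7, 8, 11, 15, 16]  # hypothetical values; the mean is 9

# Absolute differences from the mean: 6, 3, 3, 2, 1, 2, 6, 7 (sum 30)
print(mean_absolute_deviation(data))  # 30 / 8 = 3.75
```

On average, the values in this set lie 3.75 units from the mean of 9.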