Variance, Standard Deviation and Spread

The standard deviation (SD) is the most commonly used measure of the spread of values in a distribution. SD is calculated as the square root of the variance (the average squared deviation from the mean).

Variance in a population is:

$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$

[σ² is the population variance, x is a value from the population, μ is the mean of all x, n is the number of x in the population, Σ is the summation]

Variance is usually estimated from a sample drawn from a population. The unbiased estimate of population variance calculated from a sample is:

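$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

[s² is the sample estimate of variance, x_i is the ith observation from a sample of the population, x-bar is the sample mean, n (sample size) - 1 is the degrees of freedom, Σ is the summation]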
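The difference between the two denominators can be seen in a short calculation. The following is a minimal Python sketch, not a definitive implementation; the data values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import statistics

# Hypothetical values, used once as a whole population and once as a sample.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

population_variance = sum_sq / n        # denominator n
sample_variance = sum_sq / (n - 1)      # denominator n - 1 (degrees of freedom)

population_sd = population_variance ** 0.5
sample_sd = sample_variance ** 0.5

# The standard library offers both forms directly.
assert abs(population_variance - statistics.pvariance(data)) < 1e-12
assert abs(sample_variance - statistics.variance(data)) < 1e-12

print(population_variance, sample_variance)
print(population_sd, sample_sd)
```

For the same values, the n - 1 form always gives the larger result, and the gap shrinks as n grows.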

The spread of a distribution is also referred to as dispersion and variability. All three terms mean the extent to which values in a distribution differ from one another.

SD is the best measure of spread of an approximately normal distribution. This is not the case when there are extreme values in a distribution or when the distribution is skewed; in these situations the interquartile range or semi-interquartile range is the preferred measure of spread. The interquartile range is the difference between the 25th and 75th centiles. The semi-interquartile range is half of that difference. For any symmetrical (not skewed) distribution, half of its values lie within one semi-interquartile range either side of the median, i.e. within the interquartile range. When distributions are approximately normal, SD is a better measure of spread because it is less susceptible to sampling fluctuation than the (semi-)interquartile range.
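The sensitivity of SD to an extreme value, compared with the interquartile range, can be illustrated as follows. This is a sketch under stated assumptions: the data are hypothetical, `spread_summary` is a helper name introduced here, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`; other centile conventions give slightly different values.

```python
import statistics

def spread_summary(values):
    """Return SD, interquartile range and semi-interquartile range."""
    # n=4 yields the three quartile cut points (25th, 50th, 75th centiles).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {"sd": statistics.stdev(values), "iqr": iqr, "semi_iqr": iqr / 2}

# Hypothetical symmetrical data, and the same data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = data[:-1] + [90]

clean = spread_summary(data)
skewed = spread_summary(with_outlier)

print(clean)
print(skewed)
```

In this example the single extreme value inflates the SD considerably, while the interquartile range is the same for both data sets.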

If a variable y is a linear transformation of x (y = a + bx), then the variance of y is b² times the variance of x and the standard deviation of y is |b| times the standard deviation of x.
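The effect of a linear transformation on variance and SD can be checked numerically. The sketch below uses hypothetical values for x, a and b; a negative b is chosen deliberately to show that the SD scales with the magnitude of b and that the constant a has no effect on spread.

```python
import statistics

# Hypothetical data and coefficients for y = a + b*x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
a, b = 10.0, -3.0
y = [a + b * xi for xi in x]

var_x = statistics.variance(x)
var_y = statistics.variance(y)
sd_x = statistics.stdev(x)
sd_y = statistics.stdev(y)

# Variance scales by b squared; SD scales by the absolute value of b.
print(var_y, b ** 2 * var_x)
print(sd_y, abs(b) * sd_x)
```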

The standard error of the mean (SEM) is the standard deviation of the sampling distribution of the mean, i.e. the expected standard deviation of the means of repeated samples of the same size. It is estimated from a single sample as:

$$SEM = \frac{s}{\sqrt{n}}$$

[s is the sample standard deviation, n is the sample size]
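As a brief illustration, the standard error of the mean can be estimated from one sample by dividing the sample standard deviation (n - 1 denominator) by the square root of the sample size. The values below are hypothetical and serve only as a sketch.

```python
import math
import statistics

# Hypothetical sample of eight observations.
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]

n = len(sample)
s = statistics.stdev(sample)   # sample SD, n - 1 denominator
sem = s / math.sqrt(n)         # estimated standard error of the mean

print(s, sem)
```

Because n appears under a square root, quadrupling the sample size roughly halves the standard error for a given SD.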

See descriptive statistics.