




Variance, standard deviation and spread


The standard deviation (SD) is the most commonly used measure of the spread of values in a distribution. SD is calculated as the square root of the variance (the average squared deviation from the mean).


Variance in a population is:


$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$


[x is a value from the population, μ is the mean of all x, n is the number of x in the population, Σ is the summation]


Variance is usually estimated from a sample drawn from a population. The unbiased estimate of population variance calculated from a sample is:


$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$


[x is an observation from the sample, x-bar is the sample mean, n is the sample size, n - 1 is the degrees of freedom, Σ is the summation]
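

The two variance definitions above can be sketched in Python as follows. This is a minimal illustration, not StatsDirect's implementation: the function names and the data values are made up for the example, and the standard library's `statistics.pvariance` and `statistics.variance` compute the same two quantities.

```python
import math
import statistics


def population_variance(values):
    """Average squared deviation from the mean (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Unbiased estimate of population variance (divide by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Made-up values, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)   # divides by n
samp_var = sample_variance(data)      # divides by n - 1
pop_sd = math.sqrt(pop_var)           # SD is the square root of the variance
samp_sd = math.sqrt(samp_var)

print(pop_var, samp_var, pop_sd, samp_sd)
```

The sample estimate is always the larger of the two for the same data, because the sum of squared deviations is divided by n - 1 rather than n.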


The spread of a distribution is also referred to as dispersion and variability. All three terms mean the extent to which values in a distribution differ from one another.


SD is the best measure of spread of an approximately normal distribution. This is not the case when there are extreme values in a distribution or when the distribution is skewed; in these situations the interquartile range or semi-interquartile range is the preferred measure of spread. Interquartile range is the difference between the 25th and 75th centiles. Semi-interquartile range is half of the difference between the 25th and 75th centiles. For any symmetrical (not skewed) distribution, half of its values will lie within one semi-interquartile range either side of the median, i.e. in the interquartile range. When distributions are approximately normal, SD is a better measure of spread because it is less susceptible to sampling fluctuation than (semi-)interquartile range.
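

As a rough illustration of why interquartile range is preferred when extreme values are present, the sketch below compares SD and interquartile range on made-up data with and without one outlier. The function names are hypothetical, and the `"inclusive"` quantile method is an assumption: software packages use several different centile definitions, so results may differ slightly from those of other tools.

```python
import statistics


def interquartile_range(values):
    """Difference between the 75th and 25th centiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def semi_interquartile_range(values):
    """Half of the interquartile range."""
    return interquartile_range(values) / 2


# Made-up values; the second list replaces the largest value with an outlier.
clean = [2, 4, 4, 4, 5, 5, 7, 9]
with_outlier = clean[:-1] + [90]

for label, values in (("clean", clean), ("with outlier", with_outlier)):
    print(
        label,
        "SD:", statistics.stdev(values),
        "IQR:", interquartile_range(values),
        "semi-IQR:", semi_interquartile_range(values),
    )
```

In this example the single extreme value inflates the SD many times over, while the interquartile range is unchanged because the quartiles do not depend on the most extreme observations.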


If a variable y is a linear transformation of x (y = a + bx), then the variance of y is b² times the variance of x and the standard deviation of y is |b| times the standard deviation of x.
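

The effect of a linear transformation on variance and standard deviation can be checked numerically. The sketch below uses made-up data and hypothetical coefficients a and b; a negative b is chosen to show that the standard deviation scales by the absolute value of b, and that the constant a has no effect on spread.

```python
import math
import statistics

# Hypothetical coefficients for y = a + b*x, chosen only for illustration.
a, b = 10.0, -3.0

x = [2, 4, 4, 4, 5, 5, 7, 9]       # made-up values
y = [a + b * xi for xi in x]       # linear transformation of x

var_x = statistics.variance(x)
var_y = statistics.variance(y)
sd_x = statistics.stdev(x)
sd_y = statistics.stdev(y)

# Variance scales by b squared; SD scales by the absolute value of b.
print(math.isclose(var_y, b ** 2 * var_x))
print(math.isclose(sd_y, abs(b) * sd_x))
```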


The standard error of the mean is the expected value of the standard deviation of means of several samples. It is estimated from a single sample as:


$$\mathrm{SEM} = \frac{s}{\sqrt{n}}$$


[s is the standard deviation of the sample, n is the sample size]
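

The single-sample estimate of the standard error of the mean (the sample standard deviation divided by the square root of the sample size) can be sketched as below. The function name and the data are made up for illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Sample standard deviation divided by the square root of n."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # statistics.stdev uses the n - 1 (sample) form of the variance.
    return statistics.stdev(sample) / math.sqrt(n)


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
sem = standard_error_of_mean(sample)
print(sem)
```

Because n appears under a square root, quadrupling the sample size only halves the standard error for a given standard deviation.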


See descriptive statistics.