Menu location: Analysis_Parametric_Normality.
This function enables you to explore the distribution of a sample and test for certain patterns of non-normality.
A normal probability plot is provided, after some basic descriptive statistics and five hypothesis tests. The common null hypothesis is that the sample has been drawn at random from a normal distribution. The test of skewness focuses on the symmetry of the distribution; the test of kurtosis examines its peakedness; and the omnibus chi-square (or K-square) test combines these characteristics (D'Agostino et al., 1990). These tests are calculated using moments about the mean, so they are quite sensitive to outliers. You might wish to clean your dataset carefully before exploring its distribution in this way. The two other tests are semi-parametric analyses of variance: Shapiro-Wilk W (Conover, 1999; Shapiro and Wilk, 1965; Royston, 1982a, 1982b, 1991a, 1995) and Shapiro-Francia W' (Shapiro and Francia, 1972; Royston, 1983).
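To see why moment-based statistics react so strongly to outliers, the following minimal sketch computes the usual moment coefficients of skewness (m3 / m2^1.5) and kurtosis (m4 / m2^2) for a small symmetric sample, with and without one extreme value. The function names and the sample values are illustrative assumptions; this is not StatsDirect's code, and it does not compute the P values of the tests.

```python
def central_moment(xs, k):
    """k-th moment about the mean, using divisor n (not n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def moment_skewness(xs):
    """Skewness coefficient m3 / m2**1.5; zero for a symmetric sample."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def moment_kurtosis(xs):
    """Kurtosis coefficient m4 / m2**2; close to 3 for a large normal sample."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


# Made-up values, chosen only to illustrate the effect of one outlier.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
with_outlier = symmetric + [10.0]

print("symmetric:   ", moment_skewness(symmetric), moment_kurtosis(symmetric))
print("with outlier:", moment_skewness(with_outlier), moment_kurtosis(with_outlier))
```

Because deviations from the mean are raised to the third and fourth powers, the single extreme value dominates both coefficients.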
StatsDirect requires a random sample of between 3 and 2,000 observations for the Shapiro-Wilk test, or between 5 and 5,000 observations for the Shapiro-Francia test. The omnibus chi-square test can be used with larger samples but requires a minimum of 8 observations.
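The sample size limits above can be written as a simple lookup. This is a sketch only: the helper name applicable_tests is hypothetical, and treating the omnibus test as having no upper limit is an assumption based on the text stating only a minimum.

```python
# Limits as quoted in the text: (minimum n, maximum n).
# None means that no upper limit is stated (an assumption of this sketch).
LIMITS = {
    "Shapiro-Wilk": (3, 2000),
    "Shapiro-Francia": (5, 5000),
    "omnibus chi-square": (8, None),
}


def applicable_tests(n):
    """Return the names of the tests whose stated limits include sample size n."""
    names = []
    for name, (low, high) in LIMITS.items():
        if n >= low and (high is None or n <= high):
            names.append(name)
    return names


for n in (4, 30, 3000):
    print(n, applicable_tests(n))
```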
In addition to P values, the semi-parametric tests provide V or V', which have a value approximately equal to 1 for samples from normal distributions. Large values of V or V' indicate non-normality. The 95% critical values for V lie between 1.2 and 2.4 depending on sample size, and those for V' lie between 2.0 and 2.8 (Royston, 1991a).
The omnibus chi-square test is updated from the original method to adjust for the fact that the test statistic is not quite distributed as chi-square with 2 degrees of freedom (Royston, 1991b).
Significant P values for any of the tests indicate non-normality. No single test is sensitive to all forms of non-normality, which is why several are provided here. In addition to the tests, you should always examine the normal probability plot. Deviations from the straight line on the plot indicate non-normality. Also look at histograms and spread/dot-plots to examine the distribution of your data. Investigate outlying observations and consider excluding them. You might also wish to transform your data, e.g. into natural logarithms, and then examine the normality of the transformed data.
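The idea behind the normal probability plot and the log transformation can be sketched as follows. The ordered observations are paired with approximate expected normal scores, and the squared correlation between the two measures how closely the points follow a straight line. The plotting positions (i - 0.375) / (n + 0.25) are Blom's approximation, chosen here as an assumption; StatsDirect's own plot may use different scores. The function names and the data are made up for illustration.

```python
import math
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected normal order statistics (Blom plotting positions)."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def normal_plot_points(xs):
    """Pairs (normal score, ordered observation) for a normal probability plot."""
    return list(zip(normal_scores(len(xs)), sorted(xs)))


def plot_linearity(xs):
    """Squared correlation between ordered data and normal scores (1 = straight line)."""
    pts = normal_plot_points(xs)
    n = len(pts)
    mean_z = sum(z for z, _ in pts) / n
    mean_x = sum(x for _, x in pts) / n
    szx = sum((z - mean_z) * (x - mean_x) for z, x in pts)
    szz = sum((z - mean_z) ** 2 for z, _ in pts)
    sxx = sum((x - mean_x) ** 2 for _, x in pts)
    return szx * szx / (szz * sxx)


# Made-up right-skewed values, not taken from the test workbook.
skewed = [1.0, 1.2, 1.5, 2.0, 2.7, 3.9, 6.0, 9.5, 16.0, 30.0]
logged = [math.log(x) for x in skewed]

print("raw data:       ", plot_linearity(skewed))
print("log-transformed:", plot_linearity(logged))
```

For these values the log-transformed data lie much closer to a straight line than the raw data, which is the pattern you would look for after transforming a right-skewed sample.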
Example
Test workbook (Parametric worksheet: Penicillin).
Consider the following 30 penicillin yields.
penicillin
0.0987 
0.0000 
0.0533 
0.0026 
0.0293 
0.0036 
0.0246 
0.0042 
0.0200 
0.0114 
0.0194 
0.0139 
0.0191 
0.0222 
0.0180 
0.0333 
0.0172 
0.0348 
0.0132 
0.0363 
0.0102 
0.0363 
0.0084 
0.0402 
0.0077 
0.0583 
0.0058 
0.1184 
0.0016 
0.1420 
To test these data for non-normality using StatsDirect you must first prepare them in a workbook column. Alternatively, open the test workbook using the Open function of the File menu. Then select the normality test from the parametric methods section of the Analysis menu. Select the column marked "Penicillin" when prompted.
For this example:
Normality

Sample name: Penicillin
Sample size: 30
Mean: 0.007033
Standard deviation: 0.0454

Skewness: 0.942148, P = 0.025
Kurtosis: 5.373413, P = 0.0156
Royston chi-sq: 9.081201, P = 0.0107
Shapiro-Wilk W: 0.892516, V = 3.416357, P = 0.0055
Shapiro-Francia W': 0.873776, V' = 4.430639, P = 0.0035

Sample unlikely to be from a normal distribution
Here all five tests are significant at the 5% level (P < 0.05), so the null hypothesis that these data are from a normal distribution is rejected. The departures from the straight line in the normal plot show the pattern of non-normality in more detail. In fact, these data were from a 2 by 5 factor grouping experiment.
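The reading of these results can be checked with a few lines of arithmetic. The sketch below uses only the P values, V and V' reported above, and compares V and V' with the upper ends of the 95% critical ranges quoted earlier (2.4 for V, 2.8 for V'). Using the upper end of each range regardless of sample size is a conservative simplification made for this illustration; the variable names are assumptions.

```python
ALPHA = 0.05

# P values as reported in the example output.
p_values = {
    "Skewness": 0.025,
    "Kurtosis": 0.0156,
    "Royston chi-square": 0.0107,
    "Shapiro-Wilk W": 0.0055,
    "Shapiro-Francia W'": 0.0035,
}
significant = {name: p < ALPHA for name, p in p_values.items()}

# Upper ends of the quoted 95% critical ranges (conservative for any sample size).
V_UPPER = 2.4
V_PRIME_UPPER = 2.8
v, v_prime = 3.416357, 4.430639
v_exceeds = v > V_UPPER
v_prime_exceeds = v_prime > V_PRIME_UPPER

for name, flag in significant.items():
    print(f"{name}: P = {p_values[name]} significant at 5%: {flag}")
print("V above critical range:", v_exceeds)
print("V' above critical range:", v_prime_exceeds)
```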