Fully Nested Random Analysis of Variance

Menu location: Analysis_Analysis of Variance_Fully Nested.

This function calculates ANOVA for a fully nested random (hierarchical or split-plot) study design. One level of sub-grouping is supported and subgroups may be of unequal sizes. Corrected treatment and subgroup means are given.

You should seek expert statistical guidance before using this method.

If each treatment/exposure group in a study contains treatment/exposure sub-groups then data for nested analysis of variance may be set out as follows:

Hospital 1			Hospital 2
ward 1	ward 2	ward 3	ward 1	ward 2	ward 3
x	x	x	x	x	x	<--- patients
x	x	x	x	x	x
x	x	x	x	x	x
x	x		x	x	x
x		x		x
x				x

The effects (treatments and their subgroups) in this type of study are often random but the same basic calculations are used for models with fixed effects or for mixed models. The variance ratios given are based on a fixed effects model but you can use the mean square results to calculate any other variance ratio of interest. A good account is given by Snedecor and Cochran (1989).

For a fixed effects model use the "F (VR between groups)" statistic.
For a random effects model use the "F (using group/subgroup msqr)" statistic.
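
As a rough illustration of the difference between the two statistics, the sketch below forms both ratios from the three mean squares. The function name `group_variance_ratios` is hypothetical (it is not part of StatsDirect), and the mean squares are those from the worked example further down this page.

```python
def group_variance_ratios(ms_groups, ms_subgroups, ms_residual):
    """Return the two candidate F ratios for the group effect.

    Hypothetical helper for illustration only.
    """
    # Fixed effects: groups are tested against the residual mean square.
    f_fixed = ms_groups / ms_residual
    # Random effects: groups are tested against the subgroup mean square.
    f_random = ms_groups / ms_subgroups
    return f_fixed, f_random


# Mean squares taken from the turnip greens example on this page.
f_fixed, f_random = group_variance_ratios(2.520115, 0.328775, 0.006654)
print(f"F (VR between groups)         ~ {f_fixed:.1f}")
print(f"F (using group/subgroup msqr) ~ {f_random:.2f}")
```

The only difference is the denominator: under random effects the group mean square is compared with the subgroup mean square, so the test has far fewer denominator degrees of freedom.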

Technical Validation

ANOVA for a three factor fully random nested (split-plot) model is calculated as follows (Snedecor and Cochran, 1989):
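
The sums of squares can be written in the usual computational form for a nested design. The expressions below are a sketch that assumes the standard textbook decomposition and uses the symbols defined in the paragraph that follows; they are not copied from the StatsDirect source.

```latex
SS_{total} = \sum_{i=1}^{g} \sum_{j=1}^{s_i} \sum_{k=1}^{n_{ij}} X_{ijk}^{2}
  - \frac{\left( \sum_{i=1}^{g} \sum_{j=1}^{s_i} \sum_{k=1}^{n_{ij}} X_{ijk} \right)^{2}}{N}

SS_{groups} = \sum_{i=1}^{g}
  \frac{\left( \sum_{j=1}^{s_i} \sum_{k=1}^{n_{ij}} X_{ijk} \right)^{2}}{\sum_{j=1}^{s_i} n_{ij}}
  - \frac{\left( \sum_{i=1}^{g} \sum_{j=1}^{s_i} \sum_{k=1}^{n_{ij}} X_{ijk} \right)^{2}}{N}

SS_{subgroups\ (group\ i)} = \sum_{j=1}^{s_i}
  \frac{\left( \sum_{k=1}^{n_{ij}} X_{ijk} \right)^{2}}{n_{ij}}
  - \frac{\left( \sum_{j=1}^{s_i} \sum_{k=1}^{n_{ij}} X_{ijk} \right)^{2}}{\sum_{j=1}^{s_i} n_{ij}}

SS_{residual} = SS_{total} - SS_{groups} - \sum_{i=1}^{g} SS_{subgroups\ (group\ i)}
```

Under this decomposition the degrees of freedom are g - 1 for groups, (sum of s_i) - g for subgroups within groups, and N - (sum of s_i) for the residual.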

where X_ijk is the kth observation from the jth subgroup of the ith group, g is the number of groups, SS_total is the total sum of squares, SS_groups is the sum of squares due to the group factor, SS_{subgroups (group i)} is the sum of squares due to the subgroup factor of group i, s_i is the number of subgroups in the ith group, n_ij is the number of observations in the jth subgroup of the ith group and N is the total number of observations.

Hocking (1985) describes potential instabilities of this calculation; you should therefore seek expert statistical guidance before using it.

Example

From Snedecor and Cochran (1989).

Test workbook (ANOVA worksheet: P1L1, P1L2, P1L3, P2L1, P2L2, P2L3, P3L1, P3L2, P3L3, P4L1, P4L2, P4L3).

The following data represent calcium measurements from the leaves of turnip greens. The groups represent four plants and the subgroups represent three leaves taken from each plant. Two samples were taken from each leaf for calcium measurement.

	Plant 1
leaf 1	leaf 2	leaf 3
x	x	x	<--- sample
x	x	x

To analyse these data in StatsDirect you must first enter them in the workbook using a separate column for each subgroup:

P = plant

L = leaf

P1L1	P1L2	P1L3	P2L1	P2L2	P2L3	P3L1	P3L2	P3L3	P4L1	P4L2	P4L3
3.28	3.52	2.88	2.46	1.87	2.19	2.77	3.74	2.55	3.78	4.07	3.31
3.09	3.48	2.80	2.44	1.92	2.19	2.66	3.44	2.55	3.87	4.12	3.31

Alternatively, open the test workbook using the file open function of the file menu. Then select Fully Nested from the analysis of variance section of the analysis menu. Enter the number of groups as four and then select the four sets of subgroups marked "P1L1" (i.e. Plant 1 Leaf 1), "P1L2" and so on. Each subgroup should be selected by a single selection action.

For this example:

Fully nested/hierarchical random analysis of variance

Variables: (P1L1, P1L2, P1L3) (P2L1, P2L2, P2L3) (P3L1, P3L2, P3L3) (P4L1, P4L2, P4L3)

Source of Variation	Sum Squares	DF	Mean Square
Between Groups	7.560346	3	2.520115
Between Subgroups within Groups	2.6302	8	0.328775
Residual	0.07985	12	0.006654
Total	10.270396	23

F (VR between groups) = 378.727406 P < .0001

F (using group/subgroup msqr) = 7.665167 P = .0097

F (VR between subgroups within groups) = 49.408892 P < .0001
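
The table and variance ratios above can be checked independently with a few lines of plain Python. This is a minimal sketch, assuming the standard nested sums-of-squares decomposition; the function `nested_anova` and the nested-list data layout are hypothetical choices made for illustration and are not StatsDirect's own code. P values are omitted because they need an F distribution routine that the standard library does not provide.

```python
# Calcium data from the worksheet above: data[plant][leaf] is a list of samples.
data = [
    [[3.28, 3.09], [3.52, 3.48], [2.88, 2.80]],  # plant 1
    [[2.46, 2.44], [1.87, 1.92], [2.19, 2.19]],  # plant 2
    [[2.77, 2.66], [3.74, 3.44], [2.55, 2.55]],  # plant 3
    [[3.78, 3.87], [4.07, 4.12], [3.31, 3.31]],  # plant 4
]


def nested_anova(groups):
    """Sums of squares, DF and mean squares for a fully nested design.

    Subgroups may differ in size but must not be empty.
    """
    all_obs = [x for grp in groups for sub in grp for x in sub]
    n_total = len(all_obs)
    n_subgroups = sum(len(grp) for grp in groups)
    correction = sum(all_obs) ** 2 / n_total

    ss_total = sum(x * x for x in all_obs) - correction
    ss_groups = -correction
    ss_sub = 0.0
    for grp in groups:
        grp_sum = sum(x for sub in grp for x in sub)
        grp_n = sum(len(sub) for sub in grp)
        grp_term = grp_sum ** 2 / grp_n
        ss_groups += grp_term
        # Subgroup sum of squares within this group.
        ss_sub += sum(sum(sub) ** 2 / len(sub) for sub in grp) - grp_term
    ss_resid = ss_total - ss_groups - ss_sub

    df_groups = len(groups) - 1
    df_sub = n_subgroups - len(groups)
    df_resid = n_total - n_subgroups
    return {
        "ss": (ss_groups, ss_sub, ss_resid, ss_total),
        "df": (df_groups, df_sub, df_resid, n_total - 1),
        "ms": (ss_groups / df_groups, ss_sub / df_sub, ss_resid / df_resid),
    }


result = nested_anova(data)
ms_groups, ms_sub, ms_resid = result["ms"]
print("Sums of squares:", [round(v, 6) for v in result["ss"]])
print("DF:", result["df"])
print("F (VR between groups):", round(ms_groups / ms_resid, 3))
print("F (using group/subgroup msqr):", round(ms_groups / ms_sub, 3))
print("F (VR between subgroups within groups):", round(ms_sub / ms_resid, 3))
```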

The "F (VR between groups)" statistic assumes a fixed effects model. For this example, which assumes random effects, use the "F (using group/subgroup msqr)" statistic; this treats the residual sum of squares as the samples sum of squares. For the null hypothesis of zero group variance, consider 2.5201/0.3288 (= 7.66 on an F(3,8) distribution) instead of 2.5201/0.0067 (= 379 on an F(3,12) distribution) because the point of randomization has been re-defined. The "F (VR between subgroups within groups)" statistic clearly rejects the null hypothesis of zero subgroup-in-group variance.

The analysis shows that the plants contribute most to the overall variability and the leaves also have a statistically significant contribution. The samples from each leaf, as reflected by the residual sum of squares, contribute relatively little to the overall variability. Sampling further plants or leaves is, therefore, more important than taking multiple samples per leaf.
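
One way to put numbers on these contributions is to estimate variance components from the mean squares. The sketch below uses the method-of-moments estimates for a balanced design (two samples per leaf, three leaves per plant). This is an assumption made for illustration, not a quantity reported in the StatsDirect output above, and the variable names are hypothetical.

```python
# Mean squares from the ANOVA table above.
ms_groups, ms_subgroups, ms_residual = 2.520115, 0.328775, 0.006654

samples_per_leaf = 2   # observations in each subgroup
leaves_per_plant = 3   # subgroups in each group

# Method-of-moments estimates, valid for a balanced random effects design.
var_sample = ms_residual
var_leaf = (ms_subgroups - ms_residual) / samples_per_leaf
var_plant = (ms_groups - ms_subgroups) / (samples_per_leaf * leaves_per_plant)

total = var_sample + var_leaf + var_plant
for name, value in [("plants", var_plant), ("leaves", var_leaf), ("samples", var_sample)]:
    print(f"{name:8s} variance = {value:.4f} ({100 * value / total:.1f}% of total)")
```

With these estimates the plant component is the largest, the leaf component is smaller but substantial, and the sample component is close to negligible, which agrees with the interpretation given above.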
