# Fully Nested Random Analysis of Variance

Menu location: Analysis_Analysis of Variance_Fully Nested.

This function calculates ANOVA for a fully nested random (hierarchical or split-plot) study design. One level of sub-grouping is supported and subgroups may be of unequal sizes. Corrected treatment and subgroup means are given.

You should seek expert statistical guidance before using this method.

If each treatment/exposure group in a study contains treatment/exposure sub-groups then data for nested analysis of variance may be set out as follows:

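| Hospital 1, ward 1 | Hospital 1, ward 2 | Hospital 1, ward 3 | Hospital 2, ward 1 | Hospital 2, ward 2 | Hospital 2, ward 3 | |
|---|---|---|---|---|---|---|
| x | x | x | x | x | x | <--- patients |
| x | x | x | x | x | x | |
| x | x | x | x | x | x | |
| x | x | x | x | x | x | |
| x | x | x | x | | | |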

The effects (treatments and their subgroups) in this type of study are often random but the same basic calculations are used for models with fixed effects or for mixed models. The variance ratios given are based on a fixed effects model but you can use the mean square results to calculate any other variance ratio of interest. A good account is given by Snedecor and Cochran (1989).

• For a fixed effects model use the "F (VR between groups)" statistic.
• For a random effects model use the "F (using group/subgroup msqr)" statistic.

Technical Validation

ANOVA for a three factor fully random nested (split-plot) model is calculated as follows (Snedecor and Cochran, 1989):

where Xijk is the kth observation from the jth subgroup of the ith group, g is the number of groups, SStotal is the total sum of squares, SSgroups is the sum of squares due to the group factor, SSsubgroups (group i) is the sum of squares due to the subgroup factor of group i, si is the number of subgroups in the ith group, nij is the number of observations in the jth subgroup of the ith group and N is the total number of observations.
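The quantities named above can be written in the usual computational form for a nested design. The block below is a reconstruction from those definitions, not a verbatim copy of the published formulas; the residual sum of squares and the degrees of freedom lines are standard results added for completeness.

```latex
\begin{aligned}
SS_{\text{total}} &= \sum_{i=1}^{g}\sum_{j=1}^{s_i}\sum_{k=1}^{n_{ij}} X_{ijk}^{2}
  - \frac{\left(\sum_{i=1}^{g}\sum_{j=1}^{s_i}\sum_{k=1}^{n_{ij}} X_{ijk}\right)^{2}}{N} \\[4pt]
SS_{\text{groups}} &= \sum_{i=1}^{g}
  \frac{\left(\sum_{j=1}^{s_i}\sum_{k=1}^{n_{ij}} X_{ijk}\right)^{2}}{\sum_{j=1}^{s_i} n_{ij}}
  - \frac{\left(\sum_{i=1}^{g}\sum_{j=1}^{s_i}\sum_{k=1}^{n_{ij}} X_{ijk}\right)^{2}}{N} \\[4pt]
SS_{\text{subgroups (group } i)} &= \sum_{j=1}^{s_i}
  \frac{\left(\sum_{k=1}^{n_{ij}} X_{ijk}\right)^{2}}{n_{ij}}
  - \frac{\left(\sum_{j=1}^{s_i}\sum_{k=1}^{n_{ij}} X_{ijk}\right)^{2}}{\sum_{j=1}^{s_i} n_{ij}} \\[4pt]
SS_{\text{residual}} &= SS_{\text{total}} - SS_{\text{groups}}
  - \sum_{i=1}^{g} SS_{\text{subgroups (group } i)} \\[4pt]
df_{\text{groups}} &= g - 1, \qquad
df_{\text{subgroups}} = \sum_{i=1}^{g} s_i - g, \qquad
df_{\text{residual}} = N - \sum_{i=1}^{g} s_i
\end{aligned}
```

Each mean square is the corresponding sum of squares divided by its degrees of freedom.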

Hocking (1985) describes potential instabilities of this calculation; you should therefore seek expert statistical guidance before using it.

Example

Test workbook (ANOVA worksheet: P1L1, P1L2, P1L3, P2L1, P2L2, P2L3, P3L1, P3L2, P3L3, P4L1, P4L2, P4L3).

The following data represent calcium measurements from the leaves of turnip greens. The groups represent 4 plants and the subgroups represent 3 leaves taken from each plant. 2 samples were taken from each leaf for calcium measurement.

| Plant 1, leaf 1 | Plant 1, leaf 2 | Plant 1, leaf 3 | |
|---|---|---|---|
| x | x | x | <--- sample |
| x | x | x | |

To analyse these data in StatsDirect you must first enter them in the workbook using a separate column for each subgroup:

P = plant

L = leaf

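| P1L1 | P1L2 | P1L3 | P2L1 | P2L2 | P2L3 | P3L1 | P3L2 | P3L3 | P4L1 | P4L2 | P4L3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.28 | 3.52 | 2.88 | 2.46 | 1.87 | 2.19 | 2.77 | 3.74 | 2.55 | 3.78 | 4.07 | 3.31 |
| 3.09 | 3.48 | 2.80 | 2.44 | 1.92 | 2.19 | 2.66 | 3.44 | 2.55 | 3.87 | 4.12 | 3.31 |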

Alternatively, open the test workbook using the file open function of the file menu. Then select Fully Nested from the analysis of variance section of the analysis menu. Enter the number of groups as four and then select the four sets of subgroups marked "P1L1" (i.e. Plant 1 Leaf 1), "P1L2" and so on. Each subgroup should be selected by a single selection action.

For this example:

Fully nested/hierarchical random analysis of variance

Variables: (P1L1, P1L2, P1L3) (P2L1, P2L2, P2L3) (P3L1, P3L2, P3L3) (P4L1, P4L2, P4L3)

| Source of Variation | Sum Squares | DF | Mean Square |
|---|---|---|---|
| Between Groups | 7.560346 | 3 | 2.520115 |
| Between Subgroups within Groups | 2.6302 | 8 | 0.328775 |
| Residual | 0.07985 | 12 | 0.006654 |
| Total | 10.270396 | 23 | |
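The sums of squares in this table can be checked independently of StatsDirect. The sketch below is a plain Python illustration, not StatsDirect's own code: the nested dictionary layout and the function name `nested_anova_table` are assumptions made for this example. It uses the standard computational formulas for a two-level nested design and allows subgroups of unequal size.

```python
# Calcium measurements keyed by plant (group), then leaf (subgroup).
data = {
    "P1": {"L1": [3.28, 3.09], "L2": [3.52, 3.48], "L3": [2.88, 2.80]},
    "P2": {"L1": [2.46, 2.44], "L2": [1.87, 1.92], "L3": [2.19, 2.19]},
    "P3": {"L1": [2.77, 2.66], "L2": [3.74, 3.44], "L3": [2.55, 2.55]},
    "P4": {"L1": [3.78, 3.87], "L2": [4.07, 4.12], "L3": [3.31, 3.31]},
}


def nested_anova_table(groups):
    """Return {source: (sum_squares, df, mean_square)} for a nested design."""
    all_obs = [x for sub in groups.values() for obs in sub.values() for x in obs]
    n_total = len(all_obs)
    n_groups = len(groups)
    n_subgroups = sum(len(sub) for sub in groups.values())

    # Correction term: (grand total)^2 / N.
    correction = sum(all_obs) ** 2 / n_total

    # Sum over groups of (group total)^2 / (group size).
    group_term = sum(
        sum(sum(obs) for obs in sub.values()) ** 2
        / sum(len(obs) for obs in sub.values())
        for sub in groups.values()
    )
    # Sum over subgroups of (subgroup total)^2 / (subgroup size).
    subgroup_term = sum(
        sum(obs) ** 2 / len(obs) for sub in groups.values() for obs in sub.values()
    )

    ss_total = sum(x * x for x in all_obs) - correction
    ss_groups = group_term - correction
    ss_subgroups = subgroup_term - group_term
    ss_residual = ss_total - ss_groups - ss_subgroups

    df_groups = n_groups - 1
    df_subgroups = n_subgroups - n_groups
    df_residual = n_total - n_subgroups
    return {
        "groups": (ss_groups, df_groups, ss_groups / df_groups),
        "subgroups": (ss_subgroups, df_subgroups, ss_subgroups / df_subgroups),
        "residual": (ss_residual, df_residual, ss_residual / df_residual),
        "total": (ss_total, n_total - 1, None),
    }


table = nested_anova_table(data)
for source, (ss, df, ms) in table.items():
    print(source, round(ss, 6), df, None if ms is None else round(ms, 6))
```

Run on the turnip greens data, the sums of squares, degrees of freedom and mean squares should agree with the table to the precision shown.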

F (VR between groups) = 378.727406 P < .0001

F (using group/subgroup msqr) = 7.665167 P = .0097

F (VR between subgroups within groups) = 49.408892 P < .0001

The "F (VR between groups)" statistic assumes a fixed effects model. For this example, which assumes random effects, use the "F (using group/subgroup msqr)" statistic; this treats the residual sum of squares as the samples sum of squares. For the null hypothesis of zero group variance, consider 2.5201/0.3288 (= 7.66 on an F(3,8) distribution) instead of 2.5201/0.0067 (= 379 on an F(3,12) distribution) because the point of randomization has been re-defined. The "F (VR between subgroups within groups)" statistic clearly rejects the null hypothesis of zero subgroup-in-group variance.
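The three variance ratios differ only in which mean square is used as the denominator. The short sketch below recomputes them from the mean squares reported in the ANOVA table; the variable names are illustrative, and P values are left out because they need an F distribution function that the Python standard library does not provide.

```python
# Mean squares and degrees of freedom as reported in the ANOVA table.
ms_groups, df_groups = 2.520115, 3
ms_subgroups, df_subgroups = 0.328775, 8
ms_residual, df_residual = 0.07985 / 12, 12  # unrounded residual mean square

# Fixed effects: groups tested against the residual, F(3, 12).
f_fixed = ms_groups / ms_residual

# Random effects: groups tested against subgroups within groups, F(3, 8).
f_random = ms_groups / ms_subgroups

# Subgroups within groups tested against the residual, F(8, 12).
f_subgroups = ms_subgroups / ms_residual

print(f"F (VR between groups)                   = {f_fixed:.3f}")
print(f"F (using group/subgroup msqr)           = {f_random:.3f}")
print(f"F (VR between subgroups within groups)  = {f_subgroups:.3f}")
```

Changing the denominator from the residual mean square to the subgroup mean square is what reduces the between-groups ratio from about 379 to about 7.67.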

The analysis shows that the plants contribute most to the overall variability and the leaves also have a statistically significant contribution. The samples from each leaf, as reflected by the residual sum of squares, contribute relatively little to the overall variability. Sampling further plants or leaves is, therefore, more important than taking multiple samples per leaf.
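One way to put numbers on this conclusion is to estimate variance components from the mean squares. The sketch below uses the standard method-of-moments estimators for a balanced nested random effects design (equal numbers of leaves per plant and samples per leaf); this calculation is an illustration added here and is not part of the StatsDirect output shown above.

```python
# Mean squares from the ANOVA table for the turnip greens example.
ms_plants = 2.520115
ms_leaves = 0.328775
ms_samples = 0.07985 / 12  # residual mean square, unrounded

leaves_per_plant = 3
samples_per_leaf = 2

# Method-of-moments estimates, valid for a balanced design:
#   E[MS samples] = var_samples
#   E[MS leaves]  = var_samples + n * var_leaves
#   E[MS plants]  = var_samples + n * var_leaves + n * s * var_plants
var_samples = ms_samples
var_leaves = (ms_leaves - ms_samples) / samples_per_leaf
var_plants = (ms_plants - ms_leaves) / (leaves_per_plant * samples_per_leaf)

components = {"plants": var_plants, "leaves": var_leaves, "samples": var_samples}
total = sum(components.values())
for name, value in components.items():
    print(f"{name}: variance {value:.4f} ({100 * value / total:.1f}% of total)")
```

On these estimates plants account for roughly 69% of the total variance, leaves within plants for about 30% and repeat samples for about 1%, which is consistent with the interpretation given above.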