Copyright © 1990-2008 StatsDirect Limited, all rights reserved

2 by k chi-square test

 

Menu location: Analysis_Chi-Square_2 by k.

 

Several proportions can be compared using a 2 by k chi-square test. For example, a random sample of people can be subdivided into k age groups and counts made of those individuals with and those without a particular attribute. For this sample, a 2 by k chi-square test could be used to test whether or not age has a statistically significant effect on the attribute studied. This is a test of the independence of the row and column variables; it is equivalent to the chi-square independence tests for 2 by 2 and r by c tables.

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

- where, for r rows and c columns of n observations, O is an observed frequency and E is an estimated expected frequency. The expected frequency for any cell is estimated as the row total times the column total then divided by the grand total (n).
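The calculation described above can be sketched in a few lines of Python. This is only an illustration of the Pearson chi-square statistic for a 2 by k table, not StatsDirect's own code; the function name `chi_square_2_by_k` and its arguments are hypothetical. The counts used are those of the tonsil example later on this page.

```python
def chi_square_2_by_k(successes, failures):
    """Pearson chi-square for a 2 by k table given as two lists of k counts.

    Hypothetical helper for illustration; not part of StatsDirect.
    """
    if len(successes) != len(failures):
        raise ValueError("both rows must contain k counts")
    row_totals = [sum(successes), sum(failures)]
    col_totals = [s + f for s, f in zip(successes, failures)]
    n = sum(row_totals)  # grand total

    chi2 = 0.0
    expected = []
    for row, row_total in zip((successes, failures), row_totals):
        expected_row = []
        for observed, col_total in zip(row, col_totals):
            # expected frequency = row total x column total / grand total
            e = row_total * col_total / n
            expected_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(expected_row)

    df = len(successes) - 1  # (2 - 1) x (k - 1) degrees of freedom
    return chi2, df, expected


# Carriers (successes) and non-carriers (failures) from the tonsil example
chi2, df, expected = chi_square_2_by_k([19, 29, 24], [497, 560, 269])
print(f"Total chi-square = {chi2:.6f} on {df} DF")
```

The expected frequencies returned in `expected` can be checked against the "Expected" rows of the example output.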

 

Assumptions of the tests of independence:

 

Note that an exact test of independence is provided in the r by c table analysis function; you should use this instead of the chi-square statistic when you have small numbers (say expected frequency less than 5) in any of the table cells.

 

If there is a meaningful order to your k groups (e.g. sequential age bands) then the chi-square test for trend provides a more powerful test than the unordered independence test above. StatsDirect automatically performs a test for linear trend across the k groups. You can enter your own scores for the trend test. For example, if a variable was categorised as mild, moderate or severe pain then scores 1, 2 and 3 are likely to be reasonable, so leave StatsDirect to assign scores. If, instead, the categories were mild, moderate and worst ever pain then you might enter scores such as 1, 2 and 5 respectively (Armitage and Berry, 1994; Altman, 1991).

 

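$$\chi^2_{\text{trend}} = \frac{\left[\sum_{i=1}^{k} r_i v_i - R\mu\right]^2}{p(1-p)\left[\sum_{i=1}^{k} n_i v_i^2 - N\mu^2\right]}$$

$$\mu = \frac{\sum_{i=1}^{k} n_i v_i}{N}$$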

- where each of the k groups of observations is denoted as r_i successes out of n_i total with score v_i assigned. R is the sum of all r_i, N is the sum of all n_i and p = R/N.
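As a rough sketch, and assuming the trend statistic takes the standard Cochran-Armitage form with the notation just defined, the calculation might look like this in Python. The function name `chi_square_trend` and the variable `mean_score` (the n-weighted mean of the scores) are hypothetical, introduced only for this illustration; default scores of 1 to k are assumed when none are supplied.

```python
def chi_square_trend(successes, totals, scores=None):
    """Chi-square for linear trend in proportions (1 DF).

    Hypothetical helper for illustration; not part of StatsDirect.
    successes: r_i for each group; totals: n_i; scores: v_i.
    """
    k = len(successes)
    if len(totals) != k:
        raise ValueError("successes and totals must have the same length")
    if scores is None:
        scores = list(range(1, k + 1))  # assumed default: scores 1 to k
    if len(scores) != k:
        raise ValueError("one score is needed per group")

    R = sum(successes)
    N = sum(totals)
    p = R / N
    # n-weighted mean of the scores
    mean_score = sum(n * v for n, v in zip(totals, scores)) / N

    numerator = (sum(r * v for r, v in zip(successes, scores)) - R * mean_score) ** 2
    denominator = p * (1 - p) * (
        sum(n * v * v for n, v in zip(totals, scores)) - N * mean_score ** 2
    )
    return numerator / denominator


# Tonsil example from this page: carriers out of group totals, scores 1, 2, 3
trend = chi_square_trend([19, 29, 24], [516, 589, 293])
print(f"Chi-square for linear trend = {trend:.5f} on 1 DF")
```

Passing a third argument such as `[1, 2, 5]` would apply user-defined scores in place of the default 1 to k.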

 

Should you wish to investigate your 2 by k table further then the r by c chi-square test provides a more detailed analysis. Please note that the linear trend analysis may differ slightly between the 2 by k and r by c chi-square tests. This is because the r by c linear trend analysis is not calculated as above but instead considers trend in both dimensions of the table (closely related to Pearson's correlation).

 

Example

From Armitage and Berry (1994).

 

The following data describe numbers of children with different sized palatine tonsils and their carrier status for Strep. pyogenes.

 

 

                Tonsils
                Not enlarged    Enlarged    Enlarged greatly
Carriers        19              29          24
Non-carriers    497             560         269

 

To analyse these data in StatsDirect you must select 2 by k (scores 1 to k) from the chi-square section of the analysis menu. Then select the middle option from the 2 by k chi-square test menu. Choose the default 95% confidence interval. Then select the number of rows as 3. You then enter the above data as directed by the screen. Use carriers as successes and non-carriers as failures.

 

For this example:

 

 

            Successes    Failures    Total    Per cent
Observed    19           497         516      3.682171
Expected    26.57511     489.4249
Observed    29           560         589      4.923599
Expected    30.33476     558.6652
Observed    24           269         293      8.191126
Expected    15.09013     277.9099
Total       72           1326        1398     5.150215

 

Total Chi² = 7.884843 |Chi| = 2.807996, (2 DF), P = .0194

 

Chi² for linear trend = 7.19275 |Chi| = 2.68193, (1 DF), P = .0073

 

Remaining Chi² (non-linearity) = 0.692093, (1 DF), P = .4055
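The three lines above partition the total chi-square into a linear trend component and a remainder. A minimal sketch of that arithmetic, and of the corresponding P values, is given below. It assumes the remainder is simply the total minus the trend statistic, and it uses closed-form upper-tail probabilities that hold only for 1 or 2 degrees of freedom; the helper name `p_value_chi2` is hypothetical.

```python
import math


def p_value_chi2(x, df):
    """Upper-tail P value for a chi-square statistic with 1 or 2 DF.

    Hypothetical helper for illustration; closed forms only, no general DF.
    """
    if df == 1:
        # chi-square(1) is the square of a standard normal deviate
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        # chi-square(2) is an exponential distribution with mean 2
        return math.exp(-x / 2)
    raise ValueError("this sketch only covers 1 or 2 degrees of freedom")


total_chi2 = 7.884843   # total chi-square, 2 DF (from the output above)
trend_chi2 = 7.19275    # chi-square for linear trend, 1 DF
remaining_chi2 = total_chi2 - trend_chi2  # departure from linearity, 1 DF

results = {
    "Total": (total_chi2, 2),
    "Linear trend": (trend_chi2, 1),
    "Remaining (non-linearity)": (remaining_chi2, 1),
}
for label, (stat, df) in results.items():
    print(f"{label}: Chi-square = {stat:.6f}, ({df} DF), P = {p_value_chi2(stat, df):.4f}")
```

The printed values can be compared with the P values listed in the output above.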

 

Here the total chi-square test shows a statistically significant association between the classifications, i.e. between tonsil size and Strep. pyogenes carrier status. We have also shown a significant linear trend which enables us to refine our conclusions to a suggestion that the proportion of Strep. pyogenes carriers increases with tonsil size.

 
