Menu location: Analysis_ChiSquare_r by c.
The r by c chi-square test in StatsDirect uses a number of methods to investigate two-way contingency tables that consist of any number of independent categories forming r rows and c columns.
Tests of independence of the categories in a table are the chi-square test, the G-square (likelihood-ratio chi-square) test and the generalised Fisher exact (Fisher-Freeman-Halton) test. All three tests indicate the degree of independence between the variables that make up the table.
The generalised Fisher exact test is difficult to compute (Mehta and Patel, 1983, 1986a); it may take a long time, and it may not be computable for the table that you enter. If the Fisher exact method cannot be computed practically then a hybrid method based upon Cochran rules is used (Mehta and Patel, 1986b); this may also fail with large tables and/or large numbers. The Fisher-Freeman-Halton result is quoted with just one P value because it is implicitly two-sided.
Relating the Fisher-Freeman-Halton statistic to the Pearson chi-square statistic:
The null hypothesis is independence between row and column categories.
Let t denote a table from the set of all tables with the same row and column margins.
Let D(t) be the measure of discrepancy.
The exact two-sided P value = P[D(t) >= D(t observed)] = the sum of the hypergeometric probabilities of those tables for which D(t) is greater than or equal to its value for the observed table.
In large samples the distribution of D(t), conditional on fixed row and column margins, converges to the chi-square distribution with (r-1)(c-1) degrees of freedom.
The G-square statistic is less reliable than the chi-square statistic when you have small numbers. In general, you should use the chi-square statistic if the Fisher exact test is not computable. If you consult a statistician then it would be useful to provide the G-square statistic also.
These tests of independence are suitable for nominal data. If your data are ordinal then you should use the more powerful tests for trend (Armitage and Berry, 1994; Agresti, 2002, 1996).
Assumptions of the tests of independence:
the sample is random
each observation may be classified into one cell (in the table) only
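Chi-square and G-square statistics:
$$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c}\frac{(O_{ij}-E_{ij})^2}{E_{ij}}$$
$$G^2 = 2\sum_{i=1}^{r}\sum_{j=1}^{c} O_{ij}\ln\frac{O_{ij}}{E_{ij}}$$
where, for r rows and c columns of n observations, O is an observed frequency and E is an estimated expected frequency. The expected frequency for any cell is estimated as the row total times the column total, divided by the grand total (n).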
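As an illustration of the chi-square and G-square statistics described above, here is a minimal Python sketch. It is not StatsDirect's implementation; the function names are hypothetical, and it assumes the standard Pearson and likelihood-ratio forms with expected frequencies estimated as row total times column total divided by n. The table is the grief and support example used later on this page.

```python
from math import log


def expected_frequencies(table):
    """Estimate expected counts as row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square_and_g_square(table):
    """Return (chi-square, G-square, degrees of freedom) for an r by c table."""
    expected = expected_frequencies(table)
    chi2 = 0.0
    g2 = 0.0
    for obs_row, exp_row in zip(table, expected):
        for o, e in zip(obs_row, exp_row):
            chi2 += (o - e) ** 2 / e
            if o > 0:  # a zero cell contributes nothing to G-square
                g2 += 2.0 * o * log(o / e)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, g2, df


def count_small_expectations(table):
    """Count cells whose expected frequency lies in the range 1 <= E < 5."""
    return sum(1 for row in expected_frequencies(table) for e in row if 1 <= e < 5)


# Rows are grief states I to IV; columns are good, adequate and poor support.
grief = [[17, 9, 8], [6, 5, 1], [3, 5, 4], [1, 2, 5]]
chi2, g2, df = chi_square_and_g_square(grief)
print(round(chi2, 4), round(g2, 4), df, count_small_expectations(grief))
```

The printed values can be compared with the chi-square, G-square, DF and warning lines in the example output on this page.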
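Generalised Fisher exact (Fisher-Freeman-Halton) probability:
$$P = \sum_{t:\,D(t)\ge D(t_{\mathrm{observed}})} P_f(t), \qquad P_f = \frac{\prod_{i=1}^{r} f_{i.}!\;\prod_{j=1}^{c} f_{.j}!}{f_{..}!\;\prod_{i=1}^{r}\prod_{j=1}^{c} f_{ij}!}$$
where P is the two-sided Fisher probability, Pf is the conditional probability for the observed table given fixed row and column totals (fi. and f.j respectively), fij is the count in row i and column j, f.. is the total count and ! represents factorial.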
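The exact calculation can be sketched by brute force for small tables. The Python below is an illustrative sketch only: it enumerates every table with the observed margins, which is far slower than the network algorithm of Mehta and Patel, and it assumes the Freeman-Halton ordering in which a table counts as at least as extreme as the observed one when its conditional probability is no larger. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def compositions(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables_with_margins(row_totals, col_totals):
    """Yield every non-negative integer table with the given margins."""
    if len(row_totals) == 1:
        # The last row is forced by whatever is left in each column.
        yield (tuple(col_totals),)
        return
    for row in compositions(row_totals[0], col_totals):
        remaining = [c - x for c, x in zip(col_totals, row)]
        for rest in tables_with_margins(row_totals[1:], remaining):
            yield (row,) + rest


def table_probability(table):
    """Conditional (hypergeometric) probability of a table given its margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    numerator = prod(factorial(x) for x in row_totals + col_totals)
    denominator = factorial(n) * prod(factorial(x) for row in table for x in row)
    return Fraction(numerator, denominator)


def freeman_halton_p(observed):
    """Sum the probabilities of all tables no more probable than the observed one."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    p_observed = table_probability(observed)
    total = Fraction(0)
    for table in tables_with_margins(row_totals, col_totals):
        p = table_probability(table)
        if p <= p_observed:
            total += p
    return total


# A small 2 by 3 table keeps the enumeration short.
print(float(freeman_halton_p([[3, 1, 0], [1, 2, 3]])))
```

Exact rational arithmetic is used so that ties in probability are compared without rounding error. The number of tables grows very quickly with the margins, which is why practical implementations avoid full enumeration.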
Analysis of trend in r by c tables indicates how much of the general independence between scores is accounted for by linear trend. StatsDirect uses equally spaced scores for this purpose unless you specify otherwise. If you wish to experiment with other scoring systems then expert statistical guidance is advisable. Armitage and Berry (1994) quote an example where extent of grief of mothers suffering a perinatal death, graded I to IV, is compared with the degree of support received by these women. In this example the overall statistic is nonsignificant but a significant trend is demonstrated.
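Sample correlation and test for linear trend:
$$r = \frac{\sum_{i}\sum_{j} u_i v_j O_{ij} - \left(\sum_i u_i O_{i+}\right)\left(\sum_j v_j O_{+j}\right)/n}{\sqrt{\left[\sum_i u_i^2 O_{i+} - \left(\sum_i u_i O_{i+}\right)^2/n\right]\left[\sum_j v_j^2 O_{+j} - \left(\sum_j v_j O_{+j}\right)^2/n\right]}}$$
$$M^2 = (n-1)\,r^2$$
where, for r rows and c columns of n observations, O is an observed frequency and E is an estimated expected frequency. The expected frequency for any cell is estimated as the row total times the column total, divided by the grand total (n). Row scores are u, column scores are v, row totals are Oi+ and column totals are O+j.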
The sample correlation coefficient r reflects the direction and closeness of linear trend in your table. r may vary between -1 and +1, just like Pearson's product-moment correlation coefficient. Total independence of the categories in your table would mean that r = 0. The test for linear trend is related to r by M² = (n-1)r², and this is numerically identical to Armitage's chi-square for linear trend (Armitage and Berry, 1994; Agresti, 1996). If you interchange the rows and columns in your table then the value of M² will be the same.
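The relationship between r and M² can be sketched in Python as follows. This is an illustrative sketch rather than StatsDirect's code: the function name is hypothetical, equally spaced scores 1, 2, 3, ... are assumed when none are given, and the P value is taken from the chi-square distribution with 1 degree of freedom using the complementary error function.

```python
from math import erfc, sqrt


def linear_trend(table, row_scores=None, col_scores=None):
    """Return (r, M-squared, P) for linear trend in an r by c table."""
    n_rows, n_cols = len(table), len(table[0])
    u = row_scores or list(range(1, n_rows + 1))  # assumed equally spaced
    v = col_scores or list(range(1, n_cols + 1))  # assumed equally spaced
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    sum_u = sum(ui * t for ui, t in zip(u, row_totals))
    sum_v = sum(vj * t for vj, t in zip(v, col_totals))
    sum_uv = sum(u[i] * v[j] * table[i][j]
                 for i in range(n_rows) for j in range(n_cols))
    ss_u = sum(ui ** 2 * t for ui, t in zip(u, row_totals)) - sum_u ** 2 / n
    ss_v = sum(vj ** 2 * t for vj, t in zip(v, col_totals)) - sum_v ** 2 / n

    r = (sum_uv - sum_u * sum_v / n) / sqrt(ss_u * ss_v)
    m2 = (n - 1) * r ** 2
    p = erfc(sqrt(m2 / 2))  # upper tail of chi-square with 1 DF
    return r, m2, p


grief = [[17, 9, 8], [6, 5, 1], [3, 5, 4], [1, 2, 5]]
r, m2, p = linear_trend(grief)
print(round(r, 6), round(m2, 4), round(p, 4))
```

Passing the transposed table gives the same M², which matches the statement that interchanging rows and columns leaves the trend statistic unchanged.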
The ANOVA output applies techniques similar to analysis of variance to an r by c table. Here the equality of mean column and row scores is tested. StatsDirect uses equally spaced scores for this purpose unless you specify otherwise. See Armitage for more information (Armitage and Berry, 1994).
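Pearson's and Cramér's (V) coefficients of contingency and the phi (φ, correlation) coefficient reflect the strength of the association in a contingency table (Agresti, 1996; Fleiss, 1981; Stuart and Ord, 1994):
$$\varphi = \sqrt{\frac{\chi^2}{n}}$$
$$\text{Pearson's coefficient of contingency} = \sqrt{\frac{\chi^2}{\chi^2 + n}}$$
$$V = \sqrt{\frac{\chi^2}{n\,\min(r-1,\;c-1)}}$$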
For 2 by 2 tables, Cramér's V is calculated alternatively as a signed value:
$$V = \frac{O_{11}O_{22} - O_{12}O_{21}}{\sqrt{O_{1+}\,O_{2+}\,O_{+1}\,O_{+2}}}$$
Observed values, expected values and totals are given for the table when c ≤ 8 and r ≤ 10.
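The nominal association coefficients can be sketched in Python from the Pearson chi-square statistic. This is a hedged illustration, not StatsDirect's code: the function names are hypothetical, and the formulas assumed are the usual ones, phi = sqrt(chi-square / n), Pearson's contingency coefficient = sqrt(chi-square / (chi-square + n)) and Cramér's V = sqrt(chi-square / (n × min(r-1, c-1))).

```python
from math import sqrt


def pearson_chi_square(table):
    """Pearson chi-square with expected counts from the table margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def association_coefficients(table):
    """Return (phi, Pearson's contingency coefficient, Cramer's V)."""
    chi2, n = pearson_chi_square(table)
    smaller_dimension = min(len(table) - 1, len(table[0]) - 1)
    phi = sqrt(chi2 / n)
    contingency = sqrt(chi2 / (chi2 + n))
    cramer_v = sqrt(chi2 / (n * smaller_dimension))
    return phi, contingency, cramer_v


grief = [[17, 9, 8], [6, 5, 1], [3, 5, 4], [1, 2, 5]]
phi, contingency, cramer_v = association_coefficients(grief)
print(round(phi, 6), round(contingency, 6), round(cramer_v, 6))
```

The unsigned form of V is used here; the signed 2 by 2 variant is not covered by this sketch.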
If your data categories are both ordered then you will gain more power in tests of independence by using the ordinal methods due to Goodman and Kruskal (gamma) and Kendall (tau-b). Large-sample, asymptotically normal variance estimates are used; the simple form is used for independence testing (Agresti, 1984; Conover, 1999; Goodman and Kruskal, 1963, 1972). Tau-b tends to be less sensitive than gamma to the choice of response categories.
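The point estimates of gamma and tau-b can be sketched in Python by counting concordant and discordant pairs. This sketch covers only the estimates, not the standard errors or confidence intervals, and the function names are hypothetical. It assumes the usual definitions: gamma = (C - D) / (C + D), and tau-b divides C - D by the geometric mean of the numbers of pairs not tied on rows and not tied on columns.

```python
from math import sqrt


def concordant_discordant(table):
    """Count concordant and discordant pairs of observations in an ordered table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):  # only cells in later rows
                for m in range(n_cols):
                    if m > j:  # later row and later column
                        concordant += table[i][j] * table[k][m]
                    elif m < j:  # later row but earlier column
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant


def gamma_and_tau_b(table):
    """Return (Goodman-Kruskal gamma, Kendall tau-b) point estimates."""
    concordant, discordant = concordant_discordant(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    all_pairs = n * (n - 1) / 2
    tied_on_rows = sum(t * (t - 1) / 2 for t in row_totals)
    tied_on_cols = sum(t * (t - 1) / 2 for t in col_totals)
    gamma = (concordant - discordant) / (concordant + discordant)
    tau_b = (concordant - discordant) / sqrt(
        (all_pairs - tied_on_rows) * (all_pairs - tied_on_cols))
    return gamma, tau_b


grief = [[17, 9, 8], [6, 5, 1], [3, 5, 4], [1, 2, 5]]
gamma, tau_b = gamma_and_tau_b(grief)
print(round(gamma, 6), round(tau_b, 6))
```

Because tau-b accounts for ties in its denominator while gamma ignores them, tau-b is always the smaller of the two in absolute value for the same table.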
Example
From Armitage and Berry (1994, p. 408).
The following data (as above) describe the state of grief of 66 mothers who had suffered a neonatal death. The table relates this to the amount of support given to these women:

                  Support:
Grief State:      Good    Adequate    Poor
I                   17           9       8
II                   6           5       1
III                  3           5       4
IV                   1           2       5

To analyse these data in StatsDirect you must select r by c from the chi-square section of the analysis menu. Choose the default 95% confidence interval. Then enter the above data as directed by the screen.
For this example:
Observed       17        9        8       34
Expected    13.91    10.82     9.27
DChi²        0.69     0.31     0.17

Observed        6        5        1       12
Expected     4.91     3.82     3.27
DChi²        0.24     0.37     1.58

Observed        3        5        4       12
Expected     4.91     3.82     3.27
DChi²        0.74     0.37     0.16

Observed        1        2        5        8
Expected     3.27     2.55     2.18
DChi²        1.58     0.12     3.64

Totals:        27       21       18       66

TOTAL number of cells = 12
WARNING: 9 out of 12 cells have 1 ≤ EXPECTATION < 5
NOMINAL INDEPENDENCE
Chi-square = 9.9588, DF = 6, P = 0.1264
G-square = 10.186039, DF = 6, P = 0.117
Fisher-Freeman-Halton exact P = 0.1426
ANOVA
Chisquare for equality of mean column scores = 5.696401
DF = 2, P = 0.0579
LINEAR TREND
Sample correlation (r) = 0.295083
Chisquare for linear trend (M²) = 5.6598
DF = 1, P = 0.0174
NOMINAL ASSOCIATION
Phi = 0.388447
Pearson's contingency = 0.362088
Cramér's V = 0.274673
ORDINAL
Goodman-Kruskal gamma = 0.349223
Approximate test of gamma = 0: SE = 0.15333, P = 0.0228, 95% CI = 0.048701 to 0.649744
Approximate test of independence: SE = 0.163609, P = 0.0328, 95% CI = 0.028554 to 0.669891
Kendall tau-b = 0.236078
Approximate test of tau-b = 0: SE = 0.108929, P = 0.0302, 95% CI = 0.02258 to 0.449575
Approximate test of independence: SE = 0.110601, P = 0.0328, 95% CI = 0.019303 to 0.452852
Here we see that although the overall test was not significant we did show a statistically significant trend in mean scores. This suggests that supporting these mothers did help lessen their burden of grief.