Menu location: Analysis_Regression & Correlation_Conditional Logistic.
This function fits and analyses conditional logistic models for binary outcome/response data with one or more predictors, where observations are not independent but are matched or grouped in some way.
Binomial distributions are used for handling the errors associated with regression models for binary/dichotomous responses (i.e. yes/no, dead/alive) in the same way that the standard normal distribution is used in general linear regression. Other, less commonly used binomial models include normit/probit and complementary log-log. The logistic model is widely used and has many desirable properties (Hosmer and Lemeshow, 1989; Armitage and Berry, 1994; Altman, 1991; McCullagh and Nelder, 1989; Cox and Snell, 1989; Pregibon, 1981).
Odds = p/(1 - p)
[p = proportional response, i.e. r out of n responded so p = r/n]
Logit = log odds = log(p/(1 - p))
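The definitions above can be checked numerically. The following is a minimal Python sketch, not part of StatsDirect; the function names and the counts r = 30 and n = 40 are hypothetical and chosen only for illustration.

```python
import math

def odds(p):
    """Odds of a response that occurs with proportion p (0 <= p < 1)."""
    return p / (1.0 - p)

def logit(p):
    """Log odds of p (0 < p < 1)."""
    return math.log(odds(p))

def inverse_logit(x):
    """Map a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical example: r = 30 responders out of n = 40 subjects.
r, n = 30, 40
p = r / n
print("p =", p)
print("odds =", odds(p))
print("logit =", logit(p))
print("back-transformed p =", inverse_logit(logit(p)))
```

The inverse function is included because fitted models work on the logit scale and results are usually reported back on the proportion scale.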
When a logistic regression model has been fitted, estimates of p are marked with a hat symbol (p hat, sometimes written as a hat above the Greek letter pi) to denote that the proportion is estimated from the fitted regression model. Fitted proportional responses are often referred to as event probabilities (i.e. p hat multiplied by n events out of n trials).
The following information about the difference between two logits demonstrates one of the important uses of logistic regression models:
logit(p1) - logit(p2) = log(odds1) - log(odds2) = log(odds1/odds2) = log(odds ratio)
Logistic models provide important information about the relationship between response/outcome and exposure. It makes no difference to logistic models whether outcomes have been sampled prospectively or retrospectively; this is not the case with other binomial models.
The conditional logistic model can cope with 1:1 or 1:m case-control matching. In the simplest case, this is an extension of McNemar's test for matched studies.
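To illustrate the link with McNemar's test: for 1:1 matching with a single binary exposure, only the discordant pairs carry information, and the conditional maximum likelihood estimate of the odds ratio is the ratio of the two discordant pair counts. The sketch below is a hedged illustration in Python; the function name and the pair counts are hypothetical, and the chi-square shown is the uncorrected form of McNemar's statistic.

```python
import math

def mcnemar_from_pairs(pairs):
    """Summarise 1:1 matched pairs with one binary exposure.

    pairs: iterable of (case_exposed, control_exposed), each 0 or 1.
    Returns (f, g, chi_square, odds_ratio) where
      f = pairs with only the case exposed,
      g = pairs with only the control exposed.
    """
    pairs = list(pairs)
    f = sum(1 for case, ctrl in pairs if case == 1 and ctrl == 0)
    g = sum(1 for case, ctrl in pairs if case == 0 and ctrl == 1)
    if f + g == 0:
        raise ValueError("no discordant pairs: nothing can be estimated")
    chi_square = (f - g) ** 2 / (f + g)      # uncorrected McNemar, 1 df
    odds_ratio = f / g if g else math.inf    # conditional estimate f/g
    return f, g, chi_square, odds_ratio

# Hypothetical data: 12 + 4 discordant pairs and 10 concordant pairs.
example_pairs = [(1, 0)] * 12 + [(0, 1)] * 4 + [(1, 1)] * 6 + [(0, 0)] * 4
f, g, chi_square, odds_ratio = mcnemar_from_pairs(example_pairs)
print(f, g, chi_square, odds_ratio)
```

Concordant pairs are counted in neither f nor g, which mirrors the way they drop out of the conditional likelihood.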
Data preparation
You must prepare your data case by case, i.e. ungrouped, with one subject/observation per row. This is unlike the unconditional logistic function, which accepts grouped or ungrouped data.
The binary outcome variable must contain only 0 (control) or 1 (case).
There must be a stratum indicator variable to denote the strata. In case-control studies with 1:1 matching this would mean a code for each pair (i.e. two rows marked stratum x, one with a case + covariates and the other with a control + covariates). For 1:m matched studies there will be 1+m rows of data for each stratum/matching group.
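The layout rules above can be checked before analysis. The following Python sketch is an illustration only and is not how StatsDirect validates its input; the function name, the tuple layout and the example rows are hypothetical. It assumes each stratum should hold exactly one case and at least one control.

```python
from collections import defaultdict

def check_matched_layout(rows):
    """Check ungrouped matched data, one subject per row.

    rows: iterable of (stratum, outcome, covariates) tuples, where
    outcome is 1 for a case and 0 for a control.
    Returns {stratum: (n_cases, n_controls)}; raises ValueError if an
    outcome is not 0/1 or a stratum is not 1 case : m controls (m >= 1).
    """
    counts = defaultdict(lambda: [0, 0])
    for stratum, outcome, _covariates in rows:
        if outcome not in (0, 1):
            raise ValueError(f"outcome must be 0 or 1, got {outcome!r}")
        counts[stratum][0 if outcome == 1 else 1] += 1
    bad = sorted(str(s) for s, (cases, controls) in counts.items()
                 if cases != 1 or controls < 1)
    if bad:
        raise ValueError("strata without a 1:m layout: " + ", ".join(bad))
    return {s: tuple(c) for s, c in counts.items()}

# Hypothetical rows: stratum 1 is a 1:1 pair, stratum 2 is a 1:2 set.
rows = [
    (1, 1, {"SMOKE": 1}),
    (1, 0, {"SMOKE": 0}),
    (2, 1, {"SMOKE": 0}),
    (2, 0, {"SMOKE": 1}),
    (2, 0, {"SMOKE": 0}),
]
layout = check_matched_layout(rows)
print(layout)
```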
Technical validation
The regression is fitted by maximisation of the natural logarithm of the conditional likelihood function using Newton-Raphson iteration as described by Krailo et al. (1984), Smith et al. (1981) and Howard (1972).
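As a rough illustration of the idea, and not the StatsDirect implementation or the algorithms in the cited papers, the Python sketch below maximises the conditional log likelihood for a single covariate with 1:m matching by Newton-Raphson. Each stratum contributes log(exp(b*x_case) / sum over the stratum of exp(b*x)). The function names, the data structure and the example data are all hypothetical.

```python
import math

def _score_info_loglik(beta, strata):
    """Score, information and log likelihood at beta for one covariate."""
    score = info = loglik = 0.0
    for x_case, x_controls in strata:
        xs = [x_case] + list(x_controls)
        shift = max(beta * x for x in xs)          # guards against overflow
        weights = [math.exp(beta * x - shift) for x in xs]
        total = sum(weights)
        mean = sum(w * x for w, x in zip(weights, xs)) / total
        mean_sq = sum(w * x * x for w, x in zip(weights, xs)) / total
        score += x_case - mean                     # first derivative
        info += mean_sq - mean * mean              # minus second derivative
        loglik += beta * x_case - shift - math.log(total)
    return score, info, loglik

def fit_conditional_logit(strata, tol=1e-10, max_iter=50):
    """strata: list of (x_case, [x_control, ...]) for a single covariate.

    Returns (beta, standard_error, log_likelihood).
    """
    beta = 0.0
    for _ in range(max_iter):
        score, info, _ = _score_info_loglik(beta, strata)
        if info <= 0.0:
            raise ValueError("no within-stratum variation in the covariate")
        step = score / info                        # Newton-Raphson update
        beta += step
        if abs(step) < tol:
            break
    else:
        raise RuntimeError("Newton-Raphson did not converge")
    _, info, loglik = _score_info_loglik(beta, strata)
    return beta, 1.0 / math.sqrt(info), loglik

# Hypothetical 1:1 data with a binary exposure: 12 pairs with only the
# case exposed, 4 with only the control exposed, 10 concordant pairs.
strata = ([(1, [0])] * 12 + [(0, [1])] * 4
          + [(1, [1])] * 6 + [(0, [0])] * 4)
beta, se, loglik = fit_conditional_logit(strata)
print(beta, math.exp(beta), se, -2.0 * loglik)
```

For these pairs the estimate should agree with the log of the discordant pair ratio, log(12/4), which is a convenient check on the iteration. A model with several predictors needs the same steps with a score vector and an information matrix in place of the two scalars.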
Example
From Hosmer and Lemeshow (1989).
Test workbook (Regression worksheet: PAIRID, LBWT, RACE (b), SMOKE, HT, UI, PTD, LWT).
These are artificially matched data from a study of the risk factors associated with low birth weight in Massachusetts in 1986. The predictors studied here are black race (RACE (b)), smoking status (SMOKE), hypertension (HT), uterine irritability (UI), previous preterm delivery (PTD) and weight of the mother at her last menstrual period (LWT).
To analyse these data using StatsDirect you must first open the test workbook using the file open function of the file menu. Then select Conditional Logistic from the Regression and Correlation section of the analysis menu. Select the column marked "PAIRID" when asked for the stratum (match group) indicator. Then select "LBWT" when asked for the case-control indicator. Then select "RACE (b)", "SMOKE", "HT", "UI", "PTD", and "LWT" in one action when you are asked for predictors.
For this example:
Conditional logistic regression
Deviance (-2 log likelihood) = 51.589852
Deviance (likelihood ratio) chi-square = 26.042632 P = 0.0002
Pseudo (McFadden) R-square = 0.33546
Label      Parameter estimate   Standard error
RACE (b)   0.582272             0.620708         z = 0.938078    P = 0.3482
SMOKE      1.410799             0.562177         z = 2.509528    P = 0.0121
HT         2.351335             1.05135          z = 2.236492    P = 0.0253
UI         1.399261             0.692244         z = 2.021341    P = 0.0432
PTD        1.807481             0.788952         z = 2.290989    P = 0.022
LWT        -0.018222            0.00913          z = -1.995807   P = 0.046
Label      Odds ratio   95% confidence interval
RACE (b)   1.790102     0.53031 to 6.042622
SMOKE      4.099229     1.361997 to 12.337527
HT         10.499579    1.3374 to 82.429442
UI         4.052205     1.043404 to 15.737307
PTD        6.095073     1.298439 to 28.611218
LWT        0.981943     0.964529 to 0.999673
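The figures in this output are linked by simple identities, which the Python sketch below reproduces for the SMOKE row and for the summary statistics. It is a check on the arithmetic, not a description of StatsDirect internals. It assumes Wald-type z statistics and confidence intervals (which agree with the printed values), a null deviance equal to the deviance plus the likelihood ratio chi-square, and 6 degrees of freedom for the chi-square (one per predictor). The function names are hypothetical.

```python
import math

def wald_summary(estimate, std_error, z_crit=1.959964):
    """z, two-sided P, odds ratio and 95% limits from b and its SE."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail area
    odds_ratio = math.exp(estimate)
    lower = math.exp(estimate - z_crit * std_error)
    upper = math.exp(estimate + z_crit * std_error)
    return z, p, odds_ratio, lower, upper

def chi_square_upper_tail_even_df(x, df):
    """Upper tail of chi-square for even df (closed-form series)."""
    if df < 2 or df % 2:
        raise ValueError("this shortcut only handles even df >= 2")
    half = x / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# SMOKE row, values copied from the output above.
z, p, or_, lo, hi = wald_summary(1.410799, 0.562177)
print(z, p, or_, lo, hi)

# Summary statistics, values copied from the output above.
deviance = 51.589852
lr_chi_square = 26.042632
null_deviance = deviance + lr_chi_square      # assumed relationship
mcfadden = 1.0 - deviance / null_deviance
p_lr = chi_square_upper_tail_even_df(lr_chi_square, 6)  # df = 6 assumed
print(null_deviance, mcfadden, p_lr)
```

Because the odds ratio for LWT is per unit of weight, a value close to 1 can still correspond to a sizeable effect across the observed range of weights.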
You may infer from the results above that hypertension, smoking status and previous preterm delivery are convincing predictors of low birth weight in the population studied.
Note that the selection of predictors for regression models such as this can be complex and is best done with the help of a statistician. Hosmer and Lemeshow (1989) give a good discussion of the example above, but with non-standard dummy variables (StatsDirect uses a standard dummy/design variable coding scheme adopted by most other statistical software). The optimal selection of predictors depends not only upon their numerical performance in the model, with or without appropriate transformations or study of interactions, but also upon their biophysical importance in the study.