Menu location: Analysis_Survival_Cox Regression.
This function fits Cox's proportional hazards model for survival-time (time-to-event) outcomes on one or more predictors.
Cox regression (or proportional hazards regression) is a method for investigating the effect of several variables upon the time a specified event takes to happen. In the context of an outcome such as death this is known as Cox regression for survival analysis. The method does not assume any particular "survival model", but it is not truly nonparametric because it does assume that the effects of the predictor variables upon survival are constant over time and are additive in one scale. You should not use Cox regression without the guidance of a Statistician.
Provided that the assumptions of Cox regression are met, this function will provide better estimates of survival probabilities and cumulative hazard than those provided by the Kaplan-Meier function.
Hazard and hazard ratios
Cumulative hazard at a time t is the risk of dying between time 0 and time t, and the survivor function at time t is the probability of surviving to time t (see also Kaplan-Meier estimates).
The coefficients in a Cox regression relate to hazard; a positive coefficient indicates a worse prognosis and a negative coefficient indicates a protective effect of the variable with which it is associated.
The hazard ratio associated with a predictor variable is given by the exponential of its coefficient; this is given with a confidence interval under the "coefficient details" option in StatsDirect. The hazard ratio may also be thought of as the relative death rate, see Armitage and Berry (1994). The interpretation of the hazard ratio depends upon the measurement scale of the predictor variable in question, see Sahai and Khurshid (1996) for further information on relative risk of hazards.
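As a minimal sketch of the relationship just described, the following Python snippet converts a coefficient and its standard error into a hazard ratio with a 95% confidence interval. The function name hazard_ratio_ci is hypothetical (it is not part of StatsDirect), and the interval is assumed to be a Wald-type interval built on the log scale with a normal approximation. The input values are the coefficient and standard error from the worked example later on this page.

```python
import math

def hazard_ratio_ci(coefficient, standard_error, z=1.959964):
    """Return (hazard ratio, lower, upper) from a Cox coefficient.

    Assumption: a Wald interval on the log-hazard scale; z is the
    standard normal quantile for a two-sided 95% interval.
    """
    hazard_ratio = math.exp(coefficient)
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return hazard_ratio, lower, upper

# Coefficient and standard error for "Stage group" in the worked example.
hr, lower, upper = hazard_ratio_ci(0.96102, 0.385636)
print(f"hazard ratio = {hr:.4f}, 95% CI = {lower:.4f} to {upper:.4f}")
```

Because the interval is symmetric on the log scale, it is asymmetric around the hazard ratio itself; this is expected and not a calculation error.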
Time-dependent and fixed covariates
In prospective studies, when individuals are followed over time, the values of covariates may change with time. Covariates can thus be divided into fixed and time-dependent. A covariate is time-dependent if the difference between its values for two different subjects changes with time, e.g. serum cholesterol. A covariate is fixed if its values cannot change with time, e.g. sex or race. Lifestyle factors and physiological measurements such as blood pressure are usually time-dependent. Cumulative exposures such as smoking are also time-dependent but are often forced into an imprecise dichotomy, i.e. "exposed" vs. "not-exposed" instead of the more meaningful "time of exposure". There are no hard and fast rules about the handling of time-dependent covariates. If you are considering using Cox regression you should seek the help of a Statistician, preferably at the design stage of the investigation.
Model analysis and deviance
A test of the overall statistical significance of the model is given under the "model analysis" option. Here the likelihood chi-square statistic is calculated by comparing the deviance (-2 * log likelihood) of your model, with all of the covariates you have specified, against the model with all covariates dropped. The individual contribution of covariates to the model can be assessed from the significance test given with each coefficient in the main output; this assumes a reasonably large sample size.
Deviance is minus twice the log of the likelihood ratio for models fitted by maximum likelihood (Hosmer and Lemeshow, 1989 and 1999; Cox and Snell, 1989; Pregibon, 1981). The value of adding a parameter to a Cox model is tested by subtracting the deviance of the model with the new parameter from the deviance of the model without the new parameter; the difference is then tested against a chi-square distribution with degrees of freedom equal to the difference between the degrees of freedom of the old and new models. The model analysis option tests the model you specify against a model with only one parameter, the intercept; this tests the combined value of the specified predictors/covariates in the model.
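The deviance comparison can be sketched in a few lines of Python. This is an illustration only: the function name likelihood_ratio_test_1df is hypothetical, and the P value shortcut is valid only for a difference of one degree of freedom. The two log likelihoods are those of the worked example on this page, entered as negative numbers (log partial likelihoods are negative).

```python
import math

def likelihood_ratio_test_1df(loglik_without, loglik_with):
    """Return (chi-square, P) for adding ONE parameter to a model.

    The chi-square statistic is the drop in deviance, i.e.
    (-2 * loglik_without) - (-2 * loglik_with).
    """
    chi_square = 2.0 * (loglik_with - loglik_without)
    # For 1 degree of freedom the upper tail area of chi-square is
    # erfc(sqrt(x / 2)), so the standard library is sufficient.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

chi_square, p_value = likelihood_ratio_test_1df(-207.554801, -203.737609)
print(f"chi-square = {chi_square:.6f}, df = 1, P = {p_value:.4f}")
```

For comparisons involving more than one degree of freedom a general chi-square tail function would be needed, which the Python standard library does not provide.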
Some statistical packages offer stepwise Cox regression that performs systematic tests for different combinations of predictors/covariates. Automatic model building procedures such as these can be misleading as they do not consider the real-world importance of each predictor; for this reason StatsDirect does not include stepwise selection.
Survival and cumulative hazard rates
The survival/survivorship function and the cumulative hazard function (as discussed under Kaplan-Meier) are calculated relative to the baseline (lowest value of covariates) at each time point. Cox regression provides a better estimate of these functions than the Kaplan-Meier method when the assumptions of the Cox model are met and the fit of the model is strong.
You are given the option to 'centre continuous covariates'. This makes survival and hazard functions relative to the mean of continuous variables, which is usually the most meaningful comparison, rather than relative to the minimum.
If you have binary/dichotomous predictors in your model you are given the option to calculate survival and cumulative hazards for each variable separately.
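The idea of functions calculated "relative to the baseline" can be illustrated with the standard proportional hazards relationship: the survival curve at a given covariate value is the baseline survival raised to the power exp(b * x), where x is measured from the baseline value. The sketch below is an assumption-laden illustration, not StatsDirect's code. The function names are hypothetical and the baseline survival values are made up for demonstration; only the coefficient comes from the worked example on this page.

```python
import math

def survival_at_covariate(baseline_survival, coefficient, x_from_baseline):
    """Scale a baseline survival curve to a covariate value.

    Under proportional hazards, S(t | x) = S0(t) ** exp(b * x), where x is
    the covariate value minus its baseline (or mean, if centred) value.
    """
    relative_hazard = math.exp(coefficient * x_from_baseline)
    return [s0 ** relative_hazard for s0 in baseline_survival]

def cumulative_hazard(survival):
    """Cumulative hazard H(t) = -ln S(t) at each time point."""
    return [-math.log(s) for s in survival]

# Hypothetical baseline survival probabilities at three time points.
baseline = [0.95, 0.80, 0.60]
shifted = survival_at_covariate(baseline, 0.96102, 1.0)

for s0, s1 in zip(baseline, shifted):
    print(f"baseline S = {s0:.3f}, one unit above baseline S = {s1:.3f}")
```

Centring a continuous covariate changes only the reference point x is measured from; the fitted coefficient and the ratio of cumulative hazards between any two covariate values stay the same.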
Data preparation
Time-to-event - e.g. the time a subject in a trial survived.
Event / censor code - this must be ≥1 (event(s) happened) or 0 (no event at the end of the study, i.e. "right censored").
Strata - e.g. centre code for a multi-centre trial. Be careful with your choice of strata; seek the advice of a Statistician.
Predictors - these are also referred to as covariates, which can be a number of variables that are thought to be related to the event under study. If a predictor is a classifier variable with more than two classes (i.e. ordinal or nominal) then you must first use the dummy variable function to convert it to a series of binary classes.
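To show what converting a classifier into binary classes involves, here is a minimal Python sketch. It does not reproduce StatsDirect's dummy variable function; the name dummy_code, the column naming and the choice of dropping one reference class (a common convention that avoids redundant columns) are all assumptions made for illustration, as are the class labels.

```python
def dummy_code(values, reference=None):
    """Convert a list of class labels into 0/1 indicator columns.

    One class is treated as the reference and gets no column of its own,
    so k classes give k - 1 binary predictors.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: first label in sort order
    return {
        f"is_{level}": [1 if value == level else 0 for value in values]
        for level in levels
        if level != reference
    }

# Hypothetical three-class predictor for four subjects.
columns = dummy_code(["I", "II", "III", "I"])
for name, column in columns.items():
    print(name, column)
```

Each coefficient fitted to such a column is then a log hazard ratio for that class compared with the reference class.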
Technical validation
StatsDirect optimises the log likelihood associated with a Cox regression model until the change in log likelihood with iterations is less than the accuracy that you specify in the dialog box that is displayed just before the calculation takes place (Lawless, 1982; Kalbfleisch and Prentice, 1980; Harris, 1991; Cox and Oakes, 1984; Le, 1997; Hosmer and Lemeshow, 1999).
The calculation options dialog box sets a value (default is 10000) for "SPLITTING RATIO"; this is the ratio in proportionality constant at a time t above which StatsDirect will split your data into more strata and calculate an extended likelihood solution, see Bryson and Johnson (1981).
Ties are handled by Breslow's approximation (Breslow, 1974).
Cox-Snell residuals are calculated as specified by Cox and Oakes (1984). Cox-Snell, Martingale and deviance residuals are calculated as specified by Collett (1994).
Baseline survival and cumulative hazard rates are calculated at each time. Maximum likelihood methods are used, which are iterative when there is more than one death/event at an observed time (Kalbfleisch and Prentice, 1973). Other software may use the less precise Breslow estimates for these functions.
Example
From Armitage and Berry (1994, p. 479).
Test workbook (Survival worksheet: Stage Group, Time, Censor).
The following data represent the survival in days since entry to the trial of patients with diffuse histiocytic lymphoma. Two different groups of patients, those with stage III and those with stage IV disease, are compared.
Stage 3: 6, 19, 32, 42, 42, 43*, 94, 126*, 169*, 207, 211*, 227*, 253, 255*, 270*, 310*, 316*, 335*, 346*
Stage 4: 4, 6, 10, 11, 11, 11, 13, 17, 20, 20, 21, 22, 24, 24, 29, 30, 30, 31, 33, 34, 35, 39, 40, 41*, 43*, 45, 46, 50, 56, 61*, 61*, 63, 68, 82, 85, 88, 89, 90, 93, 104, 110, 134, 137, 160*, 169, 171, 173, 175, 184, 201, 222, 235*, 247*, 260*, 284*, 290*, 291*, 302*, 304*, 341*, 345*
* = censored data (patient still alive or died from an unrelated cause)
To analyse these data in StatsDirect you must first prepare them in three workbook columns as shown below:
Stage group   Time   Censor
1             6      1
1             19     1
1             32     1
1             42     1
1             42     1
1             43     0
1             94     1
1             126    0
1             169    0
1             207    1
1             211    0
1             227    0
1             253    1
1             255    0
1             270    0
1             310    0
1             316    0
1             335    0
1             346    0
2             4      1
2             6      1
2             10     1
2             11     1
2             11     1
2             11     1
2             13     1
2             17     1
2             20     1
2             20     1
2             21     1
2             22     1
2             24     1
2             24     1
2             29     1
2             30     1
2             30     1
2             31     1
2             33     1
2             34     1
2             35     1
2             39     1
2             40     1
2             41     0
2             43     0
2             45     1
2             46     1
2             50     1
2             56     1
2             61     0
2             61     0
2             63     1
2             68     1
2             82     1
2             85     1
2             88     1
2             89     1
2             90     1
2             93     1
2             104    1
2             110    1
2             134    1
2             137    1
2             160    0
2             169    1
2             171    1
2             173    1
2             175    1
2             184    1
2             201    1
2             222    1
2             235    0
2             247    0
2             260    0
2             284    0
2             290    0
2             291    0
2             302    0
2             304    0
2             341    0
2             345    0
Alternatively, open the test workbook using the file open function of the file menu. Then select Cox regression from the survival analysis section of the analysis menu. Select the column marked "Time" when asked for the times, select "Censor" when asked for death/censorship, click on the cancel button when asked about strata, and select the column marked "Stage group" when asked about predictors.
For this example:
Cox (proportional hazards) regression
80 subjects with 54 events
Deviance (likelihood ratio) chi-square = 7.634383 df = 1 P = 0.0057
Stage group b1 = 0.96102 z = 2.492043 P = 0.0127
Cox regression - hazard ratios
Parameter     Hazard ratio   95% CI
Stage group   2.614362       1.227756 to 5.566976

Parameter     Coefficient    Standard Error
Stage group   0.96102        0.385636
Cox regression - model analysis
Log likelihood with no covariates = -207.554801
Log likelihood with all model covariates = -203.737609
Deviance (likelihood ratio) chi-square = 7.634383 df = 1 P = 0.0057
The significance test for the coefficient b1 tests the null hypothesis that it equals zero and thus that its exponential equals one. The confidence interval for exp(b1) is therefore the confidence interval for the relative death rate or hazard ratio; we may therefore infer with 95% confidence that the death rate from stage 4 cancers is approximately 2.6 times, and at least 1.2 times, the death rate from stage 3 cancers.
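For readers who want to see the calculation behind this output, the following Python sketch fits a one-covariate Cox model to the example data by Newton-Raphson, using the Breslow approximation for ties named under Technical validation. It is a minimal illustration under stated assumptions, not StatsDirect's implementation: the function names are hypothetical, only a single covariate is handled, and there is no stratification or data splitting.

```python
import math

# Survival times in days; a trailing * marks a censored observation.
STAGE_3 = ("6 19 32 42 42 43* 94 126* 169* 207 211* 227* 253 255* 270* "
           "310* 316* 335* 346*")
STAGE_4 = ("4 6 10 11 11 11 13 17 20 20 21 22 24 24 29 30 30 31 33 34 35 "
           "39 40 41* 43* 45 46 50 56 61* 61* 63 68 82 85 88 89 90 93 104 "
           "110 134 137 160* 169 171 173 175 184 201 222 235* 247* 260* "
           "284* 290* 291* 302* 304* 341* 345*")

def parse(group, text):
    """Return (stage group, time, censor code) rows; code 0 = censored."""
    return [(group, int(token.rstrip("*")), 0 if token.endswith("*") else 1)
            for token in text.split()]

rows = parse(1, STAGE_3) + parse(2, STAGE_4)

def breslow_terms(beta, rows):
    """Log partial likelihood, score and information at beta (Breslow ties)."""
    loglik = score = info = 0.0
    for t in sorted({time for _, time, event in rows if event == 1}):
        s0 = s1 = s2 = 0.0
        for x, time, _ in rows:
            if time >= t:  # subject is still at risk at time t
                weight = math.exp(beta * x)
                s0 += weight
                s1 += weight * x
                s2 += weight * x * x
        # Covariate values of the subjects who had the event at time t.
        deaths = [x for x, time, event in rows if time == t and event == 1]
        d = len(deaths)
        mean_x = s1 / s0
        loglik += beta * sum(deaths) - d * math.log(s0)
        score += sum(deaths) - d * mean_x
        info += d * (s2 / s0 - mean_x ** 2)
    return loglik, score, info

def fit_cox_one_covariate(rows, tolerance=1e-10, max_iterations=50):
    """Newton-Raphson fit; returns (coefficient, standard error, log likelihood)."""
    beta = 0.0
    for _ in range(max_iterations):
        _, score, info = breslow_terms(beta, rows)
        step = score / info
        beta += step
        if abs(step) < tolerance:
            break
    loglik, _, info = breslow_terms(beta, rows)
    return beta, math.sqrt(1.0 / info), loglik

null_loglik = breslow_terms(0.0, rows)[0]
b1, se, model_loglik = fit_cox_one_covariate(rows)
print(f"log likelihood with no covariates = {null_loglik:.6f}")
print(f"log likelihood with stage group   = {model_loglik:.6f}")
print(f"b1 = {b1:.5f}, standard error = {se:.6f}, z = {b1 / se:.4f}")
```

The printed values should be close to those in the output above; small differences in the last decimal places can arise from the convergence accuracy chosen.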