# ROC Curve Analysis

This function plots a receiver operating characteristic (ROC) curve from two sets of raw data.

ROC plots were first used to define detection cut-off points for radar equipment with different operators. These plots can be used in a similar way to define cut-off points for diagnostic tests, for example the level of prostate specific antigen in a blood sample indicating a diagnosis of prostatic carcinoma. Defining cut-off levels for diagnostic tests is a difficult process which should combine ethical and practical considerations with numerical evidence. It is wise to involve a statistician in studies of new diagnostic tests (Altman, 1991).

StatsDirect requires two columns of data for each ROC plot: one with test results from cases where the condition tested for is known to be present, and another with test results from known negative cases. Sensitivity (the probability of a positive test when disease is present) is then plotted against 1-specificity (the probability of a positive test when disease is absent). See diagnostic test for more information.
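As a minimal sketch of how the plotted points arise, the following Python computes sensitivity and 1-specificity at every candidate cut-off. It assumes that higher values indicate the condition and that a result counts as positive when it is at or above the cut-off. The function name `roc_points` and the toy data are illustrative and are not part of StatsDirect.

```python
def roc_points(present, absent):
    """Return (1 - specificity, sensitivity) pairs, from the strictest cut-off down."""
    # Every distinct observed value is a candidate cut-off.
    cutoffs = sorted(set(present) | set(absent), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every value: no positive tests
    for c in cutoffs:
        sensitivity = sum(x >= c for x in present) / len(present)
        false_positive_rate = sum(x >= c for x in absent) / len(absent)
        points.append((false_positive_rate, sensitivity))
    return points


# Toy data (hypothetical): results for known positive and known negative cases.
present = [3, 5, 6, 8]
absent = [1, 2, 4, 5]
for fpr, sens in roc_points(present, absent):
    print(f"1-specificity = {fpr:.2f}, sensitivity = {sens:.2f}")
```

Joining these points in order gives the stepped curve; swapping the two arguments mirrors it below the diagonal.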

When you have a number of ROC curves to compare, the area under the curve is usually the best discriminator (Metz, 1978).

StatsDirect calculates the area under the ROC curve directly by an extended trapezoidal rule (Press et al., 1992) and by a nonparametric method analogous to the Wilcoxon/Mann-Whitney test (Hanley and McNeil, 1982). A confidence interval is constructed using DeLong’s variance estimate (DeLong et al., 1988).
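For reference, the nonparametric estimate and the DeLong variance can be written as follows. The notation is an assumption introduced here and is not StatsDirect's: $X_1, \dots, X_m$ are results from condition-present cases, $Y_1, \dots, Y_n$ are results from condition-absent cases, and higher values are taken to indicate the condition.

```latex
\[
\psi(X, Y) =
\begin{cases}
1 & \text{if } X > Y \\
\tfrac{1}{2} & \text{if } X = Y \\
0 & \text{if } X < Y
\end{cases}
\qquad
\hat{\theta} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \psi(X_i, Y_j)
\]
\[
V_{10}(X_i) = \frac{1}{n} \sum_{j=1}^{n} \psi(X_i, Y_j),
\qquad
V_{01}(Y_j) = \frac{1}{m} \sum_{i=1}^{m} \psi(X_i, Y_j)
\]
\[
S_{10} = \frac{1}{m-1} \sum_{i=1}^{m} \bigl( V_{10}(X_i) - \hat{\theta} \bigr)^2,
\qquad
S_{01} = \frac{1}{n-1} \sum_{j=1}^{n} \bigl( V_{01}(Y_j) - \hat{\theta} \bigr)^2
\]
\[
\widehat{\operatorname{Var}}(\hat{\theta}) = \frac{S_{10}}{m} + \frac{S_{01}}{n}
\]
```

With ties scored as one half, $\hat{\theta}$ equals the trapezoidal area under the empirical ROC curve, so the two area estimates agree.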

## Example

From Aziz et al. (1996).

Test workbook (SDI (conceived), SDI (not conceived)).

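The following are Sperm Deformity Index (SDI) values from semen samples of men in an infertility study. They are divided into two groups according to whether or not the man's partner achieved pregnancy. In the results shown for this example, the not-conceived group is the "condition" present group and the conceived group is the "condition" absent group; this is the assignment implied by the reported sensitivity (30 of 42) and specificity (111 of 116).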

SDI (conceived)

165, 140, 154, 139, 134, 154, 120, 133, 150, 146, 140, 114, 128, 131, 116, 128, 122, 129, 145, 117, 140, 149, 116, 147, 125, 149, 129, 157, 144, 123, 107, 129, 152, 164, 134, 120, 148, 151, 149, 138, 159, 169, 137, 151, 141, 145, 135, 135, 153, 125, 159, 148, 142, 130, 111, 140, 136, 142, 139, 137, 187, 154, 151, 149, 148, 157, 159, 143, 124, 141, 114, 136, 110, 129, 145, 132, 125, 149, 146, 138, 151, 147, 154, 147, 158, 156, 156, 128, 151, 138, 193, 131, 127, 129, 120, 159, 147, 159, 156, 143, 149, 160, 126, 136, 150, 136, 151, 140, 145, 140, 134, 140, 138, 144, 140, 140

SDI (not conceived)

159, 136, 149, 156, 191, 169, 194, 182, 163, 152, 145, 176, 122, 141, 172, 162, 165, 184, 239, 178, 178, 164, 185, 154, 164, 140, 207, 214, 165, 183, 218, 142, 161, 168, 181, 162, 166, 150, 205, 163, 166, 176

To analyse these data using StatsDirect you must first enter them into two columns in a workbook. Enter the number of plots as 1. Then select ROC from the graphics menu and select the appropriate columns for condition present and absent from the workbook. Leave the weighting option as 1 and leave the cut-off calculator checked. You are then presented with the cut-off calculator; try pressing the up and down arrow keys to display diagnostic test statistics for different cut-offs. Then press "Reset" and "Ok". The ROC plot is then drawn with the optimised cut-off point marked. The plot should look like a stepped curve that is convex towards the top left-hand corner; if it is upside down, you have probably selected "condition present" and "condition absent" the wrong way around.

For this example:

The optimised cut-off, with sensitivity and specificity treated as equally important, was calculated as 160 for these data. A cut-off of 161 was obtained when sensitivity was weighted as twice as important as specificity. After a similar analysis of a larger study, an SDI of > 160 was chosen as the level for selecting patients for a type of infertility treatment.
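The criterion that StatsDirect optimises is not described here. As an assumption for illustration only, the sketch below scans every observed SDI value as a candidate cut-off and keeps the one that maximises Youden's index (sensitivity + specificity - 1), which weights the two equally. It treats the not-conceived group as "condition present" and a result as positive when SDI is above the cut-off, which is what the sensitivity and specificity in the output for this example imply. The function name `youden_cutoff` is hypothetical.

```python
# SDI values copied from the example data.
conceived = [
    165, 140, 154, 139, 134, 154, 120, 133, 150, 146, 140, 114, 128, 131, 116,
    128, 122, 129, 145, 117, 140, 149, 116, 147, 125, 149, 129, 157, 144, 123,
    107, 129, 152, 164, 134, 120, 148, 151, 149, 138, 159, 169, 137, 151, 141,
    145, 135, 135, 153, 125, 159, 148, 142, 130, 111, 140, 136, 142, 139, 137,
    187, 154, 151, 149, 148, 157, 159, 143, 124, 141, 114, 136, 110, 129, 145,
    132, 125, 149, 146, 138, 151, 147, 154, 147, 158, 156, 156, 128, 151, 138,
    193, 131, 127, 129, 120, 159, 147, 159, 156, 143, 149, 160, 126, 136, 150,
    136, 151, 140, 145, 140, 134, 140, 138, 144, 140, 140,
]
not_conceived = [
    159, 136, 149, 156, 191, 169, 194, 182, 163, 152, 145, 176, 122, 141, 172,
    162, 165, 184, 239, 178, 178, 164, 185, 154, 164, 140, 207, 214, 165, 183,
    218, 142, 161, 168, 181, 162, 166, 150, 205, 163, 166, 176,
]


def youden_cutoff(present, absent):
    """Return (J, cut-off, sensitivity, specificity) maximising Youden's J.

    A result is counted as positive when it is strictly above the cut-off.
    """
    best = None
    for c in sorted(set(present) | set(absent)):
        sens = sum(x > c for x in present) / len(present)
        spec = sum(x <= c for x in absent) / len(absent)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best


# Assumption: "not conceived" is the condition-present group.
j, cutoff, sens, spec = youden_cutoff(not_conceived, conceived)
print(f"cut-off: SDI > {cutoff}, sensitivity = {sens:.6f}, specificity = {spec:.6f}")
```

On these data the scan selects SDI > 160, with sensitivity 30/42 and specificity 111/116. It does not attempt to reproduce the weighted cut-off of 161, because the weighting scheme is not specified.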

ROC Analysis

Data set: SDI(+ve), SDI(-ve)

Area under ROC curve by extended trapezoidal rule = 0.875411

Wilcoxon estimate of area under ROC curve = 0.875411

DeLong standard error = 0.034862: 95% CI = 0.807082 to 0.943739

Optimum cut-off point selected = 160.064

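Table at cut-off:

| | Condition present | Condition absent |
| --- | --- | --- |
| Test positive | a = 30 | b = 5 |
| Test negative | c = 12 | d = 111 |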

sensitivity (95% CI) = 0.714286 (0.554161 to 0.842809)

specificity (95% CI) = 0.956897 (0.902275 to 0.985858)
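The area, standard error and table in this output can be checked independently. The sketch below uses the Mann-Whitney form of the area and the DeLong variance (structural components with sample variances), and then tabulates the data at the reported cut-off of 160.064. It assumes the not-conceived group is "condition present" and that a result is positive when SDI is above the cut-off; the helper names `psi` and `auc_delong` are hypothetical and this is not StatsDirect's own code.

```python
from statistics import NormalDist, mean, variance

# SDI values copied from the example data.
conceived = [
    165, 140, 154, 139, 134, 154, 120, 133, 150, 146, 140, 114, 128, 131, 116,
    128, 122, 129, 145, 117, 140, 149, 116, 147, 125, 149, 129, 157, 144, 123,
    107, 129, 152, 164, 134, 120, 148, 151, 149, 138, 159, 169, 137, 151, 141,
    145, 135, 135, 153, 125, 159, 148, 142, 130, 111, 140, 136, 142, 139, 137,
    187, 154, 151, 149, 148, 157, 159, 143, 124, 141, 114, 136, 110, 129, 145,
    132, 125, 149, 146, 138, 151, 147, 154, 147, 158, 156, 156, 128, 151, 138,
    193, 131, 127, 129, 120, 159, 147, 159, 156, 143, 149, 160, 126, 136, 150,
    136, 151, 140, 145, 140, 134, 140, 138, 144, 140, 140,
]
not_conceived = [
    159, 136, 149, 156, 191, 169, 194, 182, 163, 152, 145, 176, 122, 141, 172,
    162, 165, 184, 239, 178, 178, 164, 185, 154, 164, 140, 207, 214, 165, 183,
    218, 142, 161, 168, 181, 162, 166, 150, 205, 163, 166, 176,
]


def psi(x, y):
    """Score one (present, absent) pair: 1 if x > y, 0.5 if tied, else 0."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0


def auc_delong(present, absent):
    """Return the Mann-Whitney area and its DeLong standard error."""
    # Structural components: one per condition-present and condition-absent case.
    v10 = [mean(psi(x, y) for y in absent) for x in present]
    v01 = [mean(psi(x, y) for x in present) for y in absent]
    auc = mean(v10)
    # statistics.variance is the sample variance (divisor n - 1).
    var = variance(v10) / len(present) + variance(v01) / len(absent)
    return auc, var ** 0.5


# Assumption: "not conceived" is the condition-present group.
auc, se = auc_delong(not_conceived, conceived)
z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile
lower, upper = auc - z * se, auc + z * se

cutoff = 160.064  # optimum cut-off reported in the output
a = sum(x > cutoff for x in not_conceived)  # true positives
b = sum(x > cutoff for x in conceived)      # false positives
c = len(not_conceived) - a                  # false negatives
d = len(conceived) - b                      # true negatives

print(f"AUC = {auc:.6f}, SE = {se:.6f}, 95% CI = {lower:.6f} to {upper:.6f}")
print(f"a = {a}, b = {b}, c = {c}, d = {d}")
print(f"sensitivity = {a / (a + c):.6f}, specificity = {d / (b + d):.6f}")
```

On these data the sketch gives an area of 0.875411 with standard error 0.034862, and a = 30, b = 5, c = 12, d = 111, in agreement with the output. The confidence intervals for sensitivity and specificity are not reproduced here.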