Risk (Prospective)


Menu location: Analysis_Clinical Epidemiology_Risk (Prospective).


This function calculates relative risk, risk difference and population attributable risk with confidence intervals.


You can examine the risk of an outcome, such as disease, given the incidence of the outcome in relation to an exposure, such as a suspected risk or protection factor for a disease. The study design should be prospective. If you need information on retrospective studies see risk (retrospective).


The type of data used by this function is counts or frequencies (number of individuals with a study characteristic). If you want to analyse person-time data (e.g. months of follow up) instead of counts then please see incidence rates.


In studies of the incidence of a particular outcome in two groups of individuals, defined by the presence or absence of a particular characteristic, the odds ratio for the resultant fourfold table becomes the relative risk. Relative risk is used for prospective studies where you follow groups with different characteristics to observe whether or not a particular outcome occurs:


              Exposed   Not exposed
Outcome YES:  a         b
Outcome NO:   c         d


Outcome rate exposed (Pe) = a/(a+c)

Outcome rate not exposed (Pu) = b/(b+d)


Relative risk (RR) = Pe/Pu

Risk difference (RD) = Pe-Pu


Estimate of population exposure (Px) = (a+c)/(a+b+c+d)

Population attributable risk % = 100*(Px*(RR-1))/(1+(Px*(RR-1)))
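The formulas above can be sketched in a few lines of Python. This is a minimal illustration only, not StatsDirect's own code: the function name risk_measures and the dictionary keys are hypothetical, and the counts are those of the Framingham example given later on this page.

```python
def risk_measures(a, b, c, d):
    """Point estimates from a fourfold table.

    a, c: exposed subjects with / without the outcome
    b, d: unexposed subjects with / without the outcome
    """
    pe = a / (a + c)                   # outcome rate, exposed
    pu = b / (b + d)                   # outcome rate, not exposed
    rr = pe / pu                       # relative risk
    rd = pe - pu                       # risk difference
    px = (a + c) / (a + b + c + d)     # population exposure estimated from the study
    par_pct = 100 * (px * (rr - 1)) / (1 + px * (rr - 1))
    return {"Pe": pe, "Pu": pu, "RR": rr, "RD": rd, "Px": px, "PAR%": par_pct}


result = risk_measures(a=72, b=20, c=684, d=553)
for name, value in result.items():
    print(f"{name}: {value:.6f}")
```

If the population exposure is known from another source, it could be passed in instead of being estimated from the table; the sketch uses the study-based default only.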


In retrospective studies where you select subjects by outcome not by group characteristic then you would use the odds ratio ((a/c)/(b/d)) and not the relative risk. See risk (retrospective) for more information.
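As a rough illustration of how the two ratios differ, the sketch below computes both from the same fourfold table. The function names are hypothetical, and the counts are those of the Framingham example on this page, used here only to show the arithmetic (those data are prospective, so relative risk is the appropriate measure for them).

```python
def odds_ratio(a, b, c, d):
    # Odds of the outcome among the exposed divided by the odds among the unexposed.
    return (a / c) / (b / d)


def relative_risk(a, b, c, d):
    # Outcome rate among the exposed divided by the rate among the unexposed.
    return (a / (a + c)) / (b / (b + d))


a, b, c, d = 72, 20, 684, 553
or_value = odds_ratio(a, b, c, d)
rr_value = relative_risk(a, b, c, d)
print(f"Odds ratio    = {or_value:.6f}")
print(f"Relative risk = {rr_value:.6f}")
```

The two values are close here because the outcome is fairly uncommon in both groups; they diverge as the outcome becomes more frequent.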


In addition to the relative measure of effect (relative risk) you may wish to express the absolute effect size in your study as the risk difference. Risk difference is sometimes referred to as attributable risk and when expressed in percent terms it is also referred to as attributable proportion, attributable rate percent and preventive fraction. Attributable risk or risk difference is used to quantify risk in the exposed group that is attributable to the exposure.


Population attributable risk estimates the proportion of disease in the study population that is attributable to the exposure. In order to calculate population attributable risk, the incidence of exposure in the study population must be known or estimated. StatsDirect prompts you to enter this value or to default to an estimate made from your study data. Population attributable risk is presented as a percentage with a confidence interval when the relative risk is greater than or equal to one (Sahai and Khurshid, 1996).


Technical validation

Koopman's likelihood-based approximation recommended by Gart and Nam is used to construct confidence intervals for relative risk (Gart and Nam, 1988; Koopman, 1984). Please note that relative risk, risk ratio and likelihood ratio are all ratios of binomial probabilities, so the approach to confidence intervals is the same for each of them.
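A Koopman-type interval can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not StatsDirect's implementation: it assumes all four cells are non-zero, it locates the limits by simple bisection on the log scale, and it assumes the limits lie within a factor of 1000 of the point estimate (the hypothetical span parameter). Here x1 and n1 are the number with the outcome and the total in the exposed group (a and a+c), and x2 and n2 are the same for the unexposed group (b and b+d).

```python
from math import exp, log, sqrt
from statistics import NormalDist


def koopman_statistic(phi, x1, n1, x2, n2):
    """Chi-square statistic for H0: p1/p2 = phi, using constrained estimates."""
    n = n1 + n2
    b = phi * (n1 + x2) + x1 + n2
    # Constrained maximum-likelihood estimate of p1 (smaller root of a quadratic).
    p1 = (b - sqrt(b * b - 4 * phi * n * (x1 + x2))) / (2 * n)
    p2 = p1 / phi
    return ((x1 - n1 * p1) ** 2 / (n1 * p1 * (1 - p1))
            + (x2 - n2 * p2) ** 2 / (n2 * p2 * (1 - p2)))


def koopman_interval(x1, n1, x2, n2, level=0.95, span=1000.0):
    """Limits are the ratios at which the statistic equals the chi-square cut-off."""
    z2 = NormalDist().inv_cdf(0.5 + level / 2) ** 2
    rr = (x1 / n1) / (x2 / n2)

    def f(log_phi):
        return koopman_statistic(exp(log_phi), x1, n1, x2, n2) - z2

    def bisect(lo, hi):
        # f has opposite signs at lo and hi; keep halving the bracket.
        lo_positive = f(lo) > 0
        for _ in range(200):
            mid = (lo + hi) / 2
            if (f(mid) > 0) == lo_positive:
                lo = mid
            else:
                hi = mid
        return exp((lo + hi) / 2)

    lower = bisect(log(rr / span), log(rr))
    upper = bisect(log(rr), log(rr * span))
    return lower, upper


lower, upper = koopman_interval(x1=72, n1=756, x2=20, n2=573)
print(f"Approximate 95% CI for relative risk: {lower:.4f} to {upper:.4f}")
```

Bisection is used only because it is easy to follow; any bracketing root finder would serve.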


The confidence interval for risk difference is constructed using the robust approximation of Miettinen and Nurminen (Miettinen and Nurminen, 1985; Mee, 1984; Anbar, 1983; Gart and Nam, 1990; Newcombe, 1998b).
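A score interval of the Miettinen and Nurminen type can be sketched as below. This is a hedged illustration rather than StatsDirect's code: it assumes all four cells are non-zero, it finds the constrained maximum-likelihood estimates numerically by bisection instead of using a closed-form solution, and the function names are hypothetical. Here x1 and n1 are the number with the outcome and the total in the exposed group, and x2 and n2 are the same for the unexposed group.

```python
from statistics import NormalDist


def constrained_mle(x1, n1, x2, n2, delta):
    """Maximum-likelihood p1, p2 subject to p1 - p2 = delta."""
    eps = 1e-12
    lo = max(0.0, -delta) + eps
    hi = min(1.0, 1.0 - delta) - eps

    def score(p2):
        # Derivative of the log-likelihood with respect to p2; decreasing in p2.
        p1 = p2 + delta
        return x1 / p1 - (n1 - x1) / (1 - p1) + x2 / p2 - (n2 - x2) / (1 - p2)

    for _ in range(100):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    p2 = (lo + hi) / 2
    return p2 + delta, p2


def mn_statistic(delta, x1, n1, x2, n2):
    n = n1 + n2
    p1, p2 = constrained_mle(x1, n1, x2, n2, delta)
    # Variance under the constraint, with the N/(N-1) correction.
    var = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) * n / (n - 1)
    return (x1 / n1 - x2 / n2 - delta) ** 2 / var


def mn_interval(x1, n1, x2, n2, level=0.95):
    z2 = NormalDist().inv_cdf(0.5 + level / 2) ** 2
    rd = x1 / n1 - x2 / n2

    def bisect(lo, hi):
        # The statistic minus the cut-off has opposite signs at lo and hi.
        lo_positive = mn_statistic(lo, x1, n1, x2, n2) > z2
        for _ in range(100):
            mid = (lo + hi) / 2
            if (mn_statistic(mid, x1, n1, x2, n2) > z2) == lo_positive:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    return bisect(-0.999, rd), bisect(rd, 0.999)


rd_lower, rd_upper = mn_interval(x1=72, n1=756, x2=20, n2=573)
print(f"Approximate 95% CI for risk difference: {rd_lower:.4f} to {rd_upper:.4f}")
```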


Approximate power is calculated as the power achieved with the given sample size to detect the observed effect with a two-sided probability of type I error of (100-CI%)% based on analysis with Fisher's exact test or a continuity corrected chi-square test of independence in a fourfold contingency table (Dupont, 1990).


Walter's approximate variance formula is used to construct the confidence interval for population attributable risk (Walter, 1978; Leung and Kupper, 1981).
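The sketch below shows one large-sample way to attach an interval to population attributable risk when exposure is estimated from the study table itself. It is an assumption for illustration, not a transcription of Walter's paper or of StatsDirect's code: the standard error is a delta-method approximation worked out on the log(1 - PAR) scale and then applied as a symmetric interval around PAR. The function name par_interval is hypothetical, and a, b, c, d follow the fourfold table notation used on this page.

```python
from math import sqrt
from statistics import NormalDist


def par_interval(a, b, c, d, level=0.95):
    """Population attributable risk (%) with an approximate symmetric interval."""
    n = a + b + c + d
    rr = (a / (a + c)) / (b / (b + d))
    px = (a + c) / n                          # exposure estimated from the study
    par = px * (rr - 1) / (1 + px * (rr - 1))
    # Delta-method standard error of log(1 - PAR), assumed large-sample form.
    se_log = sqrt((c + par * (a + d)) / (n * b))
    se = (1 - par) * se_log                   # back to the PAR scale
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 100 * par, 100 * (par - z * se), 100 * (par + z * se)


par_pct, par_lower, par_upper = par_interval(a=72, b=20, c=684, d=553)
print(f"PAR% = {par_pct:.4f}, approximate 95% CI {par_lower:.4f} to {par_upper:.4f}")
```

With a population exposure supplied from an external source, the variance would need a different form; the sketch covers only the study-based default.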



Example

From Sahai and Khurshid (1996, p. 208).


The following data are a subset of the Framingham study results showing the number of cases of coronary heart disease (CHD) that became clinically apparent during six years of follow up of a cohort of 1329 men in the 40 to 59 age group. The men are divided by their level of serum cholesterol (a suspected risk factor) at the start of the study:


          Cholesterol >= 220 mg%   Cholesterol < 220 mg%
CHD:      72                       20
No CHD:   684                      553


To analyse these data in StatsDirect select Risk (Prospective) from the Clinical Epidemiology section of the Analysis menu. Choose the default 95% confidence interval. Then enter the above frequencies into the 2 by 2 table on the screen.


For this example:


Risk ratio (relative risk in incidence study) = 2.728571

Approximate (Koopman) 95% confidence interval = 1.694347 to 4.412075

Approximate power (for 5% significance) = 99.13%


Risk difference = 0.060334

Approximate (Miettinen) 95% confidence interval = 0.034379 to 0.086777


Population exposure % = 56.884876

Population attributable risk % = 49.578875

Approximate (Walter) 95% confidence interval = 30.469457 to 68.688294


Here we can say that the risk of CHD in men of this age with serum cholesterol of 220 mg% or above is about 2.7 times the risk in those with lower cholesterol levels. The confidence interval excludes one, indicating a statistically significant result, and with 97.5% confidence we can say that this relative risk is at least 1.7, provided the cohort is typical of men of this age in the wider population to which we are applying these results.


The population attributable risk estimates the proportion of disease (or other outcome) in the population that is attributable to the exposure. From these results we can say, with 95% confidence, that somewhere between about 30% and 69% of the cases of CHD in 40 to 59 year old men are associated with high cholesterol (220 mg% or above).

