Table 8.5 Generalised 2 × 2 table for odds ratio calculations in a case-control study. The columns give the group by outcome (e.g. disease).

| Exposed to risk factor? | Cases | Controls |
|---|---|---|
| Yes | a | b |
| No | c | d |

The odds ratio
With a case-control study you can compare the odds that those with a disease will have been exposed to the risk factor, with the odds that those who don’t have the disease will have been exposed. If you divide the former by the latter you get the odds ratio.
On p. 102 you calculated the following odds for the stroke and exercise study (where we are treating exercise as the risk factor): the odds that those with a stroke had exercised = 55/70 = 0.7857; and the odds that those without a stroke had exercised = 130/68 = 1.9118. Dividing the former by the latter, you get the odds ratio = 0.7857/1.9118 = 0.4110. This result suggests that the odds of having exercised when young are less than half as great among those with a stroke as among the healthy controls. It would seem that exercise is a beneficial ‘risk’ factor. We can generalise the odds ratio calculation with the help of the 2 × 2 table in Table 8.5.
The odds of exposure to the risk factor among those with the disease = a/c,
The odds of exposure to the risk factor among the healthy controls = b/d.
Therefore: odds ratio = (a/c)/(b/d) = ad/bc.
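As a check on the arithmetic, the generalised calculation can be sketched in a few lines of Python. This is only an illustration: the function name `odds_ratio` is an assumption made here, not something from the text, and the cell counts are the stroke and exercise figures quoted above, arranged as in Table 8.5.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2 x 2 case-control table laid out as in Table 8.5.

    a: cases exposed,     b: controls exposed,
    c: cases not exposed, d: controls not exposed.
    (Hypothetical helper, named here for illustration only.)
    """
    odds_cases = a / c      # odds of exposure among those with the disease
    odds_controls = b / d   # odds of exposure among the healthy controls
    return odds_cases / odds_controls  # algebraically the same as (a*d)/(b*c)


# Stroke and exercise study: 55 of the cases had exercised and 70 had not;
# 130 of the controls had exercised and 68 had not.
stroke_or = odds_ratio(a=55, b=130, c=70, d=68)
print(f"{stroke_or:.4f}")  # prints 0.4110
```

Working from the raw counts rather than the rounded odds avoids carrying rounding error into the ratio, although here both routes agree to four decimal places.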
Exercise 8.8 Use the results from Exercise 8.5 to calculate the odds ratio for smoking among the mothers of Down syndrome babies compared to mothers of healthy babies, for: (a) mothers aged under 35; (b) mothers aged 35 and over. Interpret your results.
Remember that the risk ratios and odds ratios in the coronary heart disease and in the stroke examples above are sample risk and odds ratios. For instance, from the sample risk ratio of 1.928 in the CHD/weight at one year study, you can infer that the population risk ratio is also about 1.93 ± a ‘bit’. But how big is this ‘bit’, how precise is your estimate? This is a question I’ll address in Chapter 11.
Finally, I mentioned earlier that most people are happier with the concept of ‘risk’ than with ‘odds’, but that you can’t calculate risk in a case-control study. However, there is a happy ending. The odds ratio in a case-control study is a reasonably good estimator of the equivalent risk ratio, particularly when the condition is rare, so you can at least approximate the risk ratio with the corresponding odds ratio.
Number needed to treat (NNT)
This seems as good a time as any to discuss a measure of the effectiveness of a clinical procedure which is related to risk; more precisely, to absolute risk. This is the number needed to treat, or NNT. NNT is the number of patients who would need to be treated with the active procedure, rather than a placebo (or alternative procedure), in order to reduce by one the number of patients experiencing the condition.
To explain NNT let’s go back to the example for weighing 18 lbs or less at one year as a risk factor for coronary heart disease (CHD). The absolute risk of CHD among those weighing 18 lbs or less was 0.2667. The absolute risk of CHD for those weighing more than 18 lbs was 0.1382.
We need now to define the absolute risk reduction or ARR as the difference between two absolute risks. So in this example, the absolute risk reduction is the difference in these two absolute risks – the reduction in risk gained by weighing more than 18 lbs at one year rather than weighing 18 lbs or less. In this case:
ARR = 0.2667 − 0.1382 = 0.1285
Now the number needed to treat is defined as follows:
NNT = 1/ARR
Thus in this case:
NNT = 1/0.1285 = 7.78
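The two steps, ARR and then NNT, can be reproduced with a short Python sketch. The function name `number_needed_to_treat` and its argument names are assumptions chosen for this illustration; the two risks are the CHD figures given above. The final value is rounded up to a whole number of patients.

```python
import math


def number_needed_to_treat(risk_without, risk_with):
    """Return (ARR, raw NNT, NNT rounded up) from two absolute risks.

    risk_without: absolute risk in the untreated (or higher-risk) group.
    risk_with:    absolute risk in the treated (or lower-risk) group.
    (Hypothetical helper, named here for illustration only.)
    """
    arr = risk_without - risk_with  # absolute risk reduction
    if arr <= 0:
        raise ValueError("NNT is only defined here for a positive risk reduction")
    nnt = 1 / arr
    return arr, nnt, math.ceil(nnt)  # whole patients, so round up


# CHD example: risk 0.2667 at 18 lbs or less, 0.1382 at more than 18 lbs.
arr, nnt, nnt_whole = number_needed_to_treat(0.2667, 0.1382)
print(f"ARR = {arr:.4f}, NNT = {nnt:.2f}, rounded up = {nnt_whole}")
# prints ARR = 0.1285, NNT = 7.78, rounded up = 8
```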
In other words, if you had some treatment (infant-care advice for vulnerable parents, for example), which would cause infants who would otherwise have weighed 18 lbs or less at one year to weigh more than 18 lbs, then you would need to ‘treat’ eight infants (or their parents) to ensure that one of these infants did not develop coronary heart disease when an adult.² NNT is often used to give a familiar and practical meaning to outcomes from clinical trials and systematic reviews,³ where measures of risk, and risk ratios, may be difficult to translate into the potential benefit to patients.
An example from practice
Table 8.6 is from the follow-up (cohort) study into the effectiveness of carotid endarterectomy in ipsilateral stroke prevention first referred to in Figure 3.2 (Inzitari et al. 2000). The table shows that for any stroke, the (absolute) risk at five years if treated medically is 0.110 (11.0 per cent), and if treated surgically is 0.051 (5.1 per cent). The reduction in absolute risk is ARR = 0.110 − 0.051 = 0.059 (5.9 per cent). So NNT = 1/0.059 = 16.95, which rounds up to 17, at five years. In other words, 17 patients would have to be treated with carotid endarterectomy to prevent one stroke within five years in a patient who would otherwise have had one.
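The five-year NNT values in Table 8.6 can be checked from the percentage risks in the same way. The sketch below is illustrative only: the dictionary and function names are assumptions, and only the five-year risks quoted in the table are used.

```python
import math

# Five-year risks (per cent) from Table 8.6: (medically treated, surgically treated).
# The variable and function names are hypothetical, chosen for this illustration.
five_year_risk_percent = {
    "Any stroke": (11.0, 5.1),
    "Large-artery stroke": (6.6, 3.1),
}


def nnt_from_percent(medical, surgical):
    """NNT from two risks expressed in per cent, rounded up to whole patients."""
    arr_percent = medical - surgical    # absolute difference in risk, per cent
    return math.ceil(100 / arr_percent)  # same as 1/ARR with ARR as a proportion


nnt_by_cause = {
    cause: nnt_from_percent(medical, surgical)
    for cause, (medical, surgical) in five_year_risk_percent.items()
}
print(nnt_by_cause)  # prints {'Any stroke': 17, 'Large-artery stroke': 29}
```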
² The number must always be rounded up.
³ Systematic review is the systematic collection of all the results from as many similarly-designed studies as possible dealing with the same clinical problem. I discuss this procedure in Chapter 20.
Table 8.6 Example of numbers needed to treat (NNT), at five years and two years from a follow-up (cohort) study into the effectiveness of carotid endarterectomy in stroke prevention. Reproduced from NEJM, 342, 1693–9, by permission of Massachusetts Medical Society

| Cause | Medically Treated Group (%) | Surgically Treated Group (%) | Reduction in Risk (%) | Absolute Difference in Risk (%) | No. Needed to Treat* at 5 yr | No. Needed to Treat* at 2 yr |
|---|---|---|---|---|---|---|
| Any stroke† | 11.0 | 5.1 | 54 | 5.9 | 17 | 67 |
| Large-artery stroke‡ | 6.6 | 3.1 | 54 | 3.5 | 29 | 111 |

*The number needed to treat is calculated as the reciprocal of the difference in risk. At two years, the number needed to treat is based on estimated differences in risk of 1.5 percent for stroke of any cause and 0.9 percent for large-artery stroke.
†The risk of stroke from any cause in the medical and surgical groups in the Asymptomatic Carotid Atherosclerosis Study is shown.
‡The estimates of the risk of large-artery stroke were based on the observations that for subjects in the NASCET with 60 to 99 percent stenosis, the ratio of the risk of large-artery stroke to the risk of stroke from any cause in the territory of a symptomatic artery was similar in the medically and surgically treated subjects, and the risk of large-artery stroke was approximately 60 percent of the risk of stroke from any cause in the territory of an asymptomatic artery (i.e., 6.6 percent = 60 percent of 11.0 percent, and 3.1 percent = 60 percent of 5.1 percent).
Exercise 8.9 In a cohort study of a possible connection between dental disease and coronary heart disease (CHD), subjects were tracked for 14 years (deStefano et al.). Of 3542 subjects with no dental disease, 92 died from CHD, while of 1786 subjects with periodontitis, 151 died from CHD. How many people must be successfully treated for periodontitis to prevent one person dying from CHD?
V The Informed Guess – Confidence Interval Estimation
