
ture to convince researchers that inadequacies in the pathological, clinical or co-morbid components of the spectrum of diseased and disease-free subjects used to validate a newly-developed diagnostic tool should prompt clinicians to be cautious about apparently promising tools that have not been adequately scrutinized. One such example is a carcinoembryonic antigen (CEA) test for colon cancer. Initial studies reported high sensitivity and specificity – each in excess of 0.90 – but this was apparently because they had been estimated in patients with extensive disease. For patients with localized disease, the sensitivity of the CEA test was eventually estimated to be as low as 0.37.

Unfortunately, the problems identified by Ransohoff and Feinstein nearly 30 years ago continue to be repeated, prompting Reid et al. [76] to conclude in 1995 that ‘most diagnostic tests are still inadequately appraised’. In a review of 112 reports published in the general medical literature between 1978 and 1993, fewer than one study in three was deemed to have provided even a rudimentary description, e.g., age and sex distribution, range of clinical symptoms and/or disease stage, of the patient spectrum used to investigate the potential diagnostic tool.

However, when a physician has a patient’s test outcome report in her hand and needs to reach a decision about a particular diagnosis, even knowing that the test she ordered is both highly sensitive and highly specific in the relevant subgroup of patients does not directly address the immediate problem. That is because, unless the sensitivity and specificity are simultaneously 1 – and hence figure 22.2a is apropos – regardless of what is written on the lab report, the outcome could be erroneous.

Instead of referring to the sensitivity and specificity, what our physician is really interested in knowing is the extent to which a positive (or negative) test outcome accurately predicts the true status of her patient, i.e., diseased (or disease-free). This value or rate is commonly referred to as the post-test probability of disease, or the predictive value of a positive test outcome; if the test outcome is negative, then it is the post-test probability of being disease-free, or the predictive value of a negative test outcome. And these values depend not only on the sensitivity and specificity, but also on a third probability, which is known as the pretest probability or prevalence of the disease or condition.

Although there is a mathematical result, known as Bayes’ theorem, that connects sensitivity, specificity and prevalence to the post-test probability of a positive test outcome, we believe it is easier to grasp the sense of this relationship directly. Consider the following example. If a female patient has recently indicated a desire to become pregnant, then a positive pregnancy test result is fairly likely to be a true-positive result. In effect, the pretest probability of pregnancy is high because of the patient’s prior indication. Thus, when the lab report from her pregnancy test comes back positive, both the patient and her physician have little reason to doubt the test result, i.e., the physician believes the test result is a true positive. On the other hand, if the patient and her doctor have recently discussed the choice and use of effective contraceptive practices because she has indicated an aversion to becoming pregnant at the present time, then a positive lab report from a pregnancy test will raise questions in the doctor’s mind about the reliability of the test, and may prompt her to request a confirmatory pregnancy test. In this case, the pretest probability of pregnancy is low, because of the recent discussion between the patient and her doctor, and hence the physician has good reason to question the reliability of the test result, i.e., the physician suspects that the test result is a false positive.

Fig. 22.3. The relationship between pre- and post-test probability that a particular test result is correct; the plots assume a sensitivity of 0.98 and a specificity of 0.96. [Figure: post-test probability (0–1.0) plotted against pretest probability (0–0.5), with separate curves for positive and negative test outcomes.]
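Bayes’ theorem itself is easy to write down directly. The following Python sketch (function names are ours, and the pregnancy-test sensitivity, specificity and pretest probabilities in the example calls are purely illustrative assumptions, not values from the text) shows how the same positive result is convincing at a high pretest probability and doubtful at a low one:

```python
def post_test_positive(sensitivity, specificity, prevalence):
    """P(diseased | positive test): true positives over all positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def post_test_negative(sensitivity, specificity, prevalence):
    """P(disease-free | negative test): true negatives over all negatives."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Hypothetical pregnancy-test characteristics, for illustration only:
high = post_test_positive(0.99, 0.95, 0.80)  # patient hoping to conceive
low = post_test_positive(0.99, 0.95, 0.02)   # contraception recently discussed
```

With these assumed characteristics, the identical lab report is almost certainly a true positive for the first patient but is more likely than not a false positive for the second, which is exactly the intuition of the pregnancy example.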

In effect, when the pretest probability or prevalence of the disease or condition is substantial, a positive test outcome is probably a correct result, and the post-test probability that a positive test outcome correctly indicates the patient is diseased will be high. The complementary post-test probability that a patient whose test outcome was positive represents a false-positive test outcome from a disease-free individual will necessarily be low, since the only other explanation for a positive test outcome has a high post-test probability, and these two post-test probabilities associated with a positive test outcome must add to one.

Fig. 22.4. The relationship between pre- and post-test probability that a particular test result is correct; the plots assume a sensitivity of 0.98 and a specificity of 0.96. [Figure: post-test probability (0–1.0) plotted against pretest probability (0.5–1.0), with separate curves for positive and negative test outcomes.]

Likewise, when the pretest probability or prevalence is low, a negative test outcome is most likely a correct result, and the corresponding post-test probability that a negative test outcome correctly indicates the patient is disease-free will also be high. However, the complementary post-test probability that a negative test outcome constitutes a false-negative test result from a patient who is diseased will be low, since the only other explanation for a negative test outcome has a high post-test probability, and these two post-test probabilities associated with a negative test result necessarily add to one.

This intuitive perspective is illustrated in figures 22.3 and 22.4, for a diagnostic test with sensitivity and specificity in the relevant patient population that are estimated to be 0.98 and 0.96, respectively. The graph in figure 22.3 shows how the post-test probability that a positive test outcome correctly indicates the presence of disease increases from a very low value, when the pretest probability is virtually negligible, to approximately 0.96 when the pretest probability is 0.5. In the meantime, the post-test probability that a negative test outcome correctly indicates disease-free status is almost a constant value, and very high, i.e., 0.98. However, as the pretest probability of disease increases from 0.5 to 1.0 – see figure 22.4 – the post-test probability that a negative test outcome correctly indicates disease-free status decreases dramatically from 0.98 to approximately 0.05, while the corresponding post-test probability that a positive test result correctly identifies that the patient is diseased is virtually unchanged, and always in excess of 0.96.

Fig. 22.5. The relationship between pre- and post-test probability that a particular test result is correct; the plots assume a sensitivity of 0.48 and a specificity of 0.42. [Figure: post-test probability (0–1.0) plotted against pretest probability (0–1.0), with separate curves for positive and negative test outcomes.]
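The values quoted from figures 22.3 and 22.4 can be verified directly; a short Python check (the helper names are ours) reproduces the endpoints of the curves for the stated sensitivity of 0.98 and specificity of 0.96:

```python
def ppv(sens, spec, p):
    """Post-test probability that a positive outcome correctly indicates disease."""
    return sens * p / (sens * p + (1 - spec) * (1 - p))

def npv(sens, spec, p):
    """Post-test probability that a negative outcome correctly indicates no disease."""
    return spec * (1 - p) / (spec * (1 - p) + (1 - sens) * p)

SENS, SPEC = 0.98, 0.96
# figure 22.3: at a pretest probability of 0.5 the positive-outcome curve
# reaches about 0.96, while the negative-outcome curve is still about 0.98
at_half_pos = ppv(SENS, SPEC, 0.5)
at_half_neg = npv(SENS, SPEC, 0.5)
# figure 22.4: as the pretest probability approaches 1, the negative-outcome
# curve collapses (about 0.05 near a pretest probability of 0.999)
near_one_neg = npv(SENS, SPEC, 0.999)
```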

On the other hand, if the test sensitivity and specificity are low, so that their sum is less than one, the value of a diagnostic test, as encapsulated in the relationship between the pre- and post-test probabilities, is less evident. Figure 22.5 illustrates this for the case of a diagnostic test with an estimated sensitivity of 0.48 and a specificity of 0.42 in the relevant patient population. In this case, one can show mathematically (although we won’t attempt to do so here) that because the test characteristics are sufficiently unsatisfactory, knowing the test outcome – either positive or negative – actually muddies the diagnostic waters. In each case, the post-test probability that the test outcome correctly identifies the patient’s status is lower than the corresponding pretest probability.
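This claim is easy to confirm numerically for the figure 22.5 test. In the Python sketch below, the prevalence of 0.30 is an arbitrary illustration chosen by us; with sensitivity 0.48 and specificity 0.42 (so that their sum is 0.90 < 1), both post-test probabilities fall below their pretest counterparts:

```python
sens, spec, p = 0.48, 0.42, 0.30   # prevalence 0.30 is our illustrative choice

# Bayes' theorem for each outcome of the figure 22.5 test
post_pos = sens * p / (sens * p + (1 - spec) * (1 - p))        # P(diseased | +)
post_neg = spec * (1 - p) / (spec * (1 - p) + (1 - sens) * p)  # P(disease-free | -)

# knowing the outcome has made the physician *less* certain in both directions
worse_after_positive = post_pos < p        # 0.262 < 0.30
worse_after_negative = post_neg < 1 - p    # 0.653 < 0.70
```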

22.4. Likelihood Ratios and Related Issues

Although the medical literature concerning the use and interpretation of diagnostic tests often refers to sensitivity and specificity, in recent years the term likelihood ratio of a positive test result has become more common. It appears that this terminology was first introduced by Lusted [77], and was subsequently popularized in the 1990s by Sackett et al. [78]. This use of the term likelihood ratio involves a different purpose than that for which it has been used elsewhere in this book, i.e., for the testing of hypotheses. What Lusted called a likelihood ratio corresponds to the relative probability of a positive diagnostic test in a diseased individual compared with a non-diseased individual. Because this terminology is now in common use, it seems advisable to explain more fully what the likelihood ratio of a positive test result represents, and what role it plays in using and understanding diagnostic tests.

In most practical clinical settings, physicians would prefer to order a particular diagnostic test only if the result enables them to rule in or rule out a certain disease. Ruling in the disease would follow if the probability of a true-positive outcome in a diseased individual is considerably more likely than a false-positive error in a disease-free patient. Of course, these are the only two ways in which a positive test outcome could arise. The former probability is the sensitivity of the test, and the latter is the probability of a false-positive outcome, or 1 minus the specificity of the test. It is the ratio of these two probabilities that corresponds to the likelihood ratio of a positive test result. If the goal of the test is to rule in disease, this likelihood ratio should be at least one, and preferably much larger than one. Pictorially, it represents the ratio of the two areas shown in figure 22.2b and c that lie on the positive (right) side of the test outcome threshold.
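In code, the likelihood ratio of a positive test result is just the ratio described above. A minimal Python sketch (the function name is ours), using the sensitivity and specificity from figures 22.3 and 22.4 as an example:

```python
def lr_positive(sensitivity, specificity):
    """P(+ | diseased) / P(+ | disease-free) = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)

# for the test of figures 22.3 and 22.4 (sensitivity 0.98, specificity 0.96):
lr_plus = lr_positive(0.98, 0.96)   # 0.98 / 0.04 = 24.5, well above one
```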

Likewise, a test that enables a physician to rule out a particular disease in his patient would be one such that the probability of a true-negative outcome in a disease-free individual is substantially larger than the probability of a false-negative error in someone who is diseased. In effect, the specificity of the test is greater, and ideally much greater, than the probability of a false-negative result. The ratio of these two probabilities that are associated with a negative test outcome is often called the likelihood ratio of a negative test result. It corresponds to the ratio of the two areas lying on the negative (left) side of the test outcome threshold shown in figure 22.2b and c. Ideally, this value should also be substantially larger than one. For reasons of pedagogy, or perhaps consistency of usage, the accepted definition of the likelihood ratio of a negative test result in the medical literature is the reciprocal of the ratio described above, i.e., this likelihood ratio is the probability of a false-negative error divided by the specificity. Consequently, the values of likelihood ratios for negative test outcomes thus defined would typically be less than one, and ideally much smaller than one.

Fig. 22.6. The relationship between pre- and post-test probability that a positive (negative) test result correctly predicts disease (no disease), as a function of the likelihood ratio of a positive (negative) test outcome. [Figure: pretest probability of disease (0.001–1.000, logarithmic) on the horizontal axis, post-test probability of disease on the left-hand vertical scale and post-test probability of no disease on the reversed right-hand scale, with dashed curves for likelihood ratios from 0.001 to 250 on either side of the solid diagonal line representing a likelihood ratio of 1.]
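The two conventions for the negative-test ratio differ only by a reciprocal, which a small Python sketch (function names are ours) makes explicit:

```python
def lr_negative(sensitivity, specificity):
    """Medical-literature convention: P(- | diseased) / P(- | disease-free)
    = (1 - sensitivity) / specificity; ideally much smaller than one."""
    return (1 - sensitivity) / specificity

def rule_out_ratio(sensitivity, specificity):
    """The ratio described first in the text: specificity divided by the
    false-negative rate, i.e. the reciprocal of lr_negative."""
    return specificity / (1 - sensitivity)

# for the test of figures 22.3 and 22.4 (sensitivity 0.98, specificity 0.96):
lr_minus = lr_negative(0.98, 0.96)   # 0.02 / 0.96, about 0.021 -- much below one
```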

It so happens that if we know the prevalence, or pretest probability, of disease as well as these two so-called likelihood ratios, we can easily calculate the corresponding post-test probability that a positive (negative) test result is correct. Rather than introduce the two specific formulae, we have chosen to present the relationship visually, through the graphs displayed in figure 22.6. The solid diagonal line and the curves plotted with short dashes indicate the explicit conversion of pretest probability to the corresponding post-test probability that a patient whose test outcome is positive is diseased, i.e., their test result correctly indicates their status. The curves plotted with short dashes lying above and to the left of the solid line represent seven different values of the likelihood ratio of a positive test result between 2 and 250, inclusive. Seven additional curves lying below and to the right of the solid line correspond to seven different values of the likelihood ratio of a positive test result between 0.001 and 0.5. The vertical and horizontal dashed lines at a pretest probability of 0.015 and a likelihood ratio of 12 illustrate how to connect an approximate post-test probability of roughly 0.20 with that particular combination of prevalence, i.e., 0.015, and diagnostic test characteristics, i.e., a positive test result likelihood ratio of 12. For example, a test having a sensitivity of 0.60 and a false-positive error rate of 0.05 would have a positive test outcome likelihood ratio of 0.60/0.05 = 12; so also would a test having a sensitivity of 0.96 and a false-positive error rate of 0.08.
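The conversion that figure 22.6 performs graphically can be reproduced with the odds form of Bayes’ theorem: post-test odds = pretest odds × likelihood ratio. A Python sketch (function name ours); note that the exact calculation for a pretest probability of 0.015 and a likelihood ratio of 12 gives about 0.15, in the same region as the approximate value read off the graph:

```python
def post_test_prob(pretest_prob, likelihood_ratio):
    """Post-test odds = pretest odds x LR; then convert odds back to probability."""
    pre_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# the two example tests share the same positive-outcome likelihood ratio of 12
lr12 = 0.60 / 0.05                          # equals 0.96 / 0.08

# the worked example from the graph: pretest 0.015, LR+ of 12
worked_example = post_test_prob(0.015, 12)  # exact value about 0.15
```

A likelihood ratio of 1 leaves the probability unchanged, which is why the solid diagonal line in the figure links equal pretest and post-test values.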

Obviously, since the horizontal and (left-hand) vertical scales on the graph are identical, any test with a likelihood ratio of one has a post-test probability that is equal to the pretest probability. For such a diagnostic test, false-positive outcomes are as common as true-positive ones, and the solid line on the graph, which represents a likelihood ratio of 1, links equal values of pretest and post-test probabilities. While such a small likelihood ratio might hardly seem sensible, if the public health consequences of a false-negative test result were sufficiently catastrophic, and provided false-positive outcomes could be suitably identified by some sort of repeat test for the disease or condition, using such a test is not as foolish as it might first seem. In fact, the Guthrie test for phenylketonuria that was used until quite recently to screen all newborn babies in most of the developed world had both a high sensitivity – ≥0.99 – and a large false-positive error rate, i.e., a small likelihood ratio. Clearly, the public health authorities believed that the stress, for parents, of a repeat blood test to rule out the false-positive results from an initial Guthrie test was vastly outweighed by the benefit of identifying virtually all newborn infants with this treatable condition who would otherwise develop a severe mental handicap.

It is very uncommon for a test to have a likelihood ratio for a positive test outcome that is less than one. However, in this case the likelihood ratio curves plotted in figure 22.6 below and to the right of the solid line allow evaluation of the corresponding post-test probability, which will, in fact, always be smaller than the corresponding pretest value. Moreover, these particular likelihood ratio curves also correspond to preferred values of the likelihood ratio of a negative test outcome and can therefore be used to evaluate approximate post-test probabilities of no disease in a patient whose test outcome is negative.


These values are generated from the pretest probability of disease, which is located on the horizontal axis of the graph, and the likelihood ratio of a negative test outcome; the post-test probability of no disease should be read off the right-hand vertical scale, which is the reverse of its left-hand counterpart. As we remarked above, diagnostic tests that have been selected to rule out disease would typically have likelihood ratios for negative test results that are substantially less than one. By using the dashed curves found below and to the right of the solid line, readers can obtain the approximate value of the probability of no disease in a patient whose test outcome is negative. For example, a test with a specificity of 0.8 and a false-negative error rate of 0.036 has a likelihood ratio for a negative test outcome of 0.036/0.8 = 0.045. If the pretest probability of disease is 0.17, then the pair of vertical and horizontal dashed lines in figure 22.6 located at 0.17 and 0.991, respectively, identify that the post-test probability of correctly ruling out disease in a patient whose test outcome is negative is 0.991, roughly 20% greater than the corresponding pretest value of 1 – 0.17 = 0.83.
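The negative-test example can be verified the same way, using the odds form of Bayes’ theorem with the rule-out ratio (specificity divided by the false-negative rate); variable names below are ours:

```python
spec, fn_rate, prev = 0.8, 0.036, 0.17

lr_neg = fn_rate / spec                  # 0.036 / 0.8 = 0.045, as quoted
pre_odds_no_disease = (1 - prev) / prev  # pretest odds of being disease-free

# multiply by the rule-out ratio (the reciprocal of lr_neg) and convert back
post_odds = pre_odds_no_disease * (spec / fn_rate)
post_prob_no_disease = post_odds / (1 + post_odds)  # about 0.991, as in fig. 22.6
```

The result, 0.991, is indeed roughly 20% greater than the pretest probability of no disease, 1 − 0.17 = 0.83.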

Fagan [79] published a nomogram, which is a two-dimensional graphical device, for calculating post-test probabilities from known values for the pretest probability and the relevant likelihood ratio for the diagnostic test. An adapted version of Fagan’s nomogram appears in Sackett et al. [78, p. 124], and numerous versions of both devices, both static and dynamic, can easily be located on the internet.

One of the benefits arising from an acquaintance with the notion of diagnostic test likelihood ratios is that we can explore the potential advantage of using the original test measurement that gave rise to a binary reported outcome such as reactive/non-reactive. For example, the serum ferritin concentration, which is a diagnostic test for iron deficiency anaemia, has been extensively investigated as part of a systematic review of tests to diagnose that condition. From detailed information on 2,669 patients, 809 (30%) of whom were iron-deficient, Guyatt et al. [80] estimated the likelihood ratios summarized in table 22.2.

Table 22.2. The relationship between serum ferritin concentration, the likelihood ratio of a positive test outcome, and the post-test probability of disease (when the prevalence is 0.30)

Serum ferritin concentration, µg/l   <15    15–25   25–35   35–45   45–100   >100
Likelihood ratio                     51.9   8.8     2.5     1.8     0.54     0.08
Post-test probability                0.96   0.79    0.52    0.44    0.19     0.03

If we assume the prevalence, or appropriate pretest probability, of iron deficiency anaemia is 0.30, we can immediately calculate the post-test probability of disease for each range of serum ferritin concentration test results listed in the table, using either the graph displayed in figure 22.6 or the actual formula that links pretest probability, the six range-specific likelihood ratios, and the corresponding post-test probabilities. The end result of these fairly simple calculations is the final row of entries in table 22.2, i.e., six post-test probabilities associated with the six different intervals that span the entire range of serum ferritin concentration measurements. And on the basis of a serum ferritin concentration measurement for a particular patient, a physician might be better able either to rule in, or possibly rule out, iron deficiency anaemia as the appropriate diagnosis consistent with other signs and presenting symptoms observed in that patient. Or if the diagnosis was still equivocal, perhaps additional, more expensive tests might then be used to zero in on a correct diagnosis of the patient’s ailment. Although we will not attempt to explain it here, a physician who is armed with the right information could even identify the post-test probability associated with one diagnostic test result as the pretest probability for the next stage in a series of sequential steps towards a conclusive diagnosis.
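The final row of table 22.2 can be regenerated from the six likelihood ratios and the assumed prevalence of 0.30, and the same function illustrates the sequential idea in the last sentence, feeding one test’s post-test probability in as the next stage’s pretest probability (the particular two-stage sequence shown is our hypothetical illustration, and assumes the stages are independent):

```python
def post_test_prob(pretest, lr):
    """Odds form of Bayes' theorem: post-test odds = pretest odds x LR."""
    odds = pretest / (1 - pretest) * lr
    return odds / (1 + odds)

PREVALENCE = 0.30
ferritin_lrs = [51.9, 8.8, 2.5, 1.8, 0.54, 0.08]  # from table 22.2

# reproduces the table's final row: [0.96, 0.79, 0.52, 0.44, 0.19, 0.03]
post_probs = [round(post_test_prob(PREVALENCE, lr), 2) for lr in ferritin_lrs]

# sequential testing (hypothetical two-stage example): the post-test
# probability after the first result becomes the pretest probability
# for the second, independent test
after_first = post_test_prob(PREVALENCE, 8.8)
after_second = post_test_prob(after_first, 2.5)
```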

The previous discussion of post-test probabilities, and their dependence on the relevant likelihood ratio associated with a positive (or negative) test outcome, may have prompted readers to realize that the problem of identifying an optimal threshold to differentiate between positive and negative test outcomes is not a purely statistical question. Rather, the definition of what is optimal depends on how the test result will be used. In the case of the simple blood test for phenylketonuria, galactosaemia, congenital hypothyroidism, cystic fibrosis and several other conditions that neonates world-wide undergo shortly after birth, the test outcome threshold is deliberately situated to ensure that virtually all infants affected by one or more of these conditions are identified. Although this choice necessarily involves a substantial false-positive rate, additional follow-up ensures that only the affected infants receive appropriate support and treatment for their disease which, thankfully, is quite rare; the prevalence rate for any of these conditions is roughly one infant in 800 births.

A similar situation holds with respect to the protocol used to screen voluntary blood donations for various transfusion-transmitted infectious agents, such as HIV-1 and -2, hepatitis B and C, and syphilis. Since each donated unit is typically tested once, and if that test result is non-reactive then the unit is processed and added to the whole blood inventory, maintaining the safety of the blood supply dictates that the false-negative error rate must be negligible.


The resulting test outcome threshold necessarily involves a substantial false-positive rate for the initial screening of blood donations, and collected units that are identified as reactive are routinely discarded, although follow-up tests (usually two) may be carried out to discriminate between true- and false-positive donors if the blood collection agency has a secondary, diagnostic role in its operational mandate.

As physicians become persuaded of the merits of post-test probabilities, and acquire familiarity with the concept and use of the likelihood ratio of a positive or negative test outcome, clinical investigators are beginning to design research studies that enable them to estimate these key characteristics for both familiar and new diagnostic tools. In doing so, they help steer current and future medical practice towards the goal that was first articulated by Dr. Francis W. Peabody [81] more than 80 years ago.

‘Good medicine does not consist in the indiscriminate application of laboratory examinations to a patient, but rather in having so clear a comprehension of the probabilities of a case as to know what tests may be of value … it should be the duty of every hospital to see that no house officer receives his diploma unless he has demonstrated ... a knowledge of how to use the results in the study of his patient.’

