Residency / Ophthalmology / English-language materials / Glaucoma An Open Window to Neurodegeneration and Neuroprotection_Nucci, Cerulli, Osborne_2008.pdf

Fig. 1. The prevalence of glaucoma at various levels of intraocular pressure (adapted with permission from Sommer et al., 1991b).

thinning of the RNFL are associated with lower levels of probability and additional tests are helpful to raise or lower the probability.

Quantitative tests and the diagnostic process

Various quantitative tests are available to aid glaucoma diagnosis. These include standard automated perimetry (SAP) and “selective” tests of visual function, such as frequency doubling technology (FDT) perimetry and short-wavelength automated perimetry (SWAP), and imaging, such as confocal scanning laser ophthalmoscopy, scanning laser polarimetry, and optical coherence tomography.

There is a temptation for a busy clinician to read the output from a test (for instance, the Glaucoma Hemifield Test in Humphrey perimetry or the Moorfields Regression Analysis [MRA] in the Heidelberg retina tomograph [HRT]) and take it to be the “diagnosis.” However, clinicians need to remember that “devices cannot diagnose our patients’ conditions, but the findings they provide frequently alter the probability that a subject has a particular condition” (Garway-Heath and Friedman, 2006). Quantitative test results can be formally combined, using Bayesian statistics, to derive a probability for a disease being present. There are several steps of reasoning that the clinician should go through before and after ordering a diagnostic test: deciding what the probability of glaucoma is before the test (and whether the test being ordered will usefully alter that probability), deciding whether the test result is valid, and then deciding how the test result has altered the probability that glaucoma is present.

Pretest probability

The probability of glaucoma before application of the diagnostic test can be estimated in a semiquantitative manner by combining information from the history and clinical examination. An example of a quantitative estimation of glaucoma probability on the basis of IOP measurement can be derived from data reported from the Egna–Neumarkt Glaucoma Study (Bonomi et al., 2001). With a criterion of IOP >21 mmHg, 2.1% had ocular hypertension, 1.4% had hypertensive primary open-angle glaucoma, and 0.6% had normal tension glaucoma. Therefore, 3.5% of the population had an IOP >21 mmHg. The probability of glaucoma in those with an IOP >21 mmHg is 1.4/3.5 = 40%. The probability of glaucoma with an IOP <22 mmHg is 0.6/(1 − 0.035) = 0.62%.
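The arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the prevalence figures quoted in the text; the constant and function names are illustrative and do not come from the source.

```python
# Prevalence figures quoted in the text (Bonomi et al., 2001), expressed as
# fractions of the whole study population.
OCULAR_HYPERTENSION = 0.021      # IOP > 21 mmHg, no glaucoma
HYPERTENSIVE_POAG = 0.014        # IOP > 21 mmHg, glaucoma
NORMAL_TENSION_GLAUCOMA = 0.006  # IOP <= 21 mmHg, glaucoma


def pretest_probabilities():
    """Return (P(glaucoma | IOP > 21), P(glaucoma | IOP <= 21))."""
    high_iop = OCULAR_HYPERTENSION + HYPERTENSIVE_POAG  # 3.5% of the population
    p_high = HYPERTENSIVE_POAG / high_iop
    p_low = NORMAL_TENSION_GLAUCOMA / (1.0 - high_iop)
    return p_high, p_low


p_high, p_low = pretest_probabilities()
print(f"IOP > 21 mmHg: {p_high:.0%}; IOP <= 21 mmHg: {p_low:.2%}")
```

The two values correspond to the 40% and 0.62% given in the text.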

This estimation can act as the “pretest probability,” from which the “posttest probability” can be calculated, knowing the performance of the diagnostic test (see below).

Test validity

Before using the result of any diagnostic test, the clinician should evaluate the validity of the test. The validity depends on a number of factors: test quality and reproducibility, presence of confounding factors, and the appropriateness of the instrument reference database to the patient (Jaeschke et al., 2001).

Examples of factors affecting test quality include false-positive responses and learning effects (a particular problem for the newly referred patient) for perimetry, and scan quality for the quantitative imaging devices.

Confounding factors include central corneal thickness for IOP measurements, cataract or retinal pathology for perimetry, and image artifacts or unusual anatomy (such as a tilted ONH) for quantitative imaging.

The subject's age, ethnic background, and other factors need to be considered, in relation to the instrument reference database, when applying the classification algorithms. Often, the selection criteria for, and composition of, reference datasets are not readily available; when this is the case, classification results should be interpreted with caution.

Diagnostic test performance

The performance of a diagnostic test criterion is most often expressed as the sensitivity (true positive rate) and specificity (true negative rate).

When comparing the performance of tests, the specificities of the test diagnostic criteria have to be matched, or fixed at a certain level, so that the test sensitivities can be compared. This is illustrated below. Typically, the less specific a criterion, the more sensitive it is. Figure 2 depicts receiver operating characteristic (ROC) curves for three hypothetical tests. The ROC curve shows how the sensitivity of a test declines as the specificity increases (plotted as 1 − specificity, or the false-positive rate). Three pairs of comparisons are illustrated for diagnostic criteria A, B, C1, and C2, for three diagnostic tests, a, b, and c. “A” has a higher sensitivity than “B,” but in this case it does not mean that “a” is a more sensitive test than “b,” because the ROC profiles are almost the same. It is simply that the criterion for test “b” has a higher specificity than that for test “a.” Now compare diagnostic criteria “A” and “C1.” They have the same sensitivity, but “a” is a better test because the specificity at “A” is higher than it is at “C1”; when the specificity of the criteria is matched (“A” and “C2”), it can be seen that the sensitivity at “C2” is much lower than at “A.”
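Comparing tests at a matched specificity can be sketched as reading each ROC curve at the same false-positive rate. The ROC points below are invented for illustration only (they loosely mimic tests “a” and “c” and are not data from the source), and the helper name `sensitivity_at` is hypothetical.

```python
def sensitivity_at(roc_points, false_positive_rate):
    """Linearly interpolate sensitivity at a given false-positive rate.

    roc_points is a list of (false_positive_rate, sensitivity) pairs sorted
    by increasing false-positive rate.
    """
    for (x0, y0), (x1, y1) in zip(roc_points, roc_points[1:]):
        if x0 <= false_positive_rate <= x1:
            if x1 == x0:
                return max(y0, y1)
            t = (false_positive_rate - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("false-positive rate lies outside the ROC curve")


# Invented ROC curves (assumption, not taken from the source).
TEST_A = [(0.0, 0.0), (0.05, 0.40), (0.10, 0.60), (0.30, 0.85), (1.0, 1.0)]
TEST_C = [(0.0, 0.0), (0.10, 0.40), (0.30, 0.60), (0.60, 0.85), (1.0, 1.0)]

# Match specificity at 90% (false-positive rate 0.10), then compare sensitivity.
matched_fpr = 0.10
print(sensitivity_at(TEST_A, matched_fpr), sensitivity_at(TEST_C, matched_fpr))
```

With these invented curves, both tests reach 60% sensitivity somewhere on their curves, but only at matched specificity does the difference in diagnostic precision become visible.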

Fig. 2. Receiver operating characteristic curves for three hypothetical tests (a, b, and c), plotted as sensitivity against 1 − specificity, both axes running from 0 to 100. Diagnostic criterion “A” for test “a” has a higher sensitivity than criterion “B” for test “b,” but lower specificity. However, tests “a” and “b” have similar diagnostic precision. Diagnostic criterion “A” for test “a” and “C1” for test “c” have similar sensitivity, but “C1” has lower specificity. When a criterion for test “c” is chosen (“C2”) to have the same specificity as “A,” the sensitivity is much lower. Test “c” has lower diagnostic precision than test “a.”

When test results are used to establish the probability that a disease is present, the “likelihood ratio” of a diagnostic criterion needs to be calculated. The positive likelihood ratio is (sensitivity)/(1 − specificity) and tells us how many times more likely a positive test result is in a patient compared with a healthy individual. For instance, the HRT MRA “outside normal limits” classification has a likelihood ratio of about 19 (Medeiros et al., 2004). This means that an “outside normal limits” classification is 19 times more likely in a glaucoma patient than in a healthy subject.
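The likelihood ratio definition translates directly into code. The sketch below uses the sensitivity and specificity pairs quoted elsewhere in this section. The negative likelihood ratio, (1 − sensitivity)/specificity, is the standard companion definition and is not spelled out in the text; the function names are illustrative.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """How many times more likely a positive result is in disease than in health."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """How many times more likely a negative result is in disease than in health."""
    return (1.0 - sensitivity) / specificity


# (sensitivity, specificity) pairs quoted in this section.
for sens, spec in [(0.80, 0.90), (0.95, 0.81), (0.80, 0.98)]:
    lr_pos = positive_likelihood_ratio(sens, spec)
    lr_neg = negative_likelihood_ratio(sens, spec)
    print(f"sens={sens:.2f} spec={spec:.2f} LR+={lr_pos:.1f} LR-={lr_neg:.2f}")
```

The three pairs give positive likelihood ratios of about 8, 5, and 40, matching the values used in the text.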

Posttest probability

The posttest probability is calculated from the pretest probability and likelihood ratio. A useful nomogram was described by Fagan (1975) (Centre for Evidence-Based Medicine, 2008), where the posttest probability can be read directly from the nomogram if the pretest probability and test likelihood ratio are known.

An example is shown for a patient with IOP >21 mmHg, undergoing imaging with the HRT, and having an MRA “outside normal limits” or “within normal limits” classification (Fig. 3). The HRT MRA “outside normal limits” classification has a likelihood ratio of about 19, and a “within normal limits” classification has a likelihood ratio of about 0.35 (Medeiros et al., 2004).
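The nomogram is a graphical form of Bayes' theorem in odds form: posttest odds equal pretest odds multiplied by the likelihood ratio. A minimal sketch of that calculation follows, using the pretest probability and likelihood ratios from the example; the function name is illustrative.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply Bayes' theorem in odds form and convert back to a probability."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


PRETEST = 0.40  # IOP > 21 mmHg, from the Egna-Neumarkt estimate in the text

outside = posttest_probability(PRETEST, 19)    # MRA "outside normal limits"
within = posttest_probability(PRETEST, 0.35)   # MRA "within normal limits"
print(f"outside normal limits: {outside:.0%}; within normal limits: {within:.0%}")
```

These reproduce the 93% and 19% posttest probabilities read from the nomogram in Fig. 3.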

Naturally, the performance of a test criterion will depend on the severity of glaucoma present — for a given criterion, a greater proportion of patients with more severe disease will be identified than those with less severe (early) disease. This is illustrated in Figure 4.

Fig. 3. A patient with IOP >21 mmHg (pretest probability for glaucoma 40%) has an HRT MRA classification of “outside normal limits,” giving a posttest probability of 93% (continuous line). A similar patient, but with an HRT classification of “within normal limits,” has a posttest probability for glaucoma of 19% (dashed line).

Thus, a test criterion may be selected to give a useful likelihood ratio for a particular stage of disease. For instance, if a single test is to be used for glaucoma screening, and the authorities will tolerate a 50% false-positive detection rate, then a diagnostic criterion has to be selected to achieve this. Given a population glaucoma prevalence of about 2.5% (Mitchell et al., 1996), the diagnostic criterion needs to have a likelihood ratio of 40 to achieve a posttest probability of 50%. There is probably no single clinical test criterion that can attain a likelihood ratio of 40 in early glaucoma, but it may be possible to reach this value in moderately advanced disease. However, such a test criterion will have a lower likelihood ratio for early disease. A likelihood ratio of 40 equates, for example, to a sensitivity of 80% and a specificity of 98%. At an earlier stage of disease, the sensitivity may be only 40%. Thus, it may be necessary to tolerate the targeting of only more advanced disease in order to avoid too many false-positive referrals.
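The screening requirement can be checked by inverting the odds calculation. The sketch below is illustrative only: the function names are assumptions, and the early-disease case simply combines the 40% sensitivity mentioned in the text with the same 98% specificity.

```python
def odds(probability):
    """Convert a probability to odds."""
    return probability / (1.0 - probability)


def required_likelihood_ratio(pretest_probability, target_posttest_probability):
    """Likelihood ratio needed to move from the pretest to the target probability."""
    return odds(target_posttest_probability) / odds(pretest_probability)


def posttest_probability(pretest_probability, likelihood_ratio):
    """Posttest probability from pretest probability and likelihood ratio."""
    posttest_odds = odds(pretest_probability) * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


PREVALENCE = 0.025  # population prevalence quoted in the text

needed = required_likelihood_ratio(PREVALENCE, 0.50)
early_lr = 0.40 / (1.0 - 0.98)  # sensitivity 40%, specificity 98%
early_posttest = posttest_probability(PREVALENCE, early_lr)
print(f"needed LR: {needed:.0f}; early-disease LR: {early_lr:.0f}; "
      f"posttest probability in early disease: {early_posttest:.0%}")
```

The exact requirement works out to a likelihood ratio of 39, which the text rounds to 40. With the sensitivity halved in early disease, the likelihood ratio halves as well, and a positive result then leaves the probability of glaucoma at roughly one in three.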

Of course, more than one diagnostic test may be used in combination, to increase the probability for glaucoma, improve the performance of screening, and/or facilitate screening for earlier stages of disease.

Combining test results

Various diagnostic tests may be combined, provided they give largely independent information (in other words, there is no tendency to provide similar results, other than in the presence of the target condition) (Halkin et al., 1998). When combining tests, the posttest probability of the first test becomes the pretest probability for the next test. This is illustrated in Figure 5, with two independent diagnostic tests for glaucoma being applied in a population with a prevalence of 2.5%. Criteria for the tests are selected to have a likelihood ratio of 8 for the first test (sensitivity 80%, specificity 90%) and 5 for the second test (sensitivity 95%, specificity 81%). After the first test, with a positive test result, the probability for glaucoma has risen from 2.5% to 17%. After the second test, with a positive result, the probability for glaucoma has risen to 51%.
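Chaining tests in this way amounts to multiplying the running odds by each likelihood ratio in turn. The following sketch assumes, as the text requires, that the tests are independent; the function name is illustrative.

```python
def combine_tests(prevalence, likelihood_ratios):
    """Return the probability of disease after each test result in sequence.

    Assumes the tests give independent information, so that the posttest
    probability of one test can serve as the pretest probability of the next.
    """
    probabilities = []
    odds = prevalence / (1.0 - prevalence)
    for likelihood_ratio in likelihood_ratios:
        odds *= likelihood_ratio
        probabilities.append(odds / (1.0 + odds))
    return probabilities


# Prevalence 2.5%; positive results on tests with likelihood ratios 8 and 5.
after_first, after_second = combine_tests(0.025, [8, 5])
print(f"after first test: {after_first:.0%}; after second test: {after_second:.0%}")
```

The two steps give 17% and 51%, as in Figure 5. If the tests were correlated, multiplying their likelihood ratios would overstate the final probability.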

Diagnostic tests

Standard automated perimetry

Visual-field testing is essential to establish the extent of vision loss in glaucoma and to monitor for progression. In cases where the diagnosis is not certain from the history and clinical examination, the field test provides data that raise or lower the probability for glaucoma. This can be done formally, using the quantitative data reported by the test (such as the mean deviation and pattern standard deviation) and knowing the performance of the test (Stroux et al., 2003). However, additional information, deriving from the distribution of abnormal test points within the visual field, further influences the probability that glaucoma is present. The glaucoma hemifield test makes a quantitative comparison of the differential light sensitivity in regions of the upper and lower hemifields. The experience of the clinician is also valuable in assessing the distribution of abnormal points: glaucomatous neuropathy is associated with characteristic patterns of visual-field loss, such as the arcuate distribution and nasal step, and artifacts, such as superior defects related to lid ptosis, also have a characteristic appearance. Thus, given current data interpretation software, the evaluation of the visual-field test result cannot be entirely automated.

Fig. 4. Likelihood ratio values (vertical axis, 0 to 5) for a visual field mean deviation criterion at various stages of glaucoma, defined by the extent of neural rim loss at the optic-nerve head (horizontal axis, disease severity, −1 to 1.5) (adapted with permission from Stroux et al., 2003).

Fig. 5. Combining diagnostic test results. The panels represent the application of two diagnostic tests, in a population with a glaucoma prevalence of 2.5%. The first test has a likelihood ratio of 8 and the second a likelihood ratio of 5. The final probability is 51%.

There is a widely held belief that SAP is not a sensitive test in early glaucoma. This stems from reports that a large proportion of retinal ganglion cells may be lost before the visual field becomes statistically abnormal (Quigley et al., 1988; Kerrigan-Baumrind et al., 2000) and that evidence of structural damage (ONH changes and RNFL loss) may be seen in some patients in the presence of a visual-field test “within normal limits” (Sommer et al., 1991a; Mohammadi et al., 2004). This gave rise to the idea of a “functional reserve” of ganglion cells. However, there is a growing body of evidence that there is no functional reserve, but a continuous structure/function relationship, so that the measured function relates directly to the number of retinal ganglion cells (Garway-Heath et al., 2000; Swanson et al., 2004). The implication is that structural and functional damage occurs at the same time, so that when a ganglion cell dies, some function is lost (Garway-Heath et al., 2002; Harwerth et al., 2004; Harwerth and Quigley, 2006). There are several factors that may disturb this one-to-one relationship, such as retinal ganglion cell dysfunction and media opacity, which may result in lower measurements of function than would be expected from the measurement of structure, and architectural changes to the ONH or RNFL structure which may not be directly related to ganglion cell loss.

The early identification of glaucoma, statistically, is limited by between-subject variability, so that 40–50% of retinal ganglion cells need to be lost before the visual function loss exceeds the 95% confidence limits for normality in the population (Harwerth et al., 2004). Similar findings are seen with between-subject variability in structural measurements, with the lower 98% confidence limit for the normal range of ONH neural rim area being about 65% of the average value, suggesting that 35% of the rim area needs to be lost before it becomes smaller than the lower end of the normal range (Garway-Heath and Hitchings, 1998a). This means that there needs to be a substantial amount of neural tissue loss before either structural or functional measurements fall below the statistically defined normal ranges. Thus, depending on the method for measurement and the individual, some eyes will have measurable damage first by structural measurements, whereas with another measurement method or in another individual, functional loss will manifest first.

There are many studies in the literature reporting structural damage evident years before visual-field loss (Sommer et al., 1991a; Mohammadi et al., 2004), but this does not mean that this is the rule. No studies have addressed the question the other way around: in other words, no study has followed a group of patients with visual-field loss and apparently normal structure to see how long it takes for the structural damage to become evident (i.e., looking for evidence of functional loss preceding structural loss). That there are such patients is evident from the many cross-sectional studies evaluating the sensitivity and specificity of imaging devices. Most studies find that, when test specificity is fixed at around 95%, the test sensitivity is around 70% (Medeiros et al., 2004). This means that around 30% of eyes with early visual-field loss have structural measurements within the normal range. Some may argue that imaging devices are not as sensitive to early structural damage as clinicians evaluating the