Study Design and Statistical Analysis a practical guide for clinicians_Katz _2006



association between having a case manager and receiving supportive services among HIV-infected persons. Advocates have used these studies as justiﬁcation for funding case management programs, pointing out that having a case manager results in patients receiving needed services. However, these studies were vulnerable to the criticism of reverse causality, speciﬁcally the possibility that receiving services led to getting a case manager (because many service organizations automatically assign case managers to patients who request services).

To resolve this issue, colleagues and I used a longitudinal probability sample of HIV-infected persons (HIV Cost and Services Utilization Study, HCSUS).152 We identified two groups: (1) subjects with unmet needs and case managers at baseline and (2) subjects with unmet needs and no case managers at baseline. We found that contact with a case manager at baseline was associated with a higher likelihood that unmet needs were fulfilled by the time of the follow-up visit. By requiring that the case manager be in place prior to the unmet need being fulfilled, we excluded the possibility that receiving services resulted in getting a case manager and thereby strengthened the argument that there was a causal relationship between having a case manager and receiving needed services.

Even with longitudinal studies, reverse causality may be operating if the disease you are studying has a subclinical form. This is why it is important to intensively screen for subclinical disease at the start of a study. For example, in Section 2.3.A I discussed the evidence supporting a relationship between participating in challenging cognitive activities and not developing dementia. But what if effect–cause is operating? Could it be that persons with undiagnosed dementia are less likely to engage in challenging cognitive activities? When such people are observed years later the dementia has progressed and the lack of engagement in challenging cognitive activities is assumed to be one of the reasons. To guard against this possibility, the investigators tested all subjects at baseline for dementia using a standardized instrument that closely correlates with the stages of Alzheimer’s disease.

9.1.G Exclude bias

Of potential threats to causality, bias can be the most difficult to assess because there are so many sources of potential bias. Remember from Section 1.1 that bias is systematic error in the design or execution of a study.153 Selection bias may occur in sampling of subjects or assignment to study groups (e.g., sicker persons being steered to a particular treatment group); bias may occur due to subjects with a disease being more likely to remember exposures (recall bias) or due to subjects answering questions the way they think the investigators want them to (i.e., social desirability bias); bias may occur due to interviewers probing more deeply with subjects they think likely to have had an exposure; observer bias occurs when the investigator draws a conclusion about a participant based on collateral information about the patient (e.g., investigator assumes that an AIDS patient is taking zidovudine because the patient has an elevated MCV level).

152Katz, M.H., Cunningham, W.E., Fleishman, J.A., et al. Effect of case management on unmet needs and utilization of medical care and medications among HIV-infected persons. Ann. Int. Med. 2001; 135: 557–65.

153For more on bias, see Szklo, M., Nieto, F.J. Epidemiology: Beyond the Basics. Gaithersburg, Maryland: Aspen Publication, pp. 125–76; Hulley, S.B., Cummings, S.R., Browner, W.S., Grady, D., Hearst, N., Newman, T.B. Designing Clinical Research (2nd edition). Philadelphia: Lippincott Williams & Wilkins, 2001, pp. 126–8.

The best way to minimize bias is through careful study design. However, even if you perform a randomized placebo-controlled trial there are still potential sources of bias (e.g., subjects submitting their pills to a private laboratory to unblind their assignment). As a researcher, all you can do is minimize the sources of bias; test the impact of bias in your study (e.g., if study dropout is high among older persons, test your results in younger persons; if the association holds, then you know it cannot be due solely to bias from dropout among older persons); and honestly report the biases of your study.

9.1.H Strengthening causal associations: putting it all together and getting it wrong!

The association between estrogen use and Alzheimer’s disease provides a perfect example of how to strengthen causal associations and get it wrong!

Five observational studies showed that estrogen use was associated with decreased development of Alzheimer’s disease (prior research).154 Estrogen is known to have positive effects on the brain including reducing beta-amyloid accumulation, enhancing neurotransmitter release and action, and protecting against oxidative damage (biologic plausibility).155 The prospective longitudinal study performed by Tang and colleagues carefully evaluated subjects on enrollment to exclude incipient Alzheimer’s disease (exclude reverse causality). All five of the studies used multivariable analysis to control for possible confounders such as age, education, ethnicity, age at menarche, age at menopause, and apolipoprotein E genome (exclude confounding). To test for bias due to excluding women with Parkinson’s disease or stroke, Tang and colleagues compared hormone use among excluded women to that of women included in the study and found no differences (exclude bias). The protective effect was strong (OR 0.33) in the study by Baldereschi and colleagues (strength of effect). Three studies (Tang and colleagues, Paganini-Hill and Henderson, and Zandi and colleagues) found an association between longer duration of estrogen use and decreased incidence of Alzheimer’s disease (dose–response relationship).

154Tang, M.-X., Jacobs, D., Stern, Y., et al. Effect of oestrogen during menopause on risk and age at onset of Alzheimer’s disease. Lancet 1996; 348: 429–32; Baldereschi, M., De Carlo, A., Lepore, V., et al. Estrogen-replacement therapy and Alzheimer’s disease in the Italian longitudinal study on aging. Neurology 1998; 50: 996–1002; Zandi, P.P., Carlson, M.C., Plassman, B.L., et al. Hormone replacement therapy and incidence of Alzheimer disease in older women. J. Am. Med. Assoc. 2002; 288: 2123–9; Paganini-Hill, A., Henderson, V.W. Estrogen deficiency and risk of Alzheimer’s disease in women. Am. J. Epidemiol. 1994; 140: 256–61; Kawas, C., Resnick, S., Morrison, A., et al. A prospective study of estrogen replacement therapy and the risk of developing Alzheimer’s disease: The Baltimore Longitudinal Study of Aging. Neurology 1997; 48: 1517–21.

155Yaffe, K. Hormone therapy and the Brain: Déjà vu all over again? J. Am. Med. Assoc. 2003; 289: 2717–18.

However, when a randomized clinical trial was completed, it showed that estrogen plus progestin therapy actually increased the risk of dementia.156 How could the observational studies have been so wrong? The reason for the discrepancy between the observational data and the randomized controlled trial is unknown. The most likely explanation is confounding due to an unmeasured factor such as healthful life-style behavior.

9.2 Can the results be statistically signiﬁcant and clinically unimportant?

Absolutely! The reason is that statistical significance is heavily affected by sample size. If you have any doubt, remember the coin toss example (Section 1.1). Having 60% of the tosses land on heads is sufficient evidence to conclude the coin is unequally weighted if you have 100 tosses but not if you only have 10 tosses.

Why is sample size such an important determinant of statistical signiﬁcance? The reason is that you are more likely to correctly characterize a population if you assess a large number of its members than if you assess a small number of members.
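The coin toss comparison can be checked numerically. The sketch below is only an illustration and assumes a one-sided exact binomial test against a fair coin; Section 1.1 may use a different test, and a two-sided test would roughly double these values. The function name `upper_tail_p` is made up for this example.

```python
from math import comb

def upper_tail_p(heads: int, tosses: int, p_fair: float = 0.5) -> float:
    """Exact one-sided P(X >= heads) for X ~ Binomial(tosses, p_fair)."""
    return sum(
        comb(tosses, k) * p_fair**k * (1 - p_fair) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )

# Hold the observed proportion at 60% heads and vary only the sample size.
for tosses in (10, 100, 1000):
    heads = tosses * 6 // 10
    print(f"{heads}/{tosses} heads: one-sided P = {upper_tail_p(heads, tosses):.3g}")
```

The observed proportion never changes; only the number of tosses does, and the P-value shrinks as the number of tosses grows.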

However, correctly characterizing a population does not mean that the results are important. For example, Flum and colleagues examined the records of 1,570,361 Medicare patients who underwent cholecystectomy during a 7-year period.157 The investigators compared those patients who underwent an intraoperative cholangiography (IOC) to those who did not. (Performance of IOC is thought to increase the risk of common bile duct injury.) There were many statistically signiﬁcant differences between patients who underwent IOC and those who did not (Table 9.3).

In fact, of the 12 comparisons shown in Table 9.3, nine are statistically significant at the P < 0.001 level and two are statistically significant at the P < 0.05 level. But are these differences important? No, most seem trivial. For example, 96.8% of patients who underwent IOC had a male surgeon versus 96.7% of patients who did not have an IOC. Although the difference is a trivial 0.1%, the difference is statistically significant at the P < 0.001 level. What is driving the statistical significance is the large sample size. Almost any difference no matter how trivial will be statistically significant if you have 1.5 million subjects!

Table 9.3. Characteristics of patients with and without intraoperative cholangiography (IOC)

| Variables | With IOC (N = 613,706) | Without IOC (N = 956,655) | P-value |
|---|---|---|---|
| Patient-level variables | | | |
| Age, mean (SD), (years) | 71.7 (10.3) | 71.2 (10.7) | <0.001 |
| Sex, (% of female) | 62.6 | 63.2 | <0.001 |
| Race, (% of white/non-Hispanic) | 88.9 | 88.8 | <0.05 |
| Complex biliary tract disease, (%) | 10.9 | 11.0 | <0.05 |
| Comorbidity index, mean (SD) | 0.04 (0.22) | 0.08 (0.24) | <0.001 |
| Surgeon-level variables | | | |
| Age, mean (SD), (years) | 48.1 (9.3) | 48.6 (9.6) | <0.001 |
| Sex, (% of male) | 96.8 | 96.7 | <0.001 |
| Percent performed in the surgeon’s first 20 cholecystectomies | 24.6 | 25.0 | <0.001 |
| Case order, mean # (SD) | 70.5 (61.3) | 66.6 (57.7) | <0.001 |
| General surgeon/surgical specialist | 95.6 | 95.6 | 1.0 |
| Surgeon board certified, (%) | 82.6 | 79.6 | <0.001 |
| Years since surgeon graduated from medical school, mean (SD), (years) | 21.8 (9.6) | 22.3 (9.6) | <0.001 |

Data from Flum, D.R., et al. Intraoperative cholangiography and risk of common bile duct injury during cholecystectomy. J. Am. Med. Assoc. 2003; 289: 1639–44.

156Shumaker, S.A., Legault, C., Rapp, S.R., et al. Estrogen plus progestin and the incidence of dementia and mild cognitive impairment in postmenopausal women. J. Am. Med. Assoc. 2003; 289: 2651–62.

157Flum, D.R., Dellinger, E.P., Cheadle, A., Chan, L., Koepsell, T. Intraoperative cholangiography and risk of common bile duct injury during cholecystectomy. J. Am. Med. Assoc. 2003; 289: 1639–44.
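The male-surgeon comparison from Table 9.3 can be reproduced approximately. The sketch below assumes a pooled two-proportion z-test, which may not be the test Flum and colleagues used, and it works from the rounded percentages in the table. The "small study" is hypothetical: the same percentages with groups 1000 times smaller.

```python
from math import erfc, sqrt

def two_proportion_p(p1: float, n1: int, p2: float, n2: int) -> float:
    """Two-sided P-value from a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Percent of patients with a male surgeon, with and without IOC (Table 9.3).
p_full = two_proportion_p(0.968, 613_706, 0.967, 956_655)

# Hypothetical study with the same percentages but 1000 times fewer patients.
p_small = two_proportion_p(0.968, 614, 0.967, 957)

print(f"1.5 million patients: P = {p_full:.4f}")
print(f"about 1,500 patients: P = {p_small:.2f}")
```

The 0.1 percentage point difference is identical in both calls; only the sample size moves the result across the significance threshold.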

Besides large sample sizes, very sensitive measures can lead to statistically significant, but clinically unimportant results. For example, a study of Alzheimer’s disease found that patients given the medicine tacrine had statistically significant improvements on a scale very sensitive to cognitive changes (the cognitive scale of the Alzheimer’s Disease Assessment) compared to patients who were given placebo. However, tacrine was not associated with improvements using more global measures of function such as the Mini-Mental State Examination.158 Due to its very limited benefit, tacrine is not widely prescribed for patients with Alzheimer’s disease.

158Qizilbash, N., Birks, J., Lopez Arrieta, J., Lewington S., Szeto, S. Tacrine for Alzheimer’s disease (Cochrane Review). In: The Cochrane Library (Issue 3). 2003, Oxford: Update Software.

Tip

Make sure your effect size is clinically important before undertaking your study.

The best way to avoid a situation of having a statistically signiﬁcant, but clinically unimportant result is to set an effect size a priori that is clinically important. Although this sounds obvious, much more attention is paid in both study design and study interpretation to the issue of statistical signiﬁcance than to clinical signiﬁcance.159

9.3 Can the results be statistically insigniﬁcant and clinically important?

Tip

When clinically important differences do not reach statistical significance, report the finding, but indicate that the difference did not reach statistical significance.

Also: absolutely! There is nothing sacred about the conventionally used P-value of 0.05. There is no reason to be dramatically more confident of a result that is significant at a P-value of 0.05 than a P-value of 0.06.

One way to avoid judging results based on a single threshold is to focus on the confidence intervals rather than the significance levels. The confidence intervals give you a sense of the range of results compatible with your data (Section 4.3). However, some people make the same mistake with confidence intervals as with P-values. That is, they dismiss any effect where the 95% CI does not exclude 1.0.

On the other hand, there does need to be some widely accepted threshold for deciding when chance is an unlikely explanation for a result. Otherwise, investigators would be tempted to move that threshold around, after the fact, to call their results statistically signiﬁcant.

When you have a clinically important difference that does not reach statistical significance but is close to the conventional cut-off (e.g., P = 0.07 or the 95% CI includes one but excludes 0.98), report the finding, but indicate to the reader that it did not reach statistical significance.

For example, Kadish and colleagues tested the ability of an implantable cardioverter-deﬁbrillator (ICD) to prevent deaths among patients with severe heart disease.160 They randomized 458 patients with non-ischemic dilated cardiomyopathy, left ventricular dysfunction, and evidence of arrhythmias to receive standard medical therapy alone versus standard medical therapy plus a single-chamber ICD. Using proportional hazards regression, they found that the ICD group was less likely to die (relative hazard 0.65). However, the 95% CI included 1 (0.40–1.06) and the P-value was 0.08.
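The reported confidence interval and P-value tell the same story, which can be seen by recovering an approximate P-value from the interval. The sketch below assumes the 95% CI is symmetric on the log scale (a Wald-type interval); the published P-value may come from a different test, so agreement is only approximate. The function name `p_from_ratio_ci` is made up for this example.

```python
from math import erfc, log, sqrt

def p_from_ratio_ci(estimate: float, lower: float, upper: float) -> float:
    """Approximate two-sided P-value for a ratio measure from its 95% CI.

    Assumes the interval is symmetric around the estimate on the log scale.
    """
    se = (log(upper) - log(lower)) / (2 * 1.96)  # standard error of log(ratio)
    z = log(estimate) / se
    return erfc(abs(z) / sqrt(2))

# Relative hazard of death in the ICD group: 0.65 (95% CI 0.40-1.06).
p = p_from_ratio_ci(0.65, 0.40, 1.06)
print(f"approximate P = {p:.2f}")
```

An interval whose upper bound sits just above 1 corresponds to a P-value just above 0.05, so dismissing the result on either criterion is the same single-threshold judgment.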

Does this mean that ICDs do not save lives? No. What it does mean is that the study was underpowered for this outcome. When the investigators calculated their sample size they assumed that more than 50% of the deaths in the standard-therapy group would occur due to an arrhythmia. However, in the study, only a third of the deaths in the standard-therapy group were due to an arrhythmia. When the investigators used a more specific marker (Section 7.12) of the efficacy of ICD (sudden death due to an arrhythmia) they found a statistically significant decrease in deaths due to arrhythmias among the ICD recipients (relative hazard 0.20; 95% CI 0.06–0.71; P = 0.006).

159Man-Son-Hing, M., Laupacis, A., O’Rourke, K., et al. Determination of the clinical importance of study results. J. Gen. Int. Med. 2002; 17: 469–76.

160Kadish, A., Dyer, A., Daubert, J.P., et al. Prophylactic defibrillator implantation in patients with non-ischemic dilated cardiomyopathy. New Engl. J. Med. 2004; 350: 2151–8.

On the other hand, some investigators mistakenly assert that their nonsigniﬁcant ﬁndings should be accepted as truth because if the sample size had been bigger, the P-value would have been statistically signiﬁcant and the conﬁdence intervals would have excluded 1.0. Although it is true that for a given effect size, a larger sample size will result in a smaller P-value (tossed coin example, Section 1.1) and narrow the conﬁdence intervals, statistical signiﬁcance testing takes into account the degree of uncertainty in the effect size at a given sample size. A larger sample size will result in less uncertainty but may also result in a different point estimate.

Special topics

10.1 What is the difference between the relative risk and the absolute risk?

Absolute risk is more helpful in clinical situations than relative risk.

Relative risks (risk ratios and rate ratios (RR)) identify the risk factors for particular outcomes. However, they cannot tell you how likely an outcome is to occur, only how much more likely the outcome is to occur in one group than the other. Therefore, knowing the relative risk is not very helpful in clinical situations. In contrast, an absolute risk tells you how likely an outcome is to occur.

The difference between the relative risk and absolute risk is particularly great with rare diseases because a person at high relative risk of developing a disease (compared to an unexposed person) may still be very unlikely to develop that disease. For example, the risk of developing esophageal cancer is 40–125 times higher among persons with Barrett esophagus. For persons newly diagnosed with Barrett esophagus this must sound like a certainty that they will develop cancer. In fact, the absolute risk of developing cancer if you have Barrett esophagus has been estimated at 0.5% per year (one in two hundred).161 Despite the high relative risk, the absolute risk is low because esophageal cancer is a rare disease.
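A few lines of arithmetic show how a very large relative risk and a small absolute risk can coexist. The sketch below is illustrative only: it assumes the quoted relative risks and the 0.5% yearly absolute risk refer to the same population and time scale, which the source does not state, and `implied_baseline` is a made-up helper name.

```python
def implied_baseline(absolute_risk_exposed: float, relative_risk: float) -> float:
    """Risk in unexposed persons implied by RR = risk_exposed / risk_unexposed."""
    return absolute_risk_exposed / relative_risk

absolute_risk_exposed = 0.005  # 0.5% per year with Barrett esophagus

for relative_risk in (40, 125):  # ends of the quoted range
    baseline = implied_baseline(absolute_risk_exposed, relative_risk)
    print(
        f"RR {relative_risk}: unexposed {baseline:.4%} per year, "
        f"exposed {absolute_risk_exposed:.1%} per year"
    )
```

Multiplying a tiny baseline risk by 40 or even 125 still leaves a risk of about one in two hundred per year.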

10.2 What other effect measures are available in addition to relative risk and absolute risk?

In addition to relative risk and absolute risk, several related effect measures are available. Each one characterizes the association between a risk factor and an outcome differently. The different measures, along with their meaning, and their uses, are shown in Table 10.1.

161Shaheen, N., Ransohoff, D.F. Gastroesophageal reﬂux, Barrett esophagus, and esophageal cancer. J. Am. Med. Assoc. 2002; 287: 1972–81.

Table 10.1. Comparison of different measures of effect

| Effect measure | Meaning | Use |
|---|---|---|
| Absolute risk difference (attributable risk) | Incidence of disease that can be attributed to a particular exposure | Understand differences in risk due to differences in exposures |
| Attributable fraction | Proportion of disease due to a particular exposure | Understand importance of a particular factor on disease occurrence |
| Population attributable fraction | Incidence of disease due to a particular exposure in a community | Helpful in targeting public health interventions |
| Number needed to treat | Number of persons needed to be treated to prevent one outcome | Helpful in deciding whether it is worth adopting a clinical intervention |

10.2.A Absolute risk difference
The absolute risk difference is the difference in the incidence between two groups:

$$\text{absolute risk difference} = \text{incidence among exposed} - \text{incidence among unexposed}$$

Definition

Attributable risk tells you how much of the incidence of a disease can be attributed to a particular exposure.

Assuming that there is a causal relationship between the exposure and the outcome, the absolute risk difference tells you how much of the incidence of the disease is due to (can be attributed to) the exposure. For this reason it is also referred to as the attributable risk or the attributable risk in exposed persons.

In Section 5.9.A I reviewed a study comparing the risk of community-acquired pneumonia among patients exposed to acid suppressing drugs compared to persons not exposed. The investigators found that the incidence of pneumonia in patients exposed to acid suppressing drugs was 2.45 per 100 person years (185/7562 × 100) and the incidence of pneumonia in unexposed patients was 0.55 per 100 person years (5366/970,331 × 100). Therefore, the attributable risk (attributable to acid suppression medication) is 1.9 cases (2.45 − 0.55) per 100 person years.
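The arithmetic for the pneumonia example can be retraced from the case counts and person years quoted above. This is a minimal sketch; the helper name `rate_per_100_person_years` is made up for illustration.

```python
def rate_per_100_person_years(cases: int, person_years: float) -> float:
    """Incidence rate expressed per 100 person years of follow-up."""
    return cases / person_years * 100

# Counts quoted in the text for the acid suppressing drug study.
exposed = rate_per_100_person_years(185, 7562)
unexposed = rate_per_100_person_years(5366, 970_331)

# Absolute risk difference (attributable risk) per 100 person years.
absolute_risk_difference = exposed - unexposed

print(f"exposed:    {exposed:.2f} per 100 person years")
print(f"unexposed:  {unexposed:.2f} per 100 person years")
print(f"difference: {absolute_risk_difference:.1f} per 100 person years")
```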

10.2.B Attributable fraction (attributable risk percentage)

The attributable fraction (also known as the attributable risk percentage) tells us the proportion of a disease that is due to a particular exposure, assuming that the exposure causes the disease.162 It is calculated as:

$$\text{attributable fraction} = \frac{\text{incidence among exposed} - \text{incidence among unexposed}}{\text{incidence among exposed}}$$

Incidence in the formula can be incidence rate or incidence proportion. Continuing with the example of acid suppressing drugs and pneumonia, the attributable fraction would be:

$$\frac{2.45 - 0.55}{2.45} = 0.78$$

In other words, 78% of the pneumonias that developed among the patients in the study can be attributed to acid suppressing drugs. This may seem very high to you because you are thinking that the attributable fractions for all the causes of pneumonia should add up to 100%. This is incorrect. The attributable fractions for different causes can sum to more than 100% because multiple causes can interact and result in disease (e.g., acid suppressing drugs in the setting of exposure to pneumococcus can cause pneumonia).163

This attributable fraction can also be stated in terms of RR, specifically:

$$\text{attributable fraction} = \frac{RR - 1.0}{RR}$$

To prove that the two ways of stating the attributable fraction are equivalent, calculate the attributable fraction in terms of the RR. In Section 5.9.A we had calculated that the RR associated with exposure to acid suppressing drugs was 4.5. Therefore, the unadjusted attributable fraction would be:

$$\frac{4.5 - 1.0}{4.5} = 0.78$$

One advantage to the formula calculating the attributable fraction from the risk ratio is that it can be generalized so that you can approximate the attributable fraction from the odds ratio when the odds ratio can be considered an approximation of the risk ratio (Section 5.2):

$$\text{attributable fraction}^{*} = \frac{OR - 1.0}{OR}$$

*Assuming outcome is uncommon (< 10–15%)

162Some authors define the attributable risk in the way I have defined the attributable fraction. It is best not to get distracted by the confusing nomenclature, and instead focus on the meaning of the comparison you are making.

163In fact, the sum of the attributable fractions has no upper bound. For more on this somewhat counter-intuitive idea see Rothman, K.J., Greenland, S. Modern Epidemiology (2nd edition). Philadelphia: Lippincott, Williams & Wilkins, 1998, pp. 12–14.

This is very useful when you have performed logistic regression and have an odds ratio rather than a relative risk for a given exposure.
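The equivalence of the incidence form and the ratio form of the attributable fraction can be checked with the rounded incidences from the pneumonia example (2.45 and 0.55 per 100 person years). This is a sketch for illustration; the function names are made up, and the same ratio function applies to an odds ratio only under the uncommon-outcome assumption stated above.

```python
from math import isclose

def af_from_incidence(exposed: float, unexposed: float) -> float:
    """Attributable fraction from the incidences in exposed and unexposed."""
    return (exposed - unexposed) / exposed

def af_from_ratio(ratio: float) -> float:
    """Attributable fraction from a risk ratio (or an approximating odds ratio)."""
    return (ratio - 1.0) / ratio

exposed, unexposed = 2.45, 0.55       # per 100 person years, as rounded in the text
risk_ratio = exposed / unexposed      # about 4.45, reported as 4.5

print(f"from incidences: {af_from_incidence(exposed, unexposed):.2f}")
print(f"from exact RR:   {af_from_ratio(risk_ratio):.2f}")
print(f"from RR of 4.5:  {af_from_ratio(4.5):.2f}")
print("forms agree:", isclose(af_from_incidence(exposed, unexposed), af_from_ratio(risk_ratio)))
```

Dividing the numerator and denominator of the incidence form by the incidence among the unexposed gives the ratio form, which is why the two results match.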

10.2.C Population attributable fraction

Population attributable fraction tells us the proportion of a disease that is due to a particular exposure in a population, assuming that the exposure causes the disease. This metric incorporates the prevalence of the risk factor such that interventions that decrease common risk factors reduce disease more than interventions that eliminate uncommon risk factors. Stated in a different way: if you had two interventions that halved the incidence of a particular disease, the intervention that decreased the more common risk factor would have a more powerful effect in the community than the intervention that eliminated the less common risk factor. The formula for population attributable fraction164 is:

$$\text{population attributable fraction} = \frac{\text{incidence in population} - \text{incidence in unexposed}}{\text{incidence in population}}$$

As with attributable fraction, incidence can be based on incidence rates or incidence proportions. The above formula can be rewritten mathematically165 to more easily see the impact of the prevalence of the risk factor on the population attributable fraction:

$$\text{population attributable fraction} = \frac{(\text{prevalence of risk factor in the population}) \times (RR - 1)}{(\text{prevalence of risk factor in the population}) \times (RR - 1) + 1}$$
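The effect of prevalence on the population attributable fraction can be seen with a small calculation. The numbers below are hypothetical and chosen only for illustration (a risk ratio of 2, an unexposed incidence of 1 per 100, and prevalences of 10% and 50%); they are not taken from any study cited here, and the function names are made up.

```python
from math import isclose

def paf_from_incidence(incidence_population: float, incidence_unexposed: float) -> float:
    """Population attributable fraction from incidences."""
    return (incidence_population - incidence_unexposed) / incidence_population

def paf_from_prevalence(prevalence: float, rr: float) -> float:
    """Population attributable fraction from risk factor prevalence and RR."""
    excess = prevalence * (rr - 1)
    return excess / (excess + 1)

rr = 2.0                   # hypothetical risk ratio
incidence_unexposed = 1.0  # hypothetical, per 100 persons

for prevalence in (0.10, 0.50):  # hypothetical prevalences of the risk factor
    # Population incidence is a prevalence-weighted mix of the two groups.
    incidence_population = (
        (1 - prevalence) * incidence_unexposed + prevalence * rr * incidence_unexposed
    )
    a = paf_from_incidence(incidence_population, incidence_unexposed)
    b = paf_from_prevalence(prevalence, rr)
    print(f"prevalence {prevalence:.0%}: PAF = {b:.3f} (forms agree: {isclose(a, b)})")
```

With the same risk ratio, the more common risk factor accounts for a larger share of disease in the population.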

The differences between risk ratios, attributable fraction, and population attributable fraction are illustrated by a population-based study of risk factors for uncontrolled hypertension (Table 10.2).166 You can see that based on the relative risks, having no medical care is a stronger predictor of uncontrolled hypertension than being male. However, because only 10% of the sample had

164For more on attributable risk and population attributable risk see Kelsey, J.L., Whittemore, A.S., Evans, A.S., Douglas Thompson, W. Methods in Observational Epidemiology (2nd edition). Oxford: Oxford University Press, 1996, pp. 37–40.

165To see how: Szklo, M., Nieto, F.J. Epidemiology: Beyond the Basics. Gaithersburg, Maryland: Aspen Publication, pp. 101–5.

166Hyman, D.J., Pavlik, V.N. Characteristics of patients with uncontrolled hypertension in the United States. New Engl. J. Med. 2001; 345: 479–86.
