
Using and Understanding Medical Statistics, Matthews and Farewell, 2007


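[Figure 15.3: effect graph. Vertical axis: mean INR value (scale 1 to 5). Horizontal axis: Machine (1, 2, 3). Estimated means are marked ‘X’ and joined by separate lines for Reagent 1 and Reagent 2, shown for patient 37 (upper pair of lines) and patient 27 (lower pair of lines).]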

Fig. 15.3. Predicted mean INR values for patients 27 and 37, based on the ANOVA summarized in table 15.9.

sion of the Machine × Reagent two-factor interaction relaxes those mathematical restrictions, allowing the six estimated means identified with the six machine/reagent combinations to conform to the dictates of the study data.

Figure 15.3 displays estimated mean INR values obtained from the fitted regression model for each machine and reagent combination. The two sets of values plotted on the graph correspond to those for patients 27 and 37, whose predicted mean INR values represented the extremes occurring in the study. The solid and dashed lines have been added only to enhance visual appreciation; obviously, values for Machine between the labels 1, 2 and 3 have absolutely no sensible meaning. From the estimated means denoted by the character ‘X’ on the plot, which is usually called an effect graph, it should be amply evident to readers that the patient-to-patient variability is the single largest source of systematic variation amongst the INR values observed in the study.

15 Analysis of Variance


Table 15.9. A revised version of table 15.8 that includes the significant Machine × Reagent two-factor interaction

Term                     SS       DF     MS       F        Significance level
Between Patients         16.230    58    0.280
Within Patients
  Machine                 1.581     2    0.790    239.6    <0.001
  Reagent                 2.598     1    2.598    787.5    <0.001
  Machine × Reagent       0.087     2    0.044     14.3    <0.001
  Residual                0.877   290    0.003
Total                    21.373

Nonetheless, by designing their study carefully, and by involving a sufficiently large number of patients, the investigators have been able to uncover certain aspects of the complex process underlying INR measurement that are small, numerically, but important, scientifically.
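As a quick arithmetic check on table 15.9, the following minimal sketch recomputes the total sum of squares, the mean squares (SS divided by DF) and each term's share of the total. The dictionary and variable names are illustrative only; the numbers are those printed in the table.

```python
# Sums of squares (SS) and degrees of freedom (DF) copied from table 15.9.
anova_terms = {
    "Between Patients": (16.230, 58),
    "Machine": (1.581, 2),
    "Reagent": (2.598, 1),
    "Machine x Reagent": (0.087, 2),
    "Residual": (0.877, 290),
}

# The individual sums of squares should add up to the Total row.
total_ss = sum(ss for ss, _ in anova_terms.values())

# Mean square = SS / DF for each term.
mean_squares = {term: ss / df for term, (ss, df) in anova_terms.items()}

# Fraction of the total variation attributed to each term.
ss_share = {term: ss / total_ss for term, (ss, _) in anova_terms.items()}

for term, (ss, df) in anova_terms.items():
    print(f"{term:18s} SS={ss:7.3f} DF={df:3d} "
          f"MS={mean_squares[term]:.3f} share={ss_share[term]:.1%}")
print(f"{'Total':18s} SS={total_ss:7.3f}")
```

The Between Patients term accounts for roughly three quarters of the total sum of squares, which is the numerical counterpart of the patient-to-patient variability visible in figure 15.3.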

Look carefully at the estimated mean INR values for each of the patients shown in figure 15.3. Regardless of which patient we consider, if the explanatory variables in the fitted regression model had involved only the main effect for Reagent rather than the main effect and the two-factor interaction of which Reagent was a part, i.e., X1, X4 and X5, the two lines displayed on the effect graph would have been separated by the same distance, whether the machine was labelled 1, 2 or 3. But the two lines for patient 37 clearly aren’t parallel. Neither is the pair of lines displayed for patient 27, since the separation between such lines is identical regardless of the patient considered. The change in mean INR that results when reagent 2 is used rather than reagent 1, i.e., the vertical distance between the solid and dashed lines associated with a particular patient, is evidently greater when INR values are measured on machine 3 than when either machine 1 or machine 2 is used. The same pattern is also evident in figure 15.1, where we can see that the amount by which mean INR changes when reagent 2 is used rather than reagent 1 also depends on which machine – 1, 2 or 3 – is used to measure INR. The effect of the change in reagent appears to be least when machine 1 is in use, and greatest on machine 3. This is a consequence of the significant two-factor interaction, and represents a physical or graphical interpretation of what this interaction means.
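The contrast between parallel and non-parallel lines can be sketched numerically. The coefficients below are hypothetical values chosen only for illustration; they are not the fitted estimates from the INR study, and the function names are likewise assumptions.

```python
# Hypothetical coefficients for one patient (NOT the fitted INR estimates).
baseline = 2.0                                   # machine 1, reagent 1
b_reagent = 0.15                                 # main effect of reagent 2
b_machine = {1: 0.0, 2: 0.10, 3: 0.20}           # main effects of machine
b_interaction = {1: 0.0, 2: 0.02, 3: 0.06}       # machine x reagent terms


def predicted_mean(machine, reagent, with_interaction=True):
    """Predicted mean for one machine/reagent combination."""
    mean = baseline + b_machine[machine]
    if reagent == 2:
        mean += b_reagent
        if with_interaction:
            mean += b_interaction[machine]
    return mean


def reagent_gap(machine, with_interaction=True):
    """Vertical distance between the reagent 2 and reagent 1 lines."""
    return (predicted_mean(machine, 2, with_interaction)
            - predicted_mean(machine, 1, with_interaction))


for machine in (1, 2, 3):
    print(machine,
          round(reagent_gap(machine, with_interaction=False), 3),
          round(reagent_gap(machine, with_interaction=True), 3))
```

With main effects only, the gap is the same on every machine, so the two lines are parallel. Once the interaction terms are included, the gap is allowed to differ from machine to machine.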

Even though ANOVA can be regarded as a specialized case of multiple linear regression, it is unlikely that one would choose to summarize the study results by providing a table of estimated regression coefficients and corresponding estimated standard errors. Instead, an ANOVA summary like table 15.9, and visual displays of the estimated effects, like figures 15.1 or 15.3, are usually presented. Although the final conclusions from this study are perhaps unsettling, scientifically, because of the subtle measurement effects that the data reveal, they highlight the fact that careful study design, combined with equally careful analysis of the resulting data, can provide important insights into complex processes.

Of course, a single example like the INR study that we have described hardly provides sufficient scope to address all the possible uses for analysis of variance methods. Nevertheless, we hope that this longer introduction has enabled readers to develop an appreciation of ANOVA that can serve them well in future encounters with this widely-used, well-developed set of statistical tools.


16 Data Analysis

16.1. Introduction

In the preceding chapters, we have discussed a number of statistical techniques which are used in the analysis of medical data. It has generally been assumed that a well-defined set of data is available, to which a specific procedure is to be applied. In this chapter, we adopt a broader perspective in order to address some general aspects of data analysis.

There is a necessary formalism to most statistical calculations which is often not consistent with their application. While the formal properties of statistical tests do indicate their general characteristics, their specific application to a particular problem can require adaptation and compromise. Data analysis, perhaps, is as much an art as it is a science.

Experience is the only good introduction to data analysis. Our aim, in this chapter, is to highlight a few principles with which the reader should be familiar. These should promote a more informed reading of the medical literature, and lead to a deeper understanding of the potential role of statistics in personal research activity. Where possible, we will use examples for illustration although, since they are chosen for this purpose, these examples may be simpler than many genuine research problems. Also, any analysis which we present should not be considered definitive, since alternative approaches may very well be possible.

16.2. Quality Data

‘Garbage in, garbage out’ is an apt description of the application of statistics to poor data. Thus, although it may be obvious, it is worth stressing the importance of high-quality data.

If information is collected on a number of individuals, then it is critical that the nature of the information be identical for all individuals. Any classification of patients must follow well-defined rules which are uniformly applied. For example, if a number of pathologists are classifying tumors, there should be a mechanism to check that the classification is consistent from one pathologist to another. This might involve a re-review of all slides by a single pathologist, or selected cases could be used as consistency checks. Formal statistical methods to examine data collected for the evaluation of such consistency are discussed in chapter 23. Here, however, we simply emphasize that identifying effective methods for primary data collection should be an important objective.

Of course, it is possible that data may be missing for some individuals. Provided a consistent effort has been applied to collect data, allowance for the missing data can frequently be made in a statistical analysis. Even then, however, if there are observable differences in a response variable between individuals with and individuals without particular information, any conclusions based only on those individuals with available data may be suspect. Considerable efforts have been directed towards developing methods to deal appropriately with missing data, but describing such methods is beyond the scope of this book. These techniques typically depend on assumptions that often cannot be verified. Therefore, the most effective way of dealing with missing data is to devote considerable effort to ensure that the amount of missing data is minimized.
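One simple check suggested by this paragraph is to compare the response variable between individuals with and without a particular item of information. The sketch below uses hypothetical values and a hypothetical function name; it only illustrates the comparison and is not a method for handling missing data.

```python
from statistics import mean

# Hypothetical (response, covariate) pairs; None marks a missing covariate.
patients = [(4.1, 30), (3.8, 27), (5.9, None), (4.0, 35), (6.3, None), (4.4, 29)]


def response_by_missingness(rows):
    """Mean response for rows with an observed covariate and for rows without."""
    observed = [y for y, x in rows if x is not None]
    missing = [y for y, x in rows if x is None]
    return mean(observed), mean(missing)


mean_observed, mean_missing = response_by_missingness(patients)
print(f"covariate observed: {mean_observed:.2f}, covariate missing: {mean_missing:.2f}")
```

A marked difference between the two means, as in these made-up numbers, would be a warning that conclusions based only on individuals with complete data may be suspect.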

Two major types of data collection can be identified; we shall call the two approaches retrospective and prospective. Retrospective data collection refers to data that were recorded at some previous time and subsequently are selected to be used for research purposes. The quality of retrospective data is often beyond the control of the investigator. Good detective work may provide the best information available, but what is available may vary widely from individual to individual. For retrospective data, classification frequently must be based on the greatest amount of information which is available on all patients. For example, in the 1970s it was shown that prior blood transfusions were associated with a poorer prognosis for aplastic anemia patients undergoing bone marrow transplantation. Patient records from the time prior to their arrival at the transplant center contained varying details on transfusion histories. As a result, early studies were necessarily limited to a simple binary classification indicating whether or not any blood transfusions had been used, even though, for some patients, the number of units of blood transfused could be identified.

Prospective data collection generally occurs in a well-designed study. In such a situation, specified information is identified to be of interest, and this information is collected as it becomes available. The problem which arises in this type of study is one of ensuring that the information is, in fact, recorded at all, and is recorded accurately. In large collaborative studies this is a major concern, and can require considerable staff over and above the necessary medical care personnel.

Table 16.1. Some information collected by questionnaires from 180 pregnant women

 1   Patient number
 2   Back pain severity
       (0) ‘nil’
       (1) ‘nothing worth troubling about’
       (2) ‘troublesome, but not severe’
       (3) ‘severe’
 3   Age of patient (years)
 5   Height of patient (meters)
 6   Weight of patient at start of pregnancy (kg)
 7   Weight of patient at end of pregnancy (kg)
 8   Weight of baby (kg)
 9   Number of children by previous pregnancies
10   Does the patient have a history of backache with previous pregnancy?
       (1) ‘not applicable’
       (2) ‘no’
       (3) ‘yes, mild’
       (4) ‘yes, severe’
13   Does walking aggravate back pain? (no/yes)

Some additional aspects of data collection will be mentioned in chapter 18, which discusses the design of medical studies. In the rest of this chapter, only analyses of available data will be considered.

16.3. Initial or Exploratory Analysis

Before any formal statistical analysis can begin, the nature of the available data and the questions of interest need to be considered. In designed studies with careful data collection, this phase of analysis is simplified. It is always wise, however, to confirm that data are what they should be, especially if subsequent analyses involve computer manipulation of the data.

Table 16.1 presents a subset of some information obtained by questionnaires from 180 pregnant women [32]. The data were collected to study back pain in pregnancy and, more particularly, to relate the severity of back pain to other items of information. The results of the questionnaires were kindly made available by Dr. Mantle to a workshop on data analysis sponsored by the Royal Statistical Society.

[Figure 16.1: scatterplot. Vertical axis: weight gain (kg), 0 to 40. Horizontal axis: initial weight (kg), 30 to 100.]

Fig. 16.1. A scatterplot of weight gain during pregnancy versus weight at the start of pregnancy for 180 women.

In this example, as in many medical studies, there is a clearly defined response variable which is to be related to other explanatory variables. If the response variable is not obvious, then it is important to consider whether such a distinction among the variables can be made, because it does influence the focus of the analysis.

Table 16.2. The responses to question 10 cross-tabulated by the responses to question 9 (see table 16.1)

Number of children          History of backache with previous pregnancies
by previous pregnancies     0      1 (not applicable)    2 (no)    3/4 (mild/severe)
0                           20     79                    1         1
≥1                          6      4                     32        37

The initial phase of an analysis consists of simple tabulations or graphical presentations of the available data. For example, figure 16.1 is a scatterplot of weight gain in pregnancy versus weight at the start of pregnancy. The most obvious feature of this plot is that one woman has a recorded weight gain of almost 40 kg, about twice that of the woman with the next largest weight gain. This is somewhat suspicious and should be checked. Such extreme values can seriously influence estimation procedures. Also, two women are identified as having a zero weight gain; the weights at the start and end of pregnancy recorded on their questionnaires were identical. Although this is not impossible, such data should also be checked. Since we cannot verify the available data, these three individuals will be omitted from subsequent analyses.
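Checks of this kind on recorded weights are easy to automate. The following is a minimal sketch on hypothetical records, not the study data; the function name and the threshold `ratio=1.5` are assumptions made for illustration, and a flagged record is a candidate for verification rather than an automatic exclusion.

```python
# Hypothetical records: (patient number, start weight in kg, end weight in kg).
records = [
    (1, 55.0, 67.5),
    (2, 62.0, 62.0),    # identical start and end weights
    (3, 48.5, 58.0),
    (4, 70.0, 109.0),   # gain far larger than any other
    (5, 66.0, 85.5),
]


def flag_suspicious(rows, ratio=1.5):
    """Flag zero weight gains and a largest gain that dwarfs the runner-up."""
    gains = {pid: end - start for pid, start, end in rows}
    flagged = {pid: "zero gain" for pid, gain in gains.items() if gain == 0}
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) >= 2:
        (top_pid, top_gain), (_, next_gain) = ranked[0], ranked[1]
        # Assumed rule: the largest gain is at least `ratio` times the next largest.
        if next_gain > 0 and top_gain >= ratio * next_gain:
            flagged[top_pid] = "isolated large gain"
    return flagged


print(flag_suspicious(records))
```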

Table 16.2 is a table of the responses to a question about a history of backache in previous pregnancies and the number of children by previous pregnancies. This highlights another problem with the quality of the available data. Although the response to the question concerning a history of backache in previous pregnancies was supposed to be coded 1, 2, 3 or 4, 26 women coded the value 0. Also, two women with no children by prior pregnancies have recorded codes concerning back pain, and four women with previous pregnancies have recorded responses labelled not applicable. If possible, these responses should also be checked, but here we are forced to make the ‘reasonable’ assumption that the not applicable and zero codes for women with previous pregnancies correspond to no previous backache (coded 2), and we recode the responses for all women with no previous pregnancies as 1 (not applicable). To shorten the discussion, we shall ignore the possibility of miscarriage, etc.
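The recoding rule just described can be written down explicitly. This sketch applies it to the counts of table 16.2; the function and variable names are illustrative, and the combined column 3/4 is carried through as a single label because table 16.2 does not separate mild from severe.

```python
from collections import Counter

# Counts from table 16.2: (children by previous pregnancies, recorded code) -> women.
table_16_2 = {
    ("0", "0"): 20, ("0", "1"): 79, ("0", "2"): 1, ("0", "3/4"): 1,
    (">=1", "0"): 6, (">=1", "1"): 4, (">=1", "2"): 32, (">=1", "3/4"): 37,
}


def recode(children, code):
    """Apply the 'reasonable' assumption described in the text."""
    if children == "0":
        return "1"              # no previous pregnancies: not applicable
    if code in ("0", "1"):
        return "2"              # zero or not applicable, but parous: no backache
    return code                 # codes 2 and 3/4 are left unchanged


revised = Counter()
for (children, code), women in table_16_2.items():
    revised[(children, recode(children, code))] += women

for key in sorted(revised):
    print(key, revised[key])
```

After recoding, all 101 women with no previous pregnancies are coded 1, and the 79 women with previous pregnancies split into 42 coded 2 and 37 coded 3 or 4.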

One aim of these preliminary tabulations, then, is to clean up the data set. This can be a time-consuming operation in a large data set, where the consistency of many variables needs to be checked. However, inconsistencies must be resolved, and this sort of activity is an important component of data analysis.

Table 16.3 is an expanded version of table 16.2 after the miscoded responses have been revised. This table indicates that the majority of the women have had no children, or at most one child, by a previous pregnancy. For the few women with more than one child by previous pregnancies, the degree of back pain does not depend strongly on the number of pregnancies. Therefore, without performing any formal statistical tests, we might conclude that there is little to be gained from a detailed study of the number of children by previous pregnancies. Initially, formal analysis procedures may therefore be restricted to considering the simple binary classification of parous and nulliparous women.
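The reduction to a binary classification of parous and nulliparous women amounts to summing the rows of table 16.3 for one or more previous children. A minimal sketch, with illustrative names and the counts copied from table 16.3:

```python
# Rows of table 16.3: number of children by previous pregnancies ->
# counts for history of backache (not applicable, no, mild, severe).
table_16_3 = {
    0: (101, 0, 0, 0),
    1: (0, 28, 19, 5),
    2: (0, 6, 3, 3),
    3: (0, 3, 4, 0),
    4: (0, 3, 0, 0),
    5: (0, 1, 2, 0),
    6: (0, 0, 0, 1),
    7: (0, 1, 0, 0),
}


def collapse_parity(table):
    """Combine all rows with at least one previous child into a single row."""
    parous_rows = [row for children, row in table.items() if children >= 1]
    parous = tuple(sum(column) for column in zip(*parous_rows))
    return {"nulliparous": table[0], "parous": parous}


print(collapse_parity(table_16_3))
```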

Table 16.3. The revised responses to question 10 cross-tabulated by the responses to question 9 (see tables 16.1, 16.2)

Number of children          History of backache with previous pregnancies
by previous pregnancies     not applicable    no     mild    severe
0                           101               0      0       0
1                           0                 28     19      5
2                           0                 6      3       3
3                           0                 3      4       0
4                           0                 3      0       0
5                           0                 1      2       0
6                           0                 0      0       1
7                           0                 1      0       0

We will not discuss any additional exploratory investigation of these data. Nevertheless, exploratory analysis is important, and we hope some appreciation for this aspect of data analysis has been conveyed.

16.4. Primary Analysis

In many medical studies, there are clearly defined questions of primary interest. In a clinical trial, for example, the treatment comparison is the main purpose of the trial; any additional information is of secondary importance, or has been collected to aid in making a valid treatment comparison. We will assume that, in the back pain example we have been discussing, the primary purpose was an initial study of the influence on backache of unalterable factors such as age and previous pregnancies. This suggests that adjustment may be required for other factors such as weight gain during pregnancy. Regression models are frequently a useful method of analysis in such a situation.

In this study, the response variable, back pain, is of a type we have not previously discussed in the context of regression models. It is discrete, with four categories, and is naturally ordered. Regression models for such data do exist, extending, in principle, the ideas of logistic regression. However, the primary purpose of the analysis may not require the use of such a specialized technique. More important yet, we should consider whether the data warrant a highly sophisticated treatment. Reaction to pain is likely to be very variable among individuals. Because of this, it may not be sensible to use a method of analysis which places importance on the distinction between the back pain categories ‘nil’ and ‘nothing worth troubling about’. Similarly, the distinction between the upper two levels, ‘troublesome’ and ‘severe’, may not represent reliable data. For the primary purpose of the study, therefore, let us divide the back pain variable into two categories which represent the upper and lower two levels of response, assuming that this distinction will be realistic and meet the needs of the analysis. Logistic regression then becomes a natural choice for the method of analysis.

Table 16.4. Current back pain severity versus a history of backache with previous pregnancies

History of backache         Current back pain severity
with previous pregnancies   none    little    troublesome    severe    Total
Not applicable              8       56        28             9         101
None                        5       14        14             9         42
Mild                        0       7         13             8         28
Severe                      0       3         5              1         9
Total                       13      80        60             27        180
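The division of the back pain variable into lower and upper levels can be illustrated with the counts of table 16.4. The sketch below is only a descriptive summary under that grouping; the names are illustrative, and no formal analysis is implied.

```python
# Rows of table 16.4: history of backache ->
# counts for current severity (none, little, troublesome, severe).
table_16_4 = {
    "not applicable": (8, 56, 28, 9),
    "none": (5, 14, 14, 9),
    "mild": (0, 7, 13, 8),
    "severe": (0, 3, 5, 1),
}


def dichotomize(counts):
    """Combine the lower two and the upper two levels of back pain severity."""
    none, little, troublesome, severe = counts
    return none + little, troublesome + severe


for history, counts in table_16_4.items():
    lower, upper = dichotomize(counts)
    print(f"{history:15s} lower={lower:3d} upper={upper:3d} "
          f"proportion upper={upper / (lower + upper):.2f}")
```

The proportion of women in the upper two pain levels is lowest in the not applicable row and higher in the rows for women with previous pregnancies, which is the pattern the binary response variable is meant to capture.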

There are a variety of approaches to the use of logistic regression and the identification of important covariates which should be included in any model. Table 16.4 suggests that there is a relationship between a history of back pain in pregnancy and pain in the pregnancy under study. This variable would likely be included in a model. The inclusion of such variables in a regression model was previously discussed in chapter 15. In this case, to include the information represented by this categorical variable with four levels in the logistic regression model will require three binary covariates. Here, we will let the baseline category correspond to nulliparous women. The three binary covariates are then used to compare women in the three pain categories of none, mild and severe to the nulliparous group. The age of the patient is also a covariate of interest, and would be considered for inclusion in the model.
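To make the coding concrete, here is a minimal sketch of three binary covariates with nulliparous women as the baseline. The coefficients are crude log odds computed directly from the dichotomized counts of table 16.4; they ignore age and are an illustration only, not the estimates reported in table 16.5. All function and variable names are assumptions.

```python
import math

# Dichotomized counts from table 16.4:
# history -> (lower two pain levels, upper two pain levels).
counts = {
    "not applicable": (64, 37),   # nulliparous women, the baseline category
    "none": (19, 23),
    "mild": (7, 21),
    "severe": (3, 6),
}
NON_BASELINE = ("none", "mild", "severe")


def history_covariates(history):
    """Three 0/1 indicators; the baseline category maps to (0, 0, 0)."""
    if history not in counts:
        raise ValueError(f"unknown history category: {history!r}")
    return tuple(int(history == level) for level in NON_BASELINE)


def log_odds(lower, upper):
    """Log odds of being in the upper two pain levels."""
    return math.log(upper / lower)


# Crude intercept and coefficients (log odds ratios relative to the baseline).
intercept = log_odds(*counts["not applicable"])
coefficients = tuple(log_odds(*counts[level]) - intercept for level in NON_BASELINE)


def probability_upper(history):
    """Logistic model probability of troublesome or severe back pain."""
    x = history_covariates(history)
    linear = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-linear))


for history in counts:
    print(history, history_covariates(history), round(probability_upper(history), 3))
```

Because this model contains only the three history covariates, its predicted probabilities simply reproduce the observed proportions in each row; adding a covariate such as age would change the coefficients.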

Table 16.5 presents the results of a logistic regression analysis which incorporates these two variables. The model indicates that the historical covariates are associated with current pain. Notice that the variable comparing parous women with no history of backache to nulliparous women is the least significant of the three historical comparison covariates and has a considerably smaller coefficient than the other two historical variables. If the coefficients for these three variables were comparable, we might suggest using a single vari-
