
6

Doing it right first time – designing a study

Learning objectives

When you have finished this chapter you should be able to:

Explain what a sample is, and what the difference between study and target populations is.

Explain why it is important for a sample to be as representative of the population from which it is taken as possible.

Define a random sample, and explain what a sampling frame is.

Briefly outline what is meant by a contact sample, and by stratified and systematic samples.

Explain the difference between observational and experimental studies.

Explain the difference between matched and independent groups.

Briefly describe case-series, cross-section, cohort and case-control studies, and their limitations and advantages.

Explain the problem of confounding.

Medical Statistics from Scratch, Second Edition David Bowers

© 2008 John Wiley & Sons, Ltd


Outline the general idea of the clinical trial.

Explain the concept of randomisation, and why it is important, and demonstrate that you can use a random number table to perform a simple block randomisation.

Describe the concept of blinding, and what it is intended to achieve.

Outline and compare the design of the parallel and cross-over randomised controlled trials, and summarise their respective advantages and shortcomings.

Explain what intention-to-treat means.

Be able to choose an appropriate study design to answer some given research question.

Hey ho! Hey ho! It’s off to work we go

There are two main threads here. First, the study design question, and second, the data collection question. Study design embraces issues like:

What is the research question? What are we hypothesising?

Which variables do we need to measure?

Which is our main outcome variable (the variable we are most interested in)?

How many subjects need to be included in the study?

Who exactly are the subjects? How should we select them?

How many groups do we need?

Are we going to make some form of clinical intervention or simply observe?

Do we need a comparison group?

At what stage are we going to take measurements? Before, during, after, etc.?

How long will the study take? And so on.

Study design is a systematic way of dealing with these issues, and offers a good-practice blueprint that is applicable in almost all research situations.

Second, the data collection question. Having decided an appropriate study design, we then have to consider the following:

How are we going to collect the data from the subjects?

How do we ensure that the sample is as representative as possible?

I want to start with the data collection question. First, though, a brief mention of what we mean by a population.

Figure 6.1 The target population, the study population and the sample

TARGET POPULATION: All low-birthweight babies born in the UK in 2007.

STUDY POPULATION: All low-birthweight babies born in three maternity units in Birmingham in 2007.

SAMPLE: The last 300 babies born in these three maternity units.

Samples and populations

In clinical research, we usually study a sample of individuals who are assumed to be representative of a wider group, to whom (with a good research design and appropriate sampling) the research might apply. This wider group is known as the target population, for example ‘all low-birthweight babies born in the UK in 2007’.

It would be impossible to study every single baby in such a large target population (or every member of any population). So instead, we might choose to take a sample from a (hopefully) more accessible group. For example, ‘all low-birthweight babies born in three maternity units in Birmingham in 2007’. This more restricted group is the study population. Suppose we take as our sample the last 300 babies born in these three maternity units. What we find out from this sample we hope will also be true of the study population, and ultimately of the target population. The degree to which this will be the case depends largely on the representativeness of our sample. These ideas are shown schematically in Figure 6.1. I’ll have more to say about this process in Chapter 7.
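The nesting of target population, study population and sample can be sketched in a few lines of Python. This is only an illustration: the records, the unit names and the `birth_order` field are invented, and the population is far smaller than a real one. Only the structure (three chosen units, the last 300 babies) follows the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baby:
    baby_id: int
    unit: str         # hypothetical maternity unit name
    birth_order: int  # hypothetical: a higher value means born later


# Hypothetical target population: low-birthweight babies spread over five units.
UNITS = ["Unit A", "Unit B", "Unit C", "Unit D", "Unit E"]
target_population = [
    Baby(baby_id=i, unit=UNITS[i % 5], birth_order=i) for i in range(2000)
]

# Study population: the more accessible subset (three chosen units).
study_units = {"Unit A", "Unit B", "Unit C"}
study_population = [b for b in target_population if b.unit in study_units]

# Sample: the last 300 babies born in the study units.
sample = sorted(study_population, key=lambda b: b.birth_order)[-300:]

print(len(target_population), len(study_population), len(sample))
```

Every baby in `sample` is also in `study_population`, and every baby in `study_population` is also in `target_population`. Whether findings carry outwards in the same way depends on how representative each subset is.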

Exercise 6.1 Explain the differences between a target population, a study population and a sample. Explain, with an example, why it is almost never possible to study every member of a population.

Sampling error

Needless to say, samples are never perfect replicas of their populations, so when we draw a conclusion about a population based on a sample, there will always be what is known as sampling error. For example, if the percentage of women in the UK population with genital chlamydia is 3.50 per cent (we wouldn’t know this of course), and a sample produces a sample percentage of 2.90 per cent, then the difference between these two values, 0.60 percentage points, is the sampling error. We can never completely eliminate sampling error, since it is an inherent feature of any sample.
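The arithmetic in this example, and the way sampling error varies from one sample to the next, can be sketched as follows. The simulation is an assumption-laden illustration, not part of the original study: the sample size of 1000, the seed and the helper names are all invented, and each simulated woman is treated as an independent draw with a 3.50 per cent chance of being positive.

```python
import random


def sampling_error(population_pct: float, sample_pct: float) -> float:
    """Absolute difference between sample and population percentages."""
    return abs(sample_pct - population_pct)


# Figures from the text: population 3.50 per cent, sample 2.90 per cent.
error = sampling_error(3.50, 2.90)
print(f"{error:.2f}")  # prints 0.60


def sample_percentage(true_pct: float, n: int, rng: random.Random) -> float:
    """Percentage of positives in one simulated sample of size n."""
    hits = sum(rng.random() < true_pct / 100 for _ in range(n))
    return 100 * hits / n


rng = random.Random(42)  # arbitrary seed, fixed only for reproducibility
sample_pcts = [sample_percentage(3.50, 1000, rng) for _ in range(5)]
for pct in sample_pcts:
    print(f"sample: {pct:.2f}%  error: {sampling_error(3.50, pct):.2f}")
```

Each simulated sample gives a slightly different percentage, which is the point: in practice the population value is unknown, so the sampling error of any one sample cannot be computed directly.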


Collecting the data – types of sample

Now the data collection question. There are many books wholly dedicated to the various methods of collecting sample data. I am going to do little more than mention a couple of these methods by name. Those interested in more details of the methods referred to should consult other readily available sources.

The simple random sample and its offspring

The most important consideration is that any sample should be representative of the population from which it is taken. For example, if your population has equal numbers of male and female babies, but your sample consists of twice as many male babies as female, then any conclusions you draw are likely to be, at least, misleading. Generally, the most representative sample is a simple random sample. A simple random sample will differ from the population only by chance.

For a sample to be truly random, every member of the population must have an equal chance of being included in the sample. Unfortunately, this is rarely possible in practice, since this would require a complete and up-to-date list (name and contact details) of, for example, every low-birthweight baby born in the UK in 2007. Such a list is called a sampling frame. In practice, compiling an accurate sampling frame for any population is hardly ever going to be feasible!
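If a sampling frame did exist, drawing a simple random sample from it would be straightforward. The sketch below assumes a hypothetical frame of 5000 identifiers and a sample size of 300; both numbers, the identifier format and the seed are invented for illustration.

```python
import random

# Hypothetical sampling frame: a complete list of every population member.
sampling_frame = [f"baby-{i:04d}" for i in range(1, 5001)]

rng = random.Random(2007)  # arbitrary seed, fixed only for reproducibility

# Draw 300 members without replacement; every member of the frame has
# the same chance of being chosen.
simple_random_sample = rng.sample(sampling_frame, k=300)

print(len(simple_random_sample))
print(simple_random_sample[:5])
```

The hard part in practice is not the draw itself but building `sampling_frame`, which is exactly the obstacle described above.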

The same problem also applies to two close relatives of simple random sampling – systematic random sampling, and stratified random sampling. In the former, some fixed fraction of the sampling frame is selected, say every 10th or every 50th member, until a sample of the required size is obtained. Provided there are no hidden patterns in the sampling frame, this method will produce samples as representative as a random sample. In stratified sampling, the sampling frame is first broken down into strata relevant to the study, for example men and women; or non-smokers, ex-smokers and smokers. Then each separate stratum is sampled using a systematic sampling approach, and finally these strata samples are combined. But both methods require a sampling frame.
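Both methods can be sketched in Python. The frame below is hypothetical (1000 people with invented smoking statuses in made-up proportions), and the random starting point within the first interval is an assumption of this sketch rather than something the text specifies.

```python
import random
from collections import Counter


def systematic_sample(frame, step, rng):
    """Take every `step`-th member, starting at a random point in the first interval."""
    start = rng.randrange(step)
    return frame[start::step]


def stratified_sample(frame, stratum_of, step, rng):
    """Split the frame into strata, sample each systematically, then combine."""
    strata = {}
    for member in frame:
        strata.setdefault(stratum_of(member), []).append(member)
    combined = []
    for members in strata.values():
        combined.extend(systematic_sample(members, step, rng))
    return combined


rng = random.Random(6)  # arbitrary seed, fixed only for reproducibility

# Hypothetical sampling frame of (id, smoking status) pairs, shuffled so that
# the list order carries no hidden pattern.
statuses = ["non-smoker"] * 500 + ["ex-smoker"] * 200 + ["smoker"] * 300
rng.shuffle(statuses)
frame = list(enumerate(statuses))

every_tenth = systematic_sample(frame, step=10, rng=rng)
by_status = stratified_sample(frame, lambda m: m[1], step=10, rng=rng)

print(len(every_tenth))
print(Counter(status for _, status in by_status))
```

With every 10th member taken from each stratum, the stratified sample reproduces the strata proportions of the frame exactly (50, 20 and 30 people here), whereas the plain systematic sample matches them only approximately.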

Contact or consecutive samples

The need for an accurate sampling frame makes random sampling impractical in any realistic clinical setting. One common alternative is to take as a sample individuals in current or recent contact with the clinical services, such as consecutive attendees at a clinic. For example, in the study of stress as a risk factor for breast cancer (Table 1.6), the researchers took as their sample 332 women attending a clinic at Leeds General Infirmary for a breast lump biopsy.

Alternatively, researchers may study a group of subjects in situ, for example on a ward, or in some other setting. In the nit lotion study (Table 2.1), researchers took as their sample all infested children from a number of Parisian primary schools, based on the high rates of infestation in those same schools the previous year.

If your sample is not a random sample, then the obvious question is, ‘How representative is it of the population?’ And, moreover, which population are we talking about here? In the breast cancer study, if the researchers were confident that their sample of 332 women was