
7

From samples to populations – making inferences

Learning objectives

When you have finished this chapter you should be able to:

Show that you understand the difference, and the connection, between a population parameter and a sample statistic.

Explain what statistical inference is.

Explain what an estimate is and why this is unlikely to be exactly the same as the population parameter being estimated.

Statistical inference

You saw in the previous chapter that when we want to discover things that interest us about a population, we take a sample. We then hope to generalise our sample findings, first to the study population and ultimately to the target population. Statisticians call this process of generalising from a sample to a population statistical inference or inferential statistics.

To take an example (Grun et al. 1997): researchers were interested in comparing two methods of screening for genital chlamydia in women attending general practice. Their target population was ‘all asymptomatic women attending general practice’.1 Their study population was four general practices in the London Boroughs of Camden and Islington, with a total of 37 000 patients. All women aged between 18 and 35 were invited to take part in the study. A total study population of 3960 women was eligible for inclusion. After exclusions for various reasons, a total sample of 765 women was finally included. As well as the results of their cervical smear for genital chlamydia, data from a brief questionnaire on demographic details, history of urogenital problems and sexual history were also included in the sample data.

1 They don’t say whether this is all such women in London, or England, or Wales, or the UK!

Medical Statistics from Scratch, Second Edition David Bowers

© 2008 John Wiley & Sons, Ltd

TARGET POPULATION All asymptomatic women aged 18–35 attending general practice; about 2.6 % with genital chlamydia??

STUDY POPULATION 3960 women aged 18–35 in four general practices in Camden and Islington; about 2.6 % with genital chlamydia?

SAMPLE n = 765; 2.6 % with genital chlamydia

Figure 7.1 The process of statistical inference – from sample to population

The prevalence of genital chlamydia in the sample was found to be 2.6 per cent. The authors might then have inferred from this sample result that the prevalence of genital chlamydia in the study population of 3960 women in the four practices was also about 2.6 per cent and, by extension, that the same was true of the target population of all asymptomatic women attending general practice.

The accuracy of this estimate would depend on how typical the 765 women in the sample were of all the 3960 women in the study population, and in turn how typical these women were of all the women in the target population – all women aged 18–35 in the UK attending general practice. This particular statistical inference process is illustrated in Figure 7.1.

I have used the word ‘estimate’2 here deliberately, because the value you get from your sample (from any sample) is very unlikely to be exactly the same as the population value. You have to accept that the percentage with genital chlamydia in the population is probably around 2.6 per cent, give or take a bit. The size of the ‘bit’ depends on how similar your sample is to its population – and on sampling error. I’ll have a lot more to say on this later in the book.
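The ‘give or take a bit’ can be made concrete with a small simulation. What follows is only a sketch under stated assumptions: it supposes a hypothetical study population of 3960 women in which exactly 103 (about 2.6 per cent) have genital chlamydia, and it supposes that each sample of 765 is a simple random sample. The study reports neither of these things, and the names ASSUMED_CASES and sample_percentages are invented for the illustration.

```python
import random

STUDY_POPULATION_SIZE = 3960  # from the text
SAMPLE_SIZE = 765             # from the text
ASSUMED_CASES = 103           # assumption: about 2.6 % of 3960, not a reported figure


def sample_percentages(n_samples, seed=1):
    """Draw repeated simple random samples and return each sample percentage."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # 1 = has the condition, 0 = does not (hypothetical population)
    population = [1] * ASSUMED_CASES + [0] * (STUDY_POPULATION_SIZE - ASSUMED_CASES)
    percentages = []
    for _ in range(n_samples):
        sample = rng.sample(population, SAMPLE_SIZE)  # sampling without replacement
        percentages.append(100 * sum(sample) / SAMPLE_SIZE)
    return percentages


percentages = sample_percentages(1000)
population_percentage = 100 * ASSUMED_CASES / STUDY_POPULATION_SIZE

print(f"Population percentage (assumed): {population_percentage:.2f}")
print(f"Smallest sample percentage:      {min(percentages):.2f}")
print(f"Largest sample percentage:       {max(percentages):.2f}")
print(f"Average of sample percentages:   {sum(percentages) / len(percentages):.2f}")
```

The population value is the same for every sample drawn, yet the sample percentages differ from one sample to the next. That spread is sampling error at work.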

For the moment, the meaning of a few terms. The feature or characteristic of a population whose value you want to determine is known as a population parameter. For example, the mean and the median of some variable in a population are both population parameters. In the genital chlamydia example, the population parameter you want to estimate is the percentage with genital chlamydia.

The value that you get from your sample, in this case the sample percentage with genital chlamydia (on which you are going to base your estimate of the population value) is called the sample statistic. This is why we are so interested in the summary descriptive measures, such as the sample mean and the sample median, described in Chapter 6. In other words, you can use the sample mean, for example, to estimate the population mean, the sample median to estimate the population median and so on.
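The pairing of sample statistic with population parameter can be sketched in a few lines of Python. The ages below are made up purely for illustration (3960 values drawn uniformly between 18 and 35); they are not the study's data, and in real life you would not have the whole population to hand, which is the reason for estimating in the first place.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 3960 women, invented for this example.
population_ages = [rng.randint(18, 35) for _ in range(3960)]

# Population parameters (normally unknown).
population_mean = statistics.mean(population_ages)
population_median = statistics.median(population_ages)

# A simple random sample of 765, and the corresponding sample statistics.
sample_ages = rng.sample(population_ages, 765)
sample_mean = statistics.mean(sample_ages)
sample_median = statistics.median(sample_ages)

print(f"Population mean   {population_mean:.2f}   sample mean   {sample_mean:.2f}")
print(f"Population median {population_median}   sample median {sample_median}")
```

The sample mean and sample median should land close to, but usually not exactly on, the population values they estimate.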

2 An estimate is just a fancy word for an informed guess.


Actually, estimation is not the only way of making inferences about population parameter values. An alternative approach is to hypothesise that a population parameter has a particular value, and then see if the value of the corresponding sample statistic is compatible with your hypothesis. This approach is called hypothesis testing. In Chapters 9 to 11, I am going to discuss some common estimation procedures and in Chapters 12 to 14, I will discuss the alternative hypothesis test approach. First, however, I need to say a few words on probability, and some other related stuff; this I will do in the next chapter.
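To give a flavour of what ‘compatible with your hypothesis’ might mean, here is a rough sketch. It is not the procedure developed in the later chapters, and it rests on several assumptions: the hypothesised prevalence of 5 per cent is made up, the count of 20 cases is back-calculated from 2.6 per cent of 765 rather than taken from the study, and the sample is treated as a simple random sample so that a binomial model applies. The function name prob_at_most is invented for the example.

```python
from math import comb


def prob_at_most(k_observed, n, p):
    """Binomial probability of seeing k_observed or fewer cases in n people
    when the true proportion is p."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_observed + 1)
    )


N_SAMPLE = 765                                  # from the text
ASSUMED_OBSERVED_CASES = round(0.026 * N_SAMPLE)  # about 20; not a reported count
HYPOTHESISED_PREVALENCE = 0.05                  # made-up value for illustration

tail = prob_at_most(ASSUMED_OBSERVED_CASES, N_SAMPLE, HYPOTHESISED_PREVALENCE)
print(f"Chance of {ASSUMED_OBSERVED_CASES} or fewer cases "
      f"if the prevalence were 5 per cent: {tail:.4f}")
```

If that probability is very small, the sample result sits uneasily with the hypothesised value; if it is not small, the two are compatible. How small counts as ‘very small’ is a question for the hypothesis testing chapters.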

Exercise 7.1 (a) Explain the meaning of and the difference between a population parameter and a sample statistic. (b) Why is a sample, however well chosen, never going to be exactly representative of the sampled population? (c) Give a couple of examples that illustrate the difference between a target and a study population.

Exercise 7.2 Give a few reasons why women aged 18–35 in the London boroughs of Camden and Islington may not be typical of all women in London, or of all women in the UK.