  1. Read and translate the texts:

Before you begin to think about the three Ws – which statistical test, when to use it, and why – you need to know the meaning of five indispensable words: population, sample, parameter, statistic and variable. Briefly, a population is the complete set of cases that are of interest to you; a sample is the smaller part you have selected from the population to examine; a parameter is a numerical measure that describes some characteristic of a population; a statistic is a numerical measure that describes some characteristic of your sample; and a variable is any attribute, trait or characteristic of interest that varies in different circumstances or between cases.
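The relationship between these terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 1,000 item prices is invented, and the names `population`, `sample`, `population_mean` and `sample_mean` are hypothetical, chosen for this sketch.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: the prices (in pounds) of all 1,000 items of interest.
population = [round(random.uniform(1.0, 5.0), 2) for _ in range(1000)]

# A sample is the smaller part selected from the population to examine.
sample = random.sample(population, 50)

# The variable here is "price": it varies between cases.
# A parameter describes the population; a statistic describes the sample.
population_mean = statistics.mean(population)  # parameter
sample_mean = statistics.mean(sample)          # statistic

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, because the whole population cannot be measured, and the statistic is used to estimate it.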

Variables and data

Variables are the building blocks for the construction of your analysis. Once you have the cases of your sample, as far as statistics is concerned, you are going to produce a set of numbers related to whichever of their characteristics you are interested in. Samples are made up of individual cases, and these could be people, cars, months of the year, departments of an organisation, or whatever else you are interested in researching. Naturally, you will collect data about the cases in the sample for each of your variables – for example, colour, price, employment contract and so on. And, also naturally, the data for each individual case in the sample may differ on these variables – some may be red in colour and others may be blue, some may have a price of £1.99 and others £2.59, some may be on permanent employment contracts and others on fixed-term employment contracts.

As a result, one of the things that you need to look at when you examine the cases in your sample is how they vary among themselves on those variables for which you have collected data. Such variables will enable you to distinguish one individual person or object from another, and to place them into categories. (Categories such as ‘motorcycle manufacturer’ or ‘gender of employee’ are called nominal variables, from the Latin nominalis, because names are given to the different categories that the variable can take. More on this later.)

Variables can be classified using the type of data they contain (Figure 2.1). The most basic of these classifications divides variables into two groups on the basis of whether the data relate to categories or numbers:

  • Categorical, also termed qualitative, where the data are grouped into categories (sets) or placed in rank order.

  • Numerical, also termed quantitative, where the data are numbers, being either counts or measures.

Statistics textbooks usually develop this classification further, offering a hierarchy of measurements. In ascending order of numerical precision, variables may be:

  • Nominal, where the data are grouped into descriptive categories by name. Although the categories cannot be ranked, the number of occurrences in each can be counted. Examples include gender (male, female) and department (marketing, human resources, sales, …).

  • Ordinal, where the relative position of each case within the data is known, giving a definite order and indicating where one case is ranked relative to another. Examples include social class (upper, middle, lower) and competition results (first, second, third, …).

  • Interval, where the difference between any two values in the data can be stated numerically, but not the relative difference. This is because the value zero does not represent none or nothing – or, as statisticians would say, it is not a ‘true zero’. For instance, 20°C is 10 degrees warmer than 10°C, but it is not twice as warm. Examples include temperature in Celsius (3°C, 4.5°C, 5°C, …) and time of day (00.34, 06.04, 19.59, …).

  • Ratio, where the relative difference, or ratio, between any two values in the data can be calculated. This is because the value zero represents none or nothing. Examples include number of customers (9, 10, 88, …), height of people (1.54 metres, 2.10 metres, …) and annual salary in euros (27,540, 38,000, …).
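The practical consequence of this hierarchy is that each level supports more operations than the one before it. The sketch below illustrates this with invented data; the variable names and values are assumptions made for the example, and the Kelvin conversion is added only to show why a ratio of Celsius values is not meaningful.

```python
from collections import Counter

# Nominal: categories can be counted, but not ranked.
departments = ["marketing", "sales", "sales", "human resources", "sales"]
department_counts = Counter(departments)
most_common_department = department_counts.most_common(1)[0][0]

# Ordinal: the order is known, so cases can be sorted and a median found,
# but the distance between ranks is not defined.
rank_order = {"lower": 0, "middle": 1, "upper": 2}
social_classes = ["middle", "upper", "lower", "middle", "lower"]
sorted_classes = sorted(social_classes, key=rank_order.get)
median_class = sorted_classes[len(sorted_classes) // 2]

# Interval: differences are meaningful, ratios are not (no true zero).
celsius_a, celsius_b = 10.0, 20.0
difference = celsius_b - celsius_a   # 10 degrees: meaningful
naive_ratio = celsius_b / celsius_a  # 2.0, but NOT "twice as warm"
# On the Kelvin scale, which has a true zero, the ratio is far smaller.
kelvin_ratio = (celsius_b + 273.15) / (celsius_a + 273.15)

# Ratio: zero means none, so ratios are meaningful.
customers_monday, customers_tuesday = 10, 20
customer_ratio = customers_tuesday / customers_monday  # twice as many customers

print(most_common_department, median_class, difference, customer_ratio)
print(f"naive Celsius ratio: {naive_ratio:.2f}, Kelvin ratio: {kelvin_ratio:.3f}")
```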

Finally, data can be classified in terms of the actual values that data for a specified variable can take:

  • Discrete, where the data can only take on certain values. Examples include number of customers and annual salary.

  • Continuous, where the data can take on any value (sometimes, as with height of people, within a finite range). Examples include height of people and temperature.