2.4. The Mann-Whitney test

Suppose two independent random samples are to be used to compare two populations. We may be unwilling to make assumptions about the form of the underlying population probability distributions, or we may be unable to obtain exact values of the sample measurements. If the data can be ranked in order of magnitude in either of these situations, the Mann-Whitney test (sometimes called the Mann-Whitney U test) can be used to test the hypothesis that the probability distributions associated with the two populations are identical.

Assume that, apart from any possible differences in central location, the two population distributions are identical. Suppose that \(n_1\) observations are available from the first population and \(n_2\) observations from the second population. The two samples are pooled and the observations are ranked in ascending order, with ties assigned the average of the next available ranks. Let \(R_1\) denote the sum of the ranks of the observations from the first population. The Mann-Whitney statistic is

\[U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1\]

In testing the null hypothesis that the central locations of the two population distributions are the same, we assume that the two population distributions are identical. It can be shown that if the null hypothesis is true, the random variable \(U\) has mean

\[E(U) = \mu_U = \frac{n_1 n_2}{2}\]

and variance

\[Var(U) = \sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}\]

Then for large sample sizes (both at least 10), the distribution of the random variable

\[Z = \frac{U - \mu_U}{\sigma_U}\]

is well approximated by the standard normal distribution.
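The computation of the statistic and its normal approximation can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `mann_whitney_z` is hypothetical, and the code assumes the rank-sum form \(U = n_1 n_2 + n_1(n_1+1)/2 - R_1\) with null mean \(n_1 n_2/2\) and null variance \(n_1 n_2 (n_1+n_2+1)/12\). That convention reproduces the value -2.08 obtained in the example later in this section, whose inputs (12, 10 and 169.5) are used here.

```python
from math import sqrt


def mann_whitney_z(n1, n2, r1):
    """Return (U, mean, variance, z) from the sample sizes and the rank sum of sample 1."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1    # rank-sum form of U (assumed convention)
    mean_u = n1 * n2 / 2                     # mean of U under the null hypothesis
    var_u = n1 * n2 * (n1 + n2 + 1) / 12     # variance of U under the null hypothesis
    z = (u - mean_u) / sqrt(var_u)           # approximately standard normal for large samples
    return u, mean_u, var_u, z


u, mean_u, var_u, z = mann_whitney_z(12, 10, 169.5)
print(u, mean_u, var_u, round(z, 2))  # 28.5 60.0 230.0 -2.08
```

With this convention a large rank sum for sample 1 gives a small U and therefore a negative standardized value.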

Decision rules for the Mann-Whitney test

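Suppose that two population distributions are identical, apart from any possible differences in central location. In testing the null hypothesis that the two population distributions have the same central location, the following tests have significance level \(\alpha\):

\(H_0\): Two population distributions have the same central location

1. If the alternative hypothesis is the one-sided hypothesis that the location of population 1 is higher than the location of population 2, the decision rule is

Reject \(H_0\) if \(\frac{U - \mu_U}{\sigma_U} < -z_{\alpha}\)

2. If the alternative hypothesis is the one-sided hypothesis that the location of population 1 is lower than the location of population 2, the decision rule is

Reject \(H_0\) if \(\frac{U - \mu_U}{\sigma_U} > z_{\alpha}\)

3. If the alternative hypothesis is the two-sided hypothesis that the two population distributions differ, the decision rule is

Reject \(H_0\) if \(\frac{U - \mu_U}{\sigma_U} < -z_{\alpha/2}\) or \(\frac{U - \mu_U}{\sigma_U} > z_{\alpha/2}\)

Here \(\mu_U\) and \(\sigma_U\) are the mean and standard deviation of \(U\) under the null hypothesis, and \(z_{\alpha}\) is the value that a standard normal random variable exceeds with probability \(\alpha\). A higher location for population 1 produces a large rank sum for sample 1 and hence a small value of \(U\), which is why the first rule rejects for large negative values.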
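The three decision rules can be expressed as a small Python function. This is a sketch under assumptions rather than a definitive implementation: the name `reject_h0` and the labels `"higher"`, `"lower"` and `"two-sided"` are hypothetical, and the sign convention assumes that U is computed by subtracting the rank sum of sample 1, so that a higher location for population 1 gives a negative standardized value. The critical values come from `statistics.NormalDist` in the Python standard library.

```python
from statistics import NormalDist


def reject_h0(z, alpha, alternative):
    """Apply the large-sample decision rule to the standardized statistic z."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "higher":      # population 1 located higher than population 2
        return z < -std_normal.inv_cdf(1 - alpha)
    if alternative == "lower":       # population 1 located lower than population 2
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":   # the two distributions differ
        z_half = std_normal.inv_cdf(1 - alpha / 2)
        return z < -z_half or z > z_half
    raise ValueError("alternative must be 'higher', 'lower' or 'two-sided'")


print(reject_h0(-2.08, 0.05, "two-sided"))  # True, since -2.08 < -1.96
```

Passing the alternative as a label keeps the one-sided and two-sided critical values in one place, which makes it harder to mix up \(z_{\alpha}\) and \(z_{\alpha/2}\).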

Example:

Let us demonstrate the methodology of the Mann-Whitney test by using it to conduct a test on the populations of account balances at two branches of a bank. Data collected from two independent simple random samples, one from each branch, are shown in Table 2.2.

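Table 2.2

| Branch 1: sampled account | Branch 1: account balance ($) | Branch 2: sampled account | Branch 2: account balance ($) |
|---|---|---|---|
| 1 | 1 095 | 1 | 885 |
| 2 | 955 | 2 | 850 |
| 3 | 1 200 | 3 | 915 |
| 4 | 1 195 | 4 | 950 |
| 5 | 925 | 5 | 800 |
| 6 | 950 | 6 | 750 |
| 7 | 805 | 7 | 865 |
| 8 | 945 | 8 | 1 000 |
| 9 | 875 | 9 | 1 050 |
| 10 | 1 055 | 10 | 935 |
| 11 | 1 025 | | |
| 12 | 975 | | |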

The first step in the Mann-Whitney test is to rank the combined (pooled) data from the two samples from low to high. Using the combined set of 22 observations shown in Table 2.2, the lowest value of $750 (item 6 of sample 2) is ranked number 1. Continuing the ranking, we have

| Account balance ($) | Item | Rank |
|---|---|---|
| 750 | 6 of sample 2 | 1 |
| 800 | 5 of sample 2 | 2 |
| 805 | 7 of sample 1 | 3 |
| … | … | … |
| 1 195 | 4 of sample 1 | 21 |
| 1 200 | 3 of sample 1 | 22 |

Item 6 of sample 1 and item 4 of sample 2 both have the same account balance, $950. We could give one of these items a rank of 12 and the other a rank of 13, but this could lead to an erroneous conclusion. To avoid this difficulty, the usual treatment for tied data values is to assign each value the rank equal to the average of the ranks associated with the tied items. Thus the tied observations of $950 are both assigned ranks of 12.5. Table 2.3 shows the entire data set with the rank of each observation.
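The tie rule described above can be sketched in Python as follows. The function name `average_ranks` is a hypothetical helper introduced only for illustration, and the two lists are the balances from Table 2.2. The sketch sorts the pooled observations, finds each run of equal values, and gives every member of the run the mean of the positions the run occupies.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


branch1 = [1095, 955, 1200, 1195, 925, 950, 805, 945, 875, 1055, 1025, 975]
branch2 = [885, 850, 915, 950, 800, 750, 865, 1000, 1050, 935]
ranks = average_ranks(branch1 + branch2)
# Item 6 of sample 1 and item 4 of sample 2 are the two tied $950 balances.
print(ranks[5], ranks[len(branch1) + 3])  # 12.5 12.5
```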

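Table 2.3

| Branch 1: sampled account | Branch 1: account balance ($) | Branch 1: rank | Branch 2: sampled account | Branch 2: account balance ($) | Branch 2: rank |
|---|---|---|---|---|---|
| 1 | 1 095 | 20 | 1 | 885 | 7 |
| 2 | 955 | 14 | 2 | 850 | 4 |
| 3 | 1 200 | 22 | 3 | 915 | 8 |
| 4 | 1 195 | 21 | 4 | 950 | 12.5 |
| 5 | 925 | 9 | 5 | 800 | 2 |
| 6 | 950 | 12.5 | 6 | 750 | 1 |
| 7 | 805 | 3 | 7 | 865 | 5 |
| 8 | 945 | 11 | 8 | 1 000 | 16 |
| 9 | 875 | 6 | 9 | 1 050 | 18 |
| 10 | 1 055 | 19 | 10 | 935 | 10 |
| 11 | 1 025 | 17 | | | |
| 12 | 975 | 15 | | | |
| Sum of ranks | | 169.5 | | | 83.5 |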

The next step in the Mann-Whitney test is to sum the ranks for each sample. These sums are shown in Table 2.3. The test procedure can be based upon the sum of the ranks for either sample. In the following discussion we use the sum of the ranks for the sample from branch 1. We will denote this sum by \(R_1\). Thus, in our example, \(R_1 = 169.5\).
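The two rank sums can be checked with a short Python sketch. It assumes the balances of Table 2.2 and uses a counting shortcut: an observation with L smaller values and E equal values (itself included) in the pooled data has average rank L + (E + 1)/2. The function name `rank_sum` is hypothetical.

```python
def rank_sum(sample, pooled):
    """Sum of the tie-averaged ranks that the sample's values take in the pooled data."""
    total = 0.0
    for x in sample:
        less = sum(1 for y in pooled if y < x)    # pooled values strictly below x
        equal = sum(1 for y in pooled if y == x)  # pooled values equal to x, including x
        total += less + (equal + 1) / 2           # average rank of x
    return total


branch1 = [1095, 955, 1200, 1195, 925, 950, 805, 945, 875, 1055, 1025, 975]
branch2 = [885, 850, 915, 950, 800, 750, 865, 1000, 1050, 935]
pooled = branch1 + branch2

r1 = rank_sum(branch1, pooled)
r2 = rank_sum(branch2, pooled)
print(r1, r2)  # 169.5 83.5
```

A quick consistency check is that the two sums add up to n(n + 1)/2 for the pooled sample size n = 22, that is, 253.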

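With \(n_1 = 12\) observations from branch 1, \(n_2 = 10\) observations from branch 2, and rank sum \(R_1 = 169.5\) for branch 1, the value observed for the Mann-Whitney statistic is

\[U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1 = (12)(10) + \frac{(12)(13)}{2} - 169.5 = 28.5\]

Since under the null hypothesis the two samples are selected from identical populations, and \(n_1\) and \(n_2\) are each 10 or greater, the sampling distribution of \(U\) can be approximated by a normal distribution with mean

\[E(U) = \mu_U = \frac{n_1 n_2}{2} = \frac{(12)(10)}{2} = 60\]

and variance

\[Var(U) = \sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} = \frac{(12)(10)(23)}{12} = 230\]

Suppose that we want to test the null hypothesis that the central locations of the distributions of account balances are identical against the two-sided alternative for \(\alpha = 0.05\). The decision rule is to reject the null hypothesis if

\[\frac{U - \mu_U}{\sigma_U} < -z_{\alpha/2}\]

or

\[\frac{U - \mu_U}{\sigma_U} > z_{\alpha/2}\]

Here

\[\frac{U - \mu_U}{\sigma_U} = \frac{28.5 - 60}{\sqrt{230}} = -2.08\]

and

\[z_{\alpha/2} = z_{0.025} = 1.96\]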

Since -2.08 is less than -1.96, we reject the null hypothesis that the two populations of account balances are identical. Thus we conclude that the probability distribution of account balances at branch 1 is not the same as that at branch 2.

Now, from Table 1 of the Appendix, the cumulative probability of the standard normal distribution corresponding to the value -2.08 is 0.0188, so the corresponding two-sided p-value is 2(0.0188) = 0.0376.
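The table lookup can be reproduced with `statistics.NormalDist` from the Python standard library. This is a small sketch, assuming a two-sided alternative and the rounded statistic -2.08 from the example; the variable names are illustrative only. Because the software does not round the tail probability before doubling it, the final digit of the p-value may differ slightly from the tabled 0.0376.

```python
from statistics import NormalDist

z = -2.08                                           # standardized statistic from the example
lower_tail = NormalDist().cdf(z)                    # P(Z <= -2.08), about 0.0188
p_two_sided = 2 * min(lower_tail, 1 - lower_tail)   # double the smaller tail probability

print(round(lower_tail, 4))
print(p_two_sided)
```

Doubling the smaller tail keeps the calculation valid whether the observed statistic is negative or positive.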

The null hypothesis will be rejected for any significance level higher than 3.76%. Thus, these data contain fairly strong evidence against the hypothesis that the central locations of the account balances at the two branches are the same, and they support the conclusion that the account balances at the two branches are not identical.

Exercises

1. Starting salaries were recorded for ten recent business administration graduates at each of two well-known universities. Use and test the null hypothesis that the difference in starting salaries between the two universities is zero against the alternative that starting salaries are higher for university A.

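| University A: student | University A: monthly salary ($) | University B: student | University B: monthly salary ($) |
|---|---|---|---|
| 1 | 890 | 1 | 1 000 |
| 2 | 950 | 2 | 1 020 |
| 3 | 1 200 | 3 | 1 140 |
| 4 | 1 150 | 4 | 1 000 |
| 5 | 1 300 | 5 | 975 |
| 6 | 1 350 | 6 | 925 |
| 7 | 990 | 7 | 900 |
| 8 | 1 050 | 8 | 1 025 |
| 9 | 1 400 | 9 | 1 075 |
| 10 | 1 450 | 10 | 930 |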

2. The following data show product weights for items produced on two production lines:

Line 1: 13.6; 13.8; 14.0; 13.9; 13.4; 13.2; 13.3; 13.6; 12.9; 14.4

Line 2: 13.7; 14.1; 14.2; 14.0; 14.6; 13.5; 14.4; 14.8; 14.5; 14.3; 15.0; 14.9

Test the null hypothesis that the difference between the product weights for the two lines is zero against the alternative that the product weights of the second line are higher.

Use . Also find the p-value.

3. A random sample of 14 male students and an independent random sample of 16 female students were asked to write essays at the conclusion of a writing course. Their grades are recorded below:

Male: 75; 80; 60; 80; 95; 100; 65; 70; 75; 60; 50; 55; 90; 95

Female: 85; 70; 90; 100; 95; 67; 50; 50; 67; 83; 78; 62; 43; 97; 89; 73

Test, at the 5% significance level, the null hypothesis that, in the aggregate, the male and female students are equally ranked, against a two-sided alternative. Also find the p-value.

4. A random sample of 12 management department graduates and 14 economics department graduates were asked their starting salaries. Those salaries were then ranked from 1 to 26. The following rankings resulted:

Management: 2; 6; 7; 1; 11; 20; 8; 14; 21; 12; 4; 26

Economics: 13; 3; 17; 25; 5; 9; 10; 24; 15; 23; 16; 22; 18; 19

Analyze the data using the Mann-Whitney test, and comment on the results.

5. Starting salaries of graduates from two leading universities were compared. Independent random samples of 40 from each university were taken, and the 80 starting salaries were pooled and ranked. The sum of the ranks for students from one of these universities was 1450. Test the null hypothesis that the central locations of the population distributions are identical against a two-sided alternative.

6. A stock market analyst produced at the beginning of the year a list of stocks to buy and another list of stocks to sell. For a random sample of ten stocks from the “buy list”, percentage returns over the year were as follows:

10.6; 5.2; 12.8; 16.2; 10.6; 4.3; 3.1; 11.7; 13.9; 11.3

For an independent random sample of ten stocks from the “sell list”, percentage returns over the year were as follows:

-2.6; 6.1; 9.9; 11.3; 2.3; 3.9; -2.3; 1.3; 7.9; 10.8

For \(\alpha = 0.05\), use the Mann-Whitney test to interpret these data. Also find and interpret the p-value.

Answers

1. ; reject \(H_0\); 2. ; reject \(H_0\); p-value = 0.3%;

3. ; accept \(H_0\); 4. ; p-value = 12.36%; \(H_0\) will be rejected at all levels higher than 12.36%; 5. ; p-value = 0.101; \(H_0\) will be rejected at any level higher than 10.1%; 6. ; reject \(H_0\) at 5%; p-value = 2.58%.
