Encyclopedia of Sociology, Vol. 3

NONPARAMETRIC STATISTICS

distribution theory was developed during this time period.

A number of texts have been published recently (see, e.g., Conover 1999; Daniel 1990; Hollander and Wolfe 1999; Krauth 1988; Neave and Worthington 1988; Siegel and Castellan 1988; Sprent 1989). Some of these texts can be used without an extensive statistical background; they have excellent bibliographies and provide adequate examples of assumptions, applications, scope, and limitations of the field of nonparametric statistics. In addition, the Encyclopedia of Statistical Sciences (Kotz and Johnson 1982–1989) and the International Encyclopedia of Statistics (Kruskal and Tanur 1978) should serve as excellent sources of reference material pertaining to nonparametric statistics.

The literature on nonparametric statistics is extensive. The bibliography published in 1962 by Savage had approximately 3,000 entries. More recent bibliographies have made substantial additions to that list.

TESTS AND TECHNIQUES

Nonparametric statistics may be divided into three major categories: (1) noninferential statistical measures; (2) inferential estimation techniques for point and interval estimation of parametric values of the population; and (3) hypothesis testing, which is considered the primary purpose of nonparametric statistics. (Estimation techniques included in the category above are often used as a first step in hypothesis testing.) These three categories include different types of problems dealing with location, dispersion, goodness-of-fit, association, runs and randomness, regression, trends, and proportions. They are presented in Table 1 and illustrated briefly in the text.

Table 1, which includes a short list of some commonly used nonparametric statistical methods and techniques, is illustrative in nature. It is not intended to be an exhaustive list; the literature contains scores of nonparametric tests. More exhaustive tables are available in the literature (e.g., Hollander and Wolfe 1999). The six columns in the table describe the nature of the sample, and the eight categories of rows identify the major types of problems addressed in nonparametric statistics. Types of data used in nonparametric tests are not included in the table, though references to levels of data are made in the text. Tables that relate tests to different types of data levels are presented in some texts (e.g., Conover 1999). A different type of table provided by Bradley (1968) identifies the family to which the nonparametric derivations belong.

The first column in Table 1 consists of tests involving a single sample. The statistics in this category include both inferential and descriptive measurements. They would be used to decide whether a particular sample could have been drawn from a presumed population, or to calculate estimates, or to test the null hypothesis. The next column is for two independent samples. The independent samples may be randomly drawn from two populations, or randomly assigned to two treatments. In the case of two related samples, the statistical tests are intended to examine whether both samples are drawn from the same (or identical) populations. The cases of k (three or more) independent samples and k related samples are extensions of the two-sample cases.

The eight categories in the table identify the main focus of problems in nonparametric statistics and are briefly described later. Only selected tests and techniques are listed in Table 1. Log linear analyses are not included in this table, although they deal with proportions and meet some criteria for nonparametric tests. The argument against their inclusion is that they are rather highly developed specialized techniques with some very specific properties.

It may be noted that: (1) many tests cross over into different types of problems (e.g., the chi-square test is included in three types of problems); (2) the same probability distribution may be used for a variety of tests (e.g., in addition to association, proportion, and goodness-of-fit, the chi-square approximation may also be used in Friedman's two-way analysis of variance and the Kruskal-Wallis test); (3) many of the tests listed in the table are extensions or modifications of other tests (e.g., the original median test was later extended to three or more independent samples, as in the Jonckheere test); (4) the general assumptions and procedures that underlie some of these tests have been extended beyond their original scope (e.g., Hájek's extension of the Kolmogorov-Smirnov test to regression analysis, and the extension of the two-sample Wilcoxon test to testing the parallelism between two linear regression slopes); (5) many of these tests have corresponding techniques of confidence interval estimation, only a few of which are listed in Table 1; (6) many tests have other equivalent or alternative tests (e.g., when only two samples are used, the Kruskal-Wallis test is equivalent to the Mann-Whitney test); (7) sometimes similar tests are lumped together in spite of differences, as in the case of the Mann-Whitney-Wilcoxon test, the Ansari-Bradley-type tests, or the multiple comparison tests; (8) some tests can be used with one or more samples, in which case the tests are listed in one or more categories, depending on common usage; (9) most of these tests have analogous parametric tests; and (10) a very large majority of nonparametric tests and techniques are not included in the table.

Table 1. Selected Nonparametric Tests and Techniques

Location
    One sample: Sign test; Wilcoxon signed-ranks test; confidence interval based on the sign test; confidence interval based on the Wilcoxon matched-pairs signed-ranks test.
    Two independent samples: Mann-Whitney-Wilcoxon rank-sum test; permutation (Fisher-Pitman) test; Terry-Hoeffding and van der Waerden/normal scores tests; Tukey's confidence interval.
    Two related, paired, or matched samples: Sign test; Wilcoxon matched-pairs signed-rank test.
    k independent samples: Extension of the Brown-Mood median test; Kruskal-Wallis one-way analysis of variance test; Jonckheere test for ordered alternatives; multiple comparisons.
    k related samples: Friedman two-way analysis of variance.

Dispersion (scale problems)
    Two independent samples: Siegel-Tukey test; Moses's ranklike tests; normal scores tests; tests of the Freund, Ansari-Bradley, David, or Barton type.

Goodness-of-fit
    One sample: Chi-square goodness-of-fit test; Kolmogorov-Smirnov test; Lilliefors test.
    Two independent samples: Chi-square test; Kolmogorov-Smirnov test.
    k independent samples: Chi-square test; Kolmogorov-Smirnov test.

Association
    One sample: Spearman's rank correlation; Kendall's tau-a, tau-b, tau-c; Olmstead-Tukey corner test.
    Two independent samples: Chi-square test of independence; phi coefficient; Yule coefficient; Goodman-Kruskal coefficients; Cramer's statistic; point biserial coefficient.
    Two related, paired, or matched samples: Spearman rank correlation coefficient; Kendall's tau-a, tau-b, tau-c; Olmstead-Tukey corner test.
    k independent samples: Chi-square test of independence.
    k related samples: Kendall's coefficient of concordance; Kendall's partial rank correlations; Kendall's coefficient of agreement.

Runs and randomness
    One sample: Runs test; runs above and below the median; runs up-and-down test.
    Two independent samples: Wald-Wolfowitz runs test.

Regression
    Two independent samples: Hollander and Wolfe test for parallelism; confidence interval for the difference between two slopes.
    k independent samples: Brown-Mood test.

Trends and changes
    One sample: Cox-Stuart test; Kendall's tau; Spearman's rank correlation coefficient; runs up-and-down test.
    Two related, paired, or matched samples: McNemar change test.

Proportions and ratios
    One sample: Binomial test.
    Two independent samples: Fisher's exact test; chi-square test of homogeneity.
    k independent samples: Chi-square test of homogeneity.
    k related samples: Cochran's Q test.

Only a few of the commonly used tests and techniques are selected from Table 1 for illustrative purposes in the sections below. The assumptions listed for the tests are not meant to be exhaustive, and hypothetical data are used in order to simplify the computational examples. Discussion of the strengths and weaknesses of these tests is also omitted. Most of the illustrations are either two-tailed or two-sided hypotheses at the 0.05 level. Tables of critical values for the tests illustrated here are included in most statistical texts. Modified formulas for ties are not emphasized, nor are measures of estimates illustrated. Generally, only simplified formulas are presented. A very brief description of the eight major categories of problems follows.

Location. Making inferences about location of parameters has been a major concern in the field of statistics. In addition to the mean, which is a parameter of great importance in the field of inferential statistics, the median is a parameter of great importance in nonparametric statistics because of its robustness. The robust quality of the median can be easily ascertained. If the values in a sample of five observations are 5, 7, 9, 11, 13, both the mean and the median are 9. If two observations are added to the sample, 1 and 94 (an outlier), the median is still 9, but the mean is changed to 20. Typical location problems include estimating the median, determining confidence intervals for the median, and testing whether two samples have equal medians.
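The robustness comparison above can be reproduced in a few lines. This is an illustrative sketch using Python's standard library, with the same observations as in the text.

```python
from statistics import mean, median

# the five symmetric observations from the text
sample = [5, 7, 9, 11, 13]
print(mean(sample), median(sample))      # both center measures are 9

# add the two extra observations, including the outlier 94
extended = sample + [1, 94]
print(mean(extended), median(extended))  # the mean shifts to 20; the median stays 9
```

The outlier drags the mean far from the bulk of the data while leaving the median untouched, which is exactly the robustness property the text describes.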

Sign Test. This is the earliest known nonparametric test. It is also one of the easiest to understand intuitively, because the test statistic is based on the number of positive or negative differences, or signs, from the hypothesized median. A binomial probability test can be applied to a sign test because of the dichotomous nature of the outcomes, which are specified by a plus (+) sign, indicating a difference in one direction, or a minus (−) sign, indicating a difference in the other direction. Observations with no change or no difference are eliminated from the analysis. The sign test may be a one-tailed or a two-tailed test. A sign test may be used whenever a t-test is inappropriate because the actual values are missing or not known but the direction of change can be determined, as in the case of a therapist who believes that her client is improving. The sign test uses only the direction of change, not the magnitude of the differences in the data.
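Because the sign test reduces to a binomial probability with p = 1/2, its p-value can be computed directly. The sketch below uses hypothetical counts (8 improvements out of 10 non-tied observations); the data are illustrative, not from the article.

```python
from math import comb

n, n_plus = 10, 8           # non-tied observations and the number of + signs
k = max(n_plus, n - n_plus)
# two-sided p-value: twice the binomial tail probability at p = 1/2
p_value = 2 * sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
print(p_value)              # 0.109375, so not significant at the 0.05 level
```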

Wilcoxon Matched-Pairs Signed-Rank Test. The sign test analysis includes only the positive or negative direction of the difference between two measures; the Wilcoxon matched-pairs signed-rank test also takes into account the magnitude of the differences in ordering the data.

Example: A matched sample of students in a school were enrolled in diving classes with different training techniques. Is there a difference? The scores are listed in Table 2.

Illustrative Assumptions: (1) The random sample data consist of pairs; (2) the differences in pair values have an ordered metric or interval scale, are continuous, and are independent of one another; and (3) the distribution of differences is symmetric.

Hypotheses: A two-sided test is used in this example.

H0: Sum of positive ranks = sum of negative ranks in the population.    (1)

H1: Sum of positive ranks ≠ sum of negative ranks in the population.

Test statistic or procedures: The differences between the pairs of observations are obtained and ranked by magnitude. T is the smaller of the sum of ranks with positive or negative signs. Ties may be either eliminated or the average value of the ranks assigned to them. The decision is based on the value of T for a specified N. Z can be used as an approximation even with a small N except in cases with a relatively large number of ties. The formula for Z may be substituted when N > 25.

Z = [T − N(N + 1)/4] / √[N(N + 1)(2N + 1)/24]    (2)

This formula is not applicable to the data in Table 2 because the N is < 25 and the calculations in Table 2 will be used in deciding whether to reject or fail to reject (‘‘accept’’) the null hypothesis. In this example in Table 2, the N is 7 and the value of the smaller T is 9.5.

Decision: The researchers fail to reject the null hypothesis (or ‘‘accept’’ the null hypothesis) of no difference between the two groups, with an N of 7 at the 0.05 level, for a two-sided test, concluding that there is no statistically significant difference in the two types of training at the 0.05 level.

Efficiency: The asymptotic relative efficiency of the test varies around 95 percent, depending on the sample sizes.


Table 2. Total Scores for Five Diving Trials

Pair   X (Team A)   Y (Team B)   Y − X   Signed rank   Negative ranks (T−)
1          37           35         −2        −1               1
2          39           46         +7        +4
3          32           24         −8        −5.5             5.5
4          21           34        +13        +7
5          20           28         +8        +5.5
6           9           12         +3        +2
7          14            9         −5        −3               3

T+ = 18.5; T− = 9.5
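The T statistic in Table 2 can be checked mechanically. The sketch below ranks the absolute differences, assigning average ranks to ties as the text describes, and sums the positive and negative ranks.

```python
# Y - X differences for the seven pairs in Table 2
diffs = [-2, 7, -8, 13, 8, 3, -5]
abs_sorted = sorted(abs(d) for d in diffs)

def avg_rank(value):
    # average the 1-based positions of tied absolute differences
    positions = [i + 1 for i, a in enumerate(abs_sorted) if a == value]
    return sum(positions) / len(positions)

t_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
t_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
print(t_plus, t_minus)   # 18.5 9.5; T is the smaller sum, 9.5
```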

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Related parametric test: The t-test for matched pairs.

Analogous nonparametric tests: Sign test; randomization test for matched pairs; Walsh test for pairs.

Kruskal-Wallis One-Way Analysis of Variance Test. This is a location measure with three or more independent samples. It is a one-way analysis of variance that utilizes ranking procedures.

Example: The weight loss in kilograms for 13 patients randomly assigned to one of three diet programs is listed in Table 3, along with the rankings. Is there a significant difference in the sample medians?

Table 3. Diet Programs and Weight-Loss Rankings

Group 1   Rank      Group 2   Rank      Group 3   Rank
2.8        3        2.2        1        2.9        4
3.5        7        2.7        2        3.1        6
4.0       11        3.0        5        3.7        9
4.1       12        3.6        8        3.8       10
4.9       13

R1 = 46             R2 = 16             R3 = 29

Illustrative Assumptions: (1) Ordinal data; (2) three or more random samples; and (3) independent observations.

Hypotheses: A two-sided test without ties is used in this example.

H0: Md1 = Md2 = Md3. The populations have the same median values.    (3)

H1: Md1 ≠ Md2 ≠ Md3. The populations do not all have the same median value.

Test statistics or procedures: The procedure is to rank the values, compute the sums of those ranks for each group, and calculate the H statistic. The formula for H is as follows:

H = [12 / (N(N + 1))] Σ (Ri² / Ni) − 3(N + 1), summed over the k samples    (4)

where Ni = the number of cases in the ith sample and Ri = the sum of ranks in the ith sample.

H = [12 / (13(13 + 1))] [(46)²/5 + (16)²/4 + (29)²/4] − 3(13 + 1) = 45.9857 − 42    (5)

H = 3.99

Decision: Do not reject the null hypothesis, as the chi-square value for 2 df at the 0.05 level is 5.99 and the H value of 3.99 is less than the critical value.

Efficiency: The asymptotic relative efficiency of the Kruskal-Wallis test relative to the F test is 0.955 if the population is normally distributed.

Related parametric test: F test. Analogous nonparametric test(s): Jonckheere test for ordered alternatives.

Friedman Two-Way Analysis of Variance. This is a nonparametric two-way analysis of variance based on ranks and is a good substitute for the parametric F test when the assumptions for the F test cannot be met.

Example: Three groups of telephone employees from each of the work shifts were tested for their ability to recall fifteen-digit random numbers under four conditions or treatments of sleep deprivation. The observations and rankings are listed in Tables 4 and 5. Is there a difference in the population medians?

Fτ = [12 / (Nk(k + 1))] Σ Rj² − 3N(k + 1), summed over the k columns    (6)

where N = number of rows (subjects), k = number of columns (variables, conditions, or treatments), and Rj = the sum of ranks in the jth column.

Fτ = [12 / (3(4)(4 + 1))] [(11)² + (6)² + (3)² + (10)²] − 3(3)(4 + 1)    (7)

Fτ = (0.20)(266) − 45 = 8.2    (8)

Illustrative Assumptions: (1) There is no interaction between blocks and treatments; and (2) ordinal data with observable magnitude or interval data are needed.

Hypotheses:

H0: Md1 = Md2 = Md3 = Md4. The different levels of sleep deprivation do not have differential effects.    (9)

H1: One or more equality is violated. The different levels of sleep deprivation have differential effects.

Test statistic or procedures: The formula and computations are listed above.

Decision: The critical value at the 0.05 level of significance in this case, for N = 3 and k = 4, is 7.4. Reject the null hypothesis because the Fτ value of 8.2 is higher than the critical value. Conclude that the ability to recall is affected.

Efficiency: The asymptotic relative efficiency of this test depends on the nature of the underlying population distribution. With k = 2 (number of samples), the asymptotic relative efficiency is reported to be 0.637 relative to the t test, and it is higher for larger numbers of samples. In the case of three samples, for example, the asymptotic relative efficiency increases to 0.648 relative to the F test, and in the case of nine samples it is at least 0.777.

Related parametric test: F test.

Analogous nonparametric tests: Page test for ordered alternatives.

Mann-Whitney-Wilcoxon Test. A combination of different procedures is used to calculate the probability of two independent samples being drawn from the same population, or from two populations with equal means. This group of tests is analogous to the t-test; it uses rank sums and can be used with fewer assumptions.

Example: Table 6 lists the verbal ability scores for a group of boys and a group of girls who are less than 1 year old. (The scores are arranged in ascending order for each of the groups.) Do the data provide evidence for significant differences in verbal ability of boys and girls?
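Using the scores from Table 6, the rank sums and U statistics for this example can be sketched as follows (average ranks are assigned to the tied scores of 30):

```python
boys = [10, 15, 18, 28]                       # sample A, N1 = 4
girls = [12, 14, 20, 22, 25, 30, 30, 31, 32]  # sample B, N2 = 9
pooled = sorted(boys + girls)

def avg_rank(value):
    # average the 1-based positions of tied scores in the pooled ordering
    positions = [i + 1 for i, x in enumerate(pooled) if x == value]
    return sum(positions) / len(positions)

N1, N2 = len(boys), len(girls)
R1 = sum(avg_rank(v) for v in boys)   # 19.0, the sum of ranks for the boys
U1 = N1 * N2 + N1 * (N1 + 1) / 2 - R1 # 27.0
U2 = N1 * N2 - U1                     # 9.0, and U1 + U2 = N1 * N2 = 36
```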


Table 4. Scores of Three Groups by Four Levels of Sleep Deprivation

Conditions    I    II   III   IV
Group 1       7     4     2    6
Group 2       6     4     2    9
Group 3      10     3     2    7

Table 5. Rank of Three Groups by Four Levels of Sleep Deprivation

Ranks         I    II   III   IV
Group 1       4     2     1    3
Group 2       3     2     1    4
Group 3       4     2     1    3
Rj           11     6     3   10

 

 

 

 

 

 

 

 

 

 

 

 

 

Illustrative Assumptions: (1) Samples are independent and (2) ordinal data.

Hypotheses: A two-sided test is used in this example.

H0: Md1 = Md2. There are no significant differences in the verbal ability of boys and girls.    (10)

H1: Md1 ≠ Md2. There is a significant difference in the verbal ability of boys and girls.

Test statistic or procedures: Rearrange all the scores in an ascending or descending order (see Table 7). The test statistics are U1 and U2, and the calculations are illustrated below.

Mann-Whitney Wilcoxon U Test. The following formulas may be used to calculate U:

U1 = N1N2 + [N1(N1 + 1) / 2] − R1    (11)

U2 = N1N2 + [N2(N2 + 1) / 2] − R2    (12)

R1 = 1 + 4 + 5 + 9 = 19

R2 = 2 + 3 + 6 + 7 + 8 + 10.5 + 10.5 + 12 + 13 = 72    (13)

R1 and R2 refer to the sums of ranks for group 1 and group 2, respectively.

U2 = (4)(9) + [9(9 + 1) / 2] − 72 = 9, and U1 = 27, for U1 + U2 = N1N2 = 36

Decision: Retain the null hypothesis. At the 0.05 level, we fail to reject the null hypothesis of no differences in verbal ability. The rejection region for U in this case is 4 or smaller, for sample sizes of 4 and 9, respectively.

Z can be used as a normal approximation if N > 12, or N1 or N2 > 10, and the formula is given below:

Z = [R1 − R2 − (N1 − N2)(N + 1) / 2] / √[N1N2(N + 1) / 3]    (14)

Efficiency: For large samples, the asymptotic relative efficiency approaches 95 percent.

Related Parametric Test: F test.

Analogous Nonparametric Tests: Behrens-Fisher problem test, robust rank-order test.

Dispersion. Dispersion refers to spread or variability. Dispersion measures are intended to test for equality of dispersion in two populations. The two-tailed null hypothesis in the Ansari-Bradley-type tests and the Moses-type tests assumes that there are no differences in the dispersion of the populations. The Ansari-Bradley test assumes equal medians in the population; the Moses test has wider applicability because it does not make that assumption.

Dispersion tests are not widely used because of the limitations imposed on the tests by their assumptions, the low asymptotic relative efficiency of the tests, or both.

Goodness-of-Fit. A goodness-of-fit test is used to test different types of problems—for example, the likelihood of observed sample data's being drawn from a prespecified population distribution, or comparisons of two independent samples

Table 6. Verbal Scores for Boys and Girls Less than 1 Year Old

Boys, N1 (sample A):   10   15   18   28
Girls, N2 (sample B):  12   14   20   22   25   30   30   31   32

being drawn from populations with a similar distribution. The first problem mentioned above is illustrated here using the chi-square goodness-of-fit procedures.

χ2, or the chi-square test, is among the most widely used nonparametric tests in the social sciences. The four major types of analyses conducted through the use of chi-square are: (1) goodness-of-fit tests, (2) tests of homogeneity, (3) tests for differences in probability, and (4) tests of independence. Of the four types, the last one is the most widely used. The goodness-of-fit test and the test of independence will be illustrated in this article because the assumptions, formulas, and testing procedures are very similar to one another. The χ2 test for independence is presented in the section on measures of association.

Goodness-of-fit tests would be used in making decisions based on prior knowledge of the population. For example, sentence length in a new manuscript could be compared with other works of an author to decide whether the manuscript is by the same author; or a manager's observation of a greater number of accidents in the factory on some days of the week, as compared with the average figures, could be tested for significant differences. The expected frequency of accidents given in Table 8 below is based on the assumption of no differences in the number of accidents by days of the week.

Illustrative Assumptions: (1) The data are nominal or of a higher order such as ordinal, categorical, interval or ratio data. (2) The data are collected from a random sample.

Hypothesis:

H0: The distribution of accidents during the week is uniform.    (15)

H1: The distribution of accidents during the week is not uniform.

Test Statistic or Procedures: The formula for calculating this is the same as for the chi-square test of independence. A short-cut formula is also provided and is used in this illustration:

χ2 = Σ (fo − fe)² / fe    (16)

χ2 = Σ (fo² / fe) − N

The notation fo refers to the frequency of actual observations, and fe is the frequency of expected observations.

χ2 = 225/30 + 900/30 + ... + 1600/30 + 2025/30 − N = 230 − 210 = 20    (17)

Decision: With seven observations, there are six degrees of freedom. The critical value of χ2 is 12.59 at the .05 level of significance. Therefore, the null hypothesis of an equal distribution of accidents over the 7 days is rejected at the .05 level of significance.
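The accident computation can be sketched directly from the Table 8 frequencies; both the definitional formula and the short-cut formula give the same value.

```python
observed = [15, 30, 30, 25, 25, 40, 45]  # accidents by day, from Table 8
expected = [30] * 7                       # uniform expectation over the week

# definitional form: sum of (fo - fe)^2 / fe
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# short-cut form: sum of fo^2 / fe, minus N
shortcut = sum(o * o / e for o, e in zip(observed, expected)) - sum(observed)
print(chi2, shortcut)  # both are 20, above the 12.59 critical value for 6 df
```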

Asymptotic Relative Efficiency: There is no discussion in the literature about this because nominal data can be used in this analysis and the test is often used when there are no alternatives available. Asymptotic relative efficiency is meaningless with nominal data.

Related Parametric Test. t test.

Analogous Nonparametric Tests. The Kolmogorov-Smirnov one-sample test, and the binomial test for dichotomous variables.

Table 7. Ranked Verbal Scores for Boys and Girls Less than 1 Year Old

Scores:  10   12   14   15   18   20   22   25   28   30     30     31   32
Rank:     1    2    3    4    5    6    7    8    9   10.5   10.5   12   13
Comp:     A    B    B    A    A    B    B    B    A    B      B      B    B

The Kolmogorov-Smirnov test is another major goodness-of-fit test. It has two versions, the one-sample and the two-sample tests. It differs from the chi-square goodness-of-fit test in that the Kolmogorov-Smirnov test is based on observed and expected differences in cumulative distribution functions and can be used with individual values instead of having to group them.

Association. There are two major types of measures of association. They consist of: (1) measures to test the existence (relationship) or nonexistence (independence) of association among the variables, and (2) measures of the degree or strength of association among the variables. Different tests of association are utilized in the analysis of nominal and nominal data, nominal and ordinal data, nominal and interval data, ordinal and ordinal data, and ordinal and interval data.

Chi-Square Test of Independence In addition to goodness-of-fit, χ2 can also be used as a test of independence between two variables. The test can be used with nominal data and may consist of one or more samples.

Example: A large firm employs both married and single women. The manager suspects that there is a difference in the absenteeism rates between the two groups. How would you test for it? Data are included in Table 9.

Illustrative Assumptions: (1) The data are nominal or of a higher order such as ordinal, categorical, interval, or ratio data. (2) The data are collected from a random sample.

Hypothesis:

H0: The two variables are independent, or there is no difference between married and single women with respect to absenteeism.    (18)

H1: The two variables are not independent (i.e., they are related), or there is a difference between married and single women with respect to absenteeism.

Test statistic or procedures: The formula for χ2 is given below. Differences between observed and expected frequencies are calculated, and the resultant value is indicated below.

χ2 = Σ (fo − fe)² / fe    (19)

fo = observed frequency; fe = expected frequency

The expected frequencies are obtained by multiplying the corresponding column marginal totals by the row marginal totals for each cell and dividing by the total number of observations. For example, the expected frequency for the cell with an observed frequency of 40 is (100 × 100)/400 = 25.

Similarly, the expected frequency for the cell with an observed frequency of 170 is (200 × 300)/ 400=150.

χ2 = (30 − 25)²/25 + (70 − 75)²/75 + (40 − 25)²/25 + (60 − 75)²/75 + (30 − 50)²/50 + (170 − 150)²/150    (20)

χ2 = 1 + 0.33 + 9 + 3 + 8 + 2.67 = 24

 

 

 

df = (number of rows − 1) × (number of columns − 1) = (3 − 1)(2 − 1) = 2    (21)

 

 

 

Decision: As the critical χ2 value with two df is 5.99, we reject the null hypothesis, at the 0.05 level. We accept the alternate hypothesis of the existence of a statistically significant difference in the ratio of absenteeism per year between the two groups of married and single women.
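The example's χ2 value can be sketched from the observed frequencies alone, computing each expected cell count from the marginal totals as described above. The 3 × 2 layout of the cells is inferred from equation (20), since Table 9 itself is not reproduced here.

```python
# observed frequencies, rows = absence categories, columns = marital status
observed = [[30, 70], [40, 60], [30, 170]]
row_totals = [sum(row) for row in observed]        # 100, 100, 200
col_totals = [sum(col) for col in zip(*observed)]  # 100, 300
grand = sum(row_totals)                            # 400

chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / grand  # expected cell count
        chi2 += (fo - fe) ** 2 / fe
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi2, 2), df)  # 24.0 with 2 df, exceeding the 5.99 critical value
```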

Efficiency: The asymptotic relative efficiency of a χ2 test is hard to assess because it is affected by both the number of cells in the contingency table and the sample size. The asymptotic relative efficiency of a 2 × 2 contingency table is very low,

Table 8. Frequency of Traffic Accidents for One Week during May

Day                     S    M    T    W    T    F    S    Total
Traffic accidents      15   30   30   25   25   40   45     210
Expected frequencies   30   30   30   30   30   30   30     210

but the power of the χ2 test approaches 1 as the sample size grows. However, a large number of cells in a χ2 table, especially in combination with large sample sizes, tends to yield large χ2 values that are statistically significant merely because of the size of the sample. In the past, Yates's correction for continuity was often used in a 2 × 2 contingency table if the cell frequencies were small. Because of the criticism of this procedure, the correction is no longer widely used. Other tests, such as Fisher's Exact Test, can be used in cases of small cell frequencies.

Related Parametric Test: There are no clear-cut related parametric tests because the χ2 test can be used with nominal data.

Analogous Nonparametric Tests: The Fisher Exact Test (limited to 2 × 2 tables and small tables) and the median test (limited to central tendencies) can be used as alternatives. In addition, a large number of tests such as phi, gamma, and Cramer’s V statistic, can be used as alternatives, provided the data characteristics meet the assumptions of these tests. The χ2 distribution is used in many other nonparametric tests.

The chi-square tests of contingency tables allow partitioning of tables, combining tables, and using more than two-way tables with control variables.

The second type of association tests measures the actual strength of association. Some of these tests also indicate the direction of the relationship, and the test values in most cases extend from −1.00 to +1.00, indicating a negative or a positive relationship. The values of some other, nondirectional tests fall between 0.00 and 1.00. Contingency table formats are commonly used to measure this type of association. Among the more widely used tests are the following, arranged by the types of data used:

Nominal by Nominal Data:

Phi coefficient—limited to a 2 × 2 contingency table. The square of the test value is used for a proportional-reduction-in-error interpretation.

Contingency coefficient—based on the chi-square values. The lowest limit for this test is 0.00, but the upper limit does not attain unity (a value of 1.00).

Cramer’s V statistic—not affected by an increase in the size of cells as long as it is related to similar changes in the other cells.

Lambda—the range of lambda is from 0.00 to 1.00, and thus it has only positive values.

Ordinal by Ordinal Data:

Gamma—uses ordinal data for two or more variables. Test values are between −1.00 and +1.00.

Somers' D—used for predicting a dependent variable from the independent variable.

Kendall’s tau—described in more detail below.

Spearman’s rho—described in more detail below.

Categorical by Interval Data

Kappa—The table for this test needs to have the same categories in the columns and the rows. Kappa is a measure of agreement, for example, between two judges.
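As a small illustration of a strength-of-association measure, Cramer's V can be computed directly from a χ2 value. This sketch reuses the figures from the absenteeism example (χ2 = 24, N = 400, smaller table dimension 2); the formula V = √(χ2 / (N(m − 1))), with m the smaller of the number of rows and columns, is the standard definition.

```python
import math

chi2, n, min_dim = 24.0, 400, 2  # figures from the chi-square independence example
v = math.sqrt(chi2 / (n * (min_dim - 1)))
print(round(v, 3))               # 0.245, a modest strength of association
```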

The tests described above are intended for two-dimensional contingency tables. Tests for three-dimensional tables have been developed recently in both parametric and nonparametric statistics.

Two other major measures of association referenced above are presented below. They are Kendall’s τ (the forerunner of this test is also one
