Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Словари и журналы / Психологические журналы / p93British Journal of Mathematical and Statistical Psycholog

.pdf
Скачиваний:
48
Добавлен:
29.05.2015
Размер:
265.61 Кб
Скачать

93

British Journal of Mathematical and Statistical Psychology (2002), 55, 93–107

© 2002 The British Psychological Society

www.bps.org.uk

Sampling variability of the success ratio in predictor-based selection

Wilfried De Corte

Department of Data Analysis, Faculty of Psychology, Ghent University, Belgium

In applied settings, the expected beneŽ ts of a predictor-based selection are often expressed in terms of the expected success ratio of the selection. Although Taylor and Russell proposed a formula to estimate its value, simple simulations show that their formula is often inadequate. Also, little is known at present about the sampling variability of the success ratio and, hence, about the accuracy with which the success ratio of any particular selection can be predicted. The above deŽ ciencies are addressed for the three most popular single-stage selection scenarios: the restricted (or Ž xed) quota, the threshold (mastery or Ž xed cut-off) and the mixed quota/threshold decision. For each scenario it is shown how the sampling distribution of the success rate statistic can be derived. The problems related to the numerical evaluation of the distribution are also discussed, and it is shown by means of example applications that the sampling variability of the success ratio may be quite substantial in many real-world predictorbased selections.

1. Introduction

In educational as well as in organizational settings, predictor-based selection refers to the activity in which procedures believed to predict future criterion behaviour are used to decide which candidates to accept from a given pool of applicants. Atypical example is personnel selection, where new employees are usuallyselected top-down on the basis of their performance on such predictors as interviews, psychological tests, sample work tasks, and so on.

To assess the expected beneŽts of a predictor-based selection either the expected average criterion score of the selected applicants or the expected success ratio of the selection can be used (Boudreau, 1991; Cronbach & Gleser, 1965). Although the former

* Requests for reprints should be addressed to Wilfried De Corte, Faculty of Psychology, H. Dunantlaan 1, B-9000 Ghent, Belgium (e-mail: wilfried.decorte@rug.ac.be).

94 Wilfried De Corte

index is at the core of the models of selection untility (Boudreau, 1991; Brogden, 1946; Cronbach & Gleser, 1965; De Corte, 1994) developed over the past 50 years, it is inappropriate for situations in which future criterion behaviour is assessed in a dichotomous maner. Yet these situations occur quite regularly in both educational and organizational settings in that the criterion behaviour of students and employees is often judged to be either successful or not (Cascio, 1980; Raju, Cabrera, &Lezotte, 1996; Raju, Steinhaus, Edwards, & DeLessio, 1991). In such cases the expected success ratio, which refers to the expected proportion of selected individuals whose level of future criterion behaviour will be at least equal to a pre-established cut-off level, is the only adequate index of the selection beneŽts.

Until recently, the expected success ratio of predictor-based selection has been estimated by means of the Taylor and Russell (1939) formula which is based on the assumption that the predictor, X, and the criterion, Y, have a standard bivariate normal distribution and that the selection is top-down. As shown by De Corte (1999), however, the approach is often inadequate, even when the underlying assumption is satisŽed. For example, and as can be veriŽed by simple simulation, the formula overestimates the success ratio that one may expect to obtain for a decision in which a pre-established number of candidates are selected top-down from a given Žnite pool of applicants. Also, little is known at present about the sampling variability of the success ratio and, hence, about the accuracy with which the success ratio of any particular selection can be predicted.

The purpose of the present paper is to resolve the deŽciencies described above for the three most popular single-stage selection scenarios: the restricted (or Žxed) quota, the threshold (masteryor Žxed cut-off) and the mixed quota/threshold decision (Cascio, 1998; Cronbach & Gleser, 1965; Gatewood & Feild, 1987). More speciŽcally, it is shown for each scenario how the Taylor and Russell (1939) assumption on the bivariate normality of the predictor criterion score distribution can be used to derive the exact sampling distribution of the success ratio statistic. Problems related to the numerical evaluation of the distribution are also discussed, and it is shown by means of example applications that the sampling variabilityof the success ratio may be quite substantial in many real-world selection applications.

Although the paper focuses primarily on the success ratio associated with a single application of the selection, the results obtained are also basic to the assessment of multiple selection applications over time. More speciŽcally, it is shown how the results on the selection of a single cohort provide the means to estimate the expected value and the standard error of the success ratio achieved over a series of independent cohort selections. In addition, an indication is given of how the present framework can deal with the situation in which some of the selected individuals drop out of the process because they decline the proposed offer.

The variability in selection success that derives from uncertainty with regard to the exact value of parameters such as the population predictor validity and the number of recruited applicants is not addressed in this paper. As this issue has been extensively discussed elsewhere (e.g. Alexander & Barrick, 1987; Cronshaw, Alexander, Wiesner, & Barrick, 1987; Rich & Boudreau, 1987), the integration of the present developments with existing approaches in order to handle uncertainty in these parameters is only brie•y discussed. Finally, no attempt is made to generalize the Žndings from single-stage selection situations to more general multistage selection settings. Although these hurdle-based scenarios are not uncommon in the actual practice of, for example, personnel selection (cf. Milkovich & Boudreau, 1997), the

G(N + 1)

Sampling variability

95

presently available results on order statistics are insufŽcient to handle these more complex selection decisions.

2. Restricted quota selection decision

2.1. Expected value of the success ratio

In restricted (or Žxed) quota selection a pre-established number of candidates, n, is selected top-down from a Žnite number of applicants, N, by means of a procedure X that predicts the criterion Y with validity r, the correlation between X and Y. The success ratio of the selection, S, can therefore be conceived of as the average of n

Bernoulli variables, S1, . . . , Sn , that express success (Si =

1) or failure (Si =

0) on the

criterion for the individual with the ith largest predictor score: S =

 

n

 

 

i= 1 Si/n. Also, the

(

 

=

 

)

 

probability that the criterion

probability that Si equals 1, P

Si

 

1

 

, corresponds to the

 

P

 

 

behaviour of individual i, Yi ,

 

is at least equal to

a pre-established

level, yc :

P(Si = 1) = P(Yi $yc).

 

 

 

 

 

 

 

 

 

To obtain an expression for the expected success ratio, E(S), it is henceforth assumed that the N applicants constitute a simple random sample from the relevant applicant population. Following Taylor and Russell (1939), it is also assumed that, in the population, the predictor and the criterion scores have a standard bivariate normal distribution with correlation parameter r. So, although the predictor validity is Žxed at the population level, its value may vary from one sample to the next.

The assumptions imply that the conditional probability of success for an applicant with a predictor score X equal to x can be expressed as 1 ± F( yc; rx; 1 ± r2), where F( yc ; rx; 1 ± r2) is the value of the normal distribution function with mean rx and variance 1 ± r2 evaluated at the cut-off value yc that distinguishes between success and failure on the criterion. Also, by a change of the variable of integration, one obtains that

F( yc; rx; 1

 

± rx

!,

± r2) = FÁ yc1 ± r2

 

 

 

 

 

 

p

where F(·) is the standard normal distribution function.

Thus, using the symbol Xi to denote the predictor score of the ith highest scoring

individual, the conditional density of Si , given Xi = xi , is

.

Fc1 ± r2!#

 

"1 ± FÁ c1 ± r2!#

 

y

± rxi

 

si

 

y

± rxi

 

si

 

p

 

 

p

 

Furthermore, since the predictor scores of the N applicants constitute a simple random sample from the standard normal distribution, the marginal density of the ith largest predictor score, fi(Xi = xi ), can be written as (cf. Stuart & Ord, 1994, p. 475)

fi (Xi = xi ) =

f(xi )[F(xi )]N± i [1 ± F(xi )]i± 1

,

B(N ± i + 1, i)

 

 

where f(·) denotes the standard normal density and

B(N ± i + 1, i) = G(N ± i + 1)G(i)

is the beta function, with G(·) the gamma function.

96 Wilfried De Corte

In combination, the above results indicate that the expected success ratio associated with the top-down selection of n individuals from a total of N candidates can be expressed as

 

n

 

 

 

 

 

 

E(S) =

Xi= 1 E(Si)/n

 

 

1 ± r2!#fi(xi ) dxi.

(1)

= n i=

1

± ¥ "1 ± FÁ

 

c

 

 

n

 

+ ¥

p

 

 

1

X

 

 

 

 

 

 

y

 

± rxi

 

Also, a theorem proven by Gillett (1991) implies that for Žnite n, and thus for all real Žxed quota selection situations, the value of E(S) will be less than that predicted by Taylor and Russell (1939). Using xc to denote the predictor cut-off score that corresponds to the ratio n/N (i.e., xc = F± 1(1 ± n/N)), and F2(xc, yc ) to represent the upper tail probability of the standard bivariate normal distribution evaluated at the cut-offs xc and yc , the latter success rate can be expressed as F2(xc, yc )/[1 ± F(xc )]. Thus, only for the limiting case where n !¥ (and the hiring rate n/N is maintained) will E(S) converge to the value of the success rate as expressed by the Taylor and Russell (1939) formula.

2.2. Standard error of the success ratio

The standard result on the variance of a sum of random variables implies that the variance of the success ratio, Var(S), can be written as

 

 

n

 

n

 

1

Xi 1

 

XXi< j

Var(S) =

n2

( =

Var(Si ) + 2

Cov(Si , Sj)),

where Cov(Si, Sj) denotes the covariance between the success/failure variables Si and Sj associated with the ith and the jth highest scoring individuals, respectively. Also, Var(Si) is equal to E(Si )(1 ± E(Si )), whereas

Cov(Si, Sj) = E(SiSj) ± E(Si)E(Sj),

with E(Si Sj) the expected value of the product SiSj. Since the latter expected value equals the probability that both Si and Sj are equal to one, E(Si Sj) can, for i < j, be determined as

E(SiSj) =

± ¥

± ¥"1 ±

FÁ c

1 ±

r2

!#"1

±

FÁ

1 ± r2

!#gi,j(xi, xj) dxj dxi , (2)

 

+ ¥

xi

 

y

 

±

rxi

 

 

 

 

yc

± rxj

 

 

where

 

 

 

 

 

 

p

 

 

p

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

gi,j (xi , xj) =

f(xi)f(xj)[F(xj )]N± j[F(xi ) ± F(xj)] j± i± 1[1 ± F(xi )]i± 1

 

 

 

 

B(N ± j + 1,

 

j ± i)B(N ± i + 1, i)

 

 

 

 

 

 

 

 

 

is the joint density of the ith and the jth largest predictor score in the sample of the N applicants (cf. Stuart & Ord, 1994).

AFortran77 computer program is available from the author to compute E(S) as well

as the standard error of the statistic, SE(S) =

 

Var(S). The program calls routines from

the Numerical Algorithms Group (NAG)

Mark 19 library (NAG, 1999) to estimate the

 

p

upper tail probabilities of the standard normal distribution and to perform the integrations detailed in (1) and (2). However, when the latter routines are not available, they

Sampling variability

97

can be replaced by comparable ones, such as those documented in the GAMS directory accessible at http://math.nist.gov (Boisvert, Howe, & Kahaner, 1991).

2.3. Sampling distribution of the success rate

With a Žnite number of selected individuals, n, the overall success rate of the selection, S, can take n + 1 different values 0, 1/n, . . . , n/n. Also for an arbitary value, k/n (k = 0, . . . , n), the probability that S equals k/n, P(S = k/n), will be equal to the sum

of ( nk ) elementary probabilities, where each elementary probability corresponds to a set

of values si for the n binary variables Si such that

i si = k. More speciŽcally, using the

symbols S and s to denote the vector of the n

Bernoulli variables and the vector of values

 

P

of these

 

random variables respectively, the probability P(S = k/n) will be equal to

P

s1

 

. . . sn

 

i si

 

k

 

P

 

 

 

s= k P(S =

 

s), where the summation subscript indicates summation over all sets of

values

 

,

 

,

 

such that

P

=

 

, and

 

(S = s) denotes the probability that S equals

the value pattern s.

 

 

 

 

 

 

 

With the above notation, the sampling distribution function, F(S = k/n), can be written as

 

k

 

F(S = k/n) =

Xl= 0 1X¢s= l P(S = s).

(3)

Also, from the conditional density of the Si variables and the joint density of the n largest

predictor scores one obtains that the probability P(S = s) is equal to

 

 

P(S = s) =

± ¥

±

1¥ · · · ±n¥1

(i= 1 f(xi ))[F(xn)]N± n

 

 

 

 

 

 

+ ¥

x

x ±

 

 

n

 

 

 

 

 

 

 

 

 

 

´ ( =

"1 ± FÁ

 

Y

"FÁ

 

1 ± r2!#

 

)G(N ±

 

 

 

 

c1 ± r2!#

c

 

n + 1) dxn . . . dx1,

 

 

n

 

y

 

± rxi

 

si

y

± rxi

 

si

 

G(N + 1)

 

 

 

Yi 1

p

 

p

 

 

 

(4)

where the second term in braces represents the conditional probability that S = s given the predictor score values x1, . . . , xn, and the remainder expresses the joint density of the n highest predictor scores from a sample of size N.

As indicated by (4), the numerical evaluation of the sampling distribution function F(S = k/n) is feasible only for a small number of selected individuals. Even using highly sophisticated (multidimensional) quadrature routines, such as those included in the NAG(1999) library, the computations require an excessive amount of time for n > 5. For a smaller number of selected applicants, a Fortran77 program that uses the Korobov– Conroy number-theoretic method (Korobov, 1963) to approximate the iterated integral of (4) is available from the author.

3. Threshold selection decisions

3.1. Success ratio or number of sucessfully selected

In threshold selection all candidates with scores at least equal to a pre-established predictor cut-off value, xc , are selected. With a Žnite number of applicants this implies the possibility that none of the candidates is selected and that, as a consequence, the success ratio of the selection, S, is not deŽned.

98 Wilfried De Corte

To remove this difŽculty one can either introduce the number of successfully selected candidates as a new index of success, or maintain the initial success rate statistic with the restriction that it applies only to situations in which at least one candidate is selected. The latter option is henceforth adopted because the success ratio is the most popular index of success.

3.2. Sampling distribution

With threshold selection the number of selected candidates, n, follows a binomial distribution with parameters N, the number of candidates, and 1 ± F(xc), the probability that an applicant scores at least equal to the predictor cut-off xc: n, b(N, 1 ± F(xc )). The density function of n is henceforth denoted as b(n| N, 1 ± F(xc )). The assumptions introduced earlier implyalso that a selected applicant has a probability F2(xc , yc)/[1 ± F(xc)] of being successful (i.e., of obtaining a score at least equal to the criterion cut-off yc ). Therefore, the conditional density of m, the number of successfully selected candidates, given n, is also binomial, (m| n) ,b(n, F2(xc, yc )/[1 ± F(xc )]); and b((m| n)| n, F2(xc, yc )/[1 ± F(xc )]) represents the corresponding density function. So, in threshold selection the density function of the success ratio S, f (S = s), is equal to

f (S = s) =

 

b(n| N, 1 ±

F(xc))b((m| n)| n, F2

(xc, yc )/[1 ± F(xc )])

,

(5)

m/n= s

1 ± [F(xc )]n

 

X

1#n#N

where the Žrst summation subscript (m/n = s) indicates summation over all different sets of values for m (with m #n) and n such that m/n = s, the second subscript (1 #n #N) refers to the condition that at least one and no more than N candidates are selected, and the denominator assures that f (S = s) is a proper density function.

Obviously, (5) is not required to determine the expected success ratio of the threshold selection, E(S), because its value is equal to the probability that any one of the selected applicants will be successful. Thus, when at least one candidate is selected, E(S) = F2(xc, yc )/[1 ± F(xc )], the success rate proposed by the Taylor and Russell (1939) formula. The density of (5) is useful, however, to evaluate the standard error and the distribution function of the success rate statistic. Alternatively, since the variance of the

success ratio will be equal to E(S)(1 ±

E(S))/n where n applicants are selected, the

standard error of the success rate can also be computed as

sE(S)(1 ± E(S))

n± 1 b(n| N, 1 ± F(xc))/n.

PN

1± [F(xc )]N

4.Mixed quota/threshold decision

It may happen that neither the Žxed quota nor the Žxed cut-off (or threshold) scenario offers an adequate representation of the actually implemented selection decision. This is particularly true when one decides to select top-down a maximum of n candidates provided that these candidates obtain scores at least equal to a given predictor cut-off value xc. Because the latter scenario is guided by both quota and cut-off considerations, it will be referred to as the mixed quota/threshold or the mixed quota/cut-off decision. Also, to ensure that the success ratio is deŽned, it is again understood that at least one candidate has a predictor score that is no less than the predictor cut-off value xc.

Sampling variability

99

4.1. Expected value and standard error

To estimate the expected value of the success ratio in a mixed quota/threshold selection, it is observed that the population of simple random samples of size N, where at least one candidate scores no less than xc, can be partitioned into two mutually exclusive subsets. The Žrst subset contains samples in which no more than n applicants have a predictor score that is at least equal to xc , while the second subset groups the remaining samples. Since in the Žrst subset all applicants with a predictor score that is at least equal to xc are selected, the expressions for the expected value and the standard error are similar to those obtained for the threshold selection condition. More speciŽcally, using the subscript (1) to indicate the Žrst subset of samples,

E(1)(S) = F2(xc , yc) 1 ± F(xc )

and

 

 

E

(S)(1 ± E

(S))

n

b(i| N, 1 ± F(x

))/i

 

 

 

(1)

 

(1)

 

i= 1

c

 

 

SE(

) =

s

 

n

 

 

± F(xc ))

 

.

1

 

 

i= 1 b(i| N, 1

 

 

 

P

 

 

P

 

 

 

 

 

 

 

 

 

 

 

 

For the second subset, the expected value and the standard error of the success ratio can be determined as

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1

n

 

 

 

 

 

 

 

 

 

 

and

 

 

 

 

 

 

E(2)(S) =

 

n Xi= 1 E(2)(Si)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

n

 

 

 

 

 

 

 

 

 

 

n

 

 

 

 

 

 

 

 

 

 

 

1

 

Xi 1

 

 

 

 

 

 

 

 

XXi< j

 

 

 

SE(22)(S) = Var(2)(S) =

n2

( =

 

Var(2)(Si ) +

2

Cov(2)(Si , Sj))

 

respectively, where

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

¥

 

xi

 

 

 

 

 

 

 

 

 

 

 

p

 

 

 

 

 

 

 

E(2)

Si

 

xc

xc

 

 

 

 

 

 

 

 

N

 

 

 

 

 

 

 

 

(

 

) =

[1 ± F(( yc ± rxi )/ 1 ± r2)] gi,n+ 1(xi , xn+ 1) dxn+ 1dxi

,

 

 

 

 

 

E(2)(SiP

i= n+ 1 b(i| N, 1 ± F(xc ))

 

 

Var(2)(Si ) =

E(2)(Si )[1 ±

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

)],

 

 

 

 

 

 

 

 

 

 

 

 

 

Cov(2)(Si , Sj) =

E(2)(Si Sj) ±

E(2)(Si )E(2)(Sj)

 

 

 

 

 

 

 

 

 

 

and, for i < j,

 

 

 

xc

xc

xc

"1 ±

 

FÁ

 

 

 

 

 

!#"1 ± FÁ

 

!#

 

E(2)(Si Sj) =

 

 

c

1 ±

r2

1 ± r2

 

 

 

 

 

¥

xi

xj

 

 

 

 

 

 

 

y

±

 

rxi

 

 

 

 

yc

± rxj

 

 

 

 

 

 

 

 

g

 

x

 

 

x

 

 

p

 

 

 

 

p

 

 

 

 

 

´

 

 

i, j,n+ 1

(

i

,

 

j

, x

n

+ 1

)

dxn+

1 dxj dxi ,

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

N

 

 

 

 

 

 

 

 

 

 

F(xc))

 

 

 

 

 

 

 

 

Pi= n+ 1 b(i| N, 1 ±

 

 

 

 

 

 

 

 

 

with gi,j,n+ 1(xi , xj, xn+ 1) the joint density of the ith, jth and (n +

1)th largest predictor

scores in the sample of N applicants.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Finally, as the probabilities of obtaining a sample from the Žrst and the second subset

n

F(xc ))

N

F(xc ))

respectively, the expected

are Pi= 1 b(i| N, 1 ±

and Pi= n+ 1 b(i| N, 1 ±

100 Wilfried De Corte

value of the success rate, E(S), in the mixed quota/threshold selection scenario is equal to

 

 

¡P

 

 

 

¢

 

 

 

¡Pc

 

 

 

¢

 

 

 

 

 

 

 

 

n

b(i| N, 1

± F(x

))

 

E

(S) +

N

 

b(i| N, 1

± F(x

))

E

 

(S)

 

E(S) =

 

 

i= 1

 

 

c

 

 

(1)

 

 

i= n+ 1

 

c

 

(2)

 

;

 

 

 

 

 

 

 

 

1 ± [F(x )]n

 

 

 

 

 

 

while the variance of the success ratio, Var (S), can be expressed as

 

¢

 

 

 

 

¡P

n

 

 

 

¢

 

 

 

¡Pc

N

 

 

 

 

 

 

Var(S) =

 

 

 

i= 1 b(i| N, 1

± F(xc)) Var(1)(S) +

i= n+ 1 b(i| N,

1 ± F(xc))

Var(2)(S)

 

¡P

 

 

 

 

 

 

1 ±

[F(x )]n

 

 

 

 

 

 

 

+

 

 

1 ± [F(xc¢)]n

 

 

 

 

 

 

 

 

 

 

 

 

 

 

n

 

b(i| N, 1 ± F(x

)) (E

(S) ± E(S))2

 

 

 

 

 

 

 

 

 

 

 

i= 1

 

 

c

 

 

(1)

 

 

 

 

 

 

 

 

 

 

+

¡Pi= n+ 1 b(i| N, 1 ± F(xc))¢(E(2)(S) ± E(S))2

 

 

 

 

 

 

 

 

 

 

N

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1± [F(xc )]n

4.2.Sampling distribution

To obtain the distribution function of the success ratio in the mixed condition, it is observed that the probability that the success ratio S equals s, P(S = s), is the sum of a number ofelementaryprobabilities, denoted as P(S(l) = s(l)), where S(l) and s(l) represent a vector of l Bernoulli variables and a vector of values for these variables, respectively.

More speciŽcally, P(S = s) will in mixed quota/threshold selection be equal to

l

X1 n

X

P(S = s) =

 

P(S(l) = s(l)),

 

m/l= s

1¢s(l) = m

 

= ,...,

 

m= 0,...,l

 

where the Žrst summation is over all combinations of values for the number selected l (with l = 1, . . . , n) and the number of sucessfully selected m (with m = 0, . . . , l) such that m/l = s, and the second summation groups all possible ways in which the values s(il) of s(l) can sum to m. Also, for 1 #l < n,

 

 

 

 

 

 

 

 

+ ¥

 

 

x1

 

 

 

 

xl± 1

 

xc

 

 

 

 

 

 

 

 

 

 

 

 

 

P(S(l) = s(l)) = f 1 ± [F(xc)]n g ± 1 xc

xc

· · · xc

± ¥

 

 

 

 

 

 

 

 

 

 

l

 

 

 

 

y

 

 

± rxi

 

si(l)

 

 

 

y

 

 

± rxi

 

si(l)

 

 

 

8

1

± F

 

 

c

 

 

 

 

F

 

c

 

 

 

9

 

<i= 1"

Á

 

 

1 ± r

 

!# " Á

 

 

1 ± r

 

!#

 

 

 

=

 

Y

 

 

p

 

 

 

 

 

 

 

p

 

 

 

 

 

 

 

 

 

:

l+ 1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

G(N + 1)

 

 

 

 

 

 

 

;

 

´ (

 

f(xi))[F(xl+ 1)]N± l± 1

 

 

dxl+ 1 . . . dx1,

(6)

=

G(N ± l)

while, for l = n,

Yi 1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

xc

xc

· · ·

xc

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

P(S(l) = s(l)) = f 1 ± [F(xc)]n g ± 1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

+ ¥

 

 

 

x1

 

 

 

 

xl± 1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

8

1 ± F

 

 

 

 

c

 

 

 

 

 

 

 

 

 

F

 

 

 

c

 

 

 

 

 

 

 

 

 

 

9

 

<i=

1"

 

Á

 

 

1 ± r

!#

si(l)

" Á

 

 

 

1 ± r

!#

1

± si(l)

=

 

l

 

 

 

 

y

 

± rxi

 

 

 

 

y

 

± rxi

 

 

 

Y

 

 

 

p

 

 

 

 

 

 

 

 

 

p

 

 

 

 

 

 

 

:

l

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

G(N +

1)

 

 

 

 

 

 

 

 

 

 

;

 

´ (i=

f(xi))[F(xl+ 1)]N± l

 

 

dxl . . . dx1.

(7)

G(N ± l + 1)

 

Y1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Sampling variability 101

A Fortran77 program written by the author to compute the probabilities P(S = s) uses the Korobov–Conroy number-theoretic method to evaluate the multidimensional integrals of (6) and (7). Although the program can be used only when no more than Žve candidates are selected, it is still of practical value because many decisions, especiallyin the context of personnel selection, relate to the hiring of only a small number of applicants. When more than Žve candidates need to be selected, as is quite normal in educational contexts, the exact sampling distribution can be approximated using techniques that are described in Section 8 below.

5. Application: Success ratio of personnel selection

The example application relates to a typical personnel selection decision in which a predictor with a validity(r) of 0.35 is used to hire a limited number of employees from a given pool of candidates. All three previously discussed selection scenarios are considered, and for each scenario the expected value, the standard error and the exact sampling distribution function of the success ratio are derived for three different levels of selectivity and a variety of values for the number of selected applicants. To ensure comparability between the three selection scenarios, the different levels of selectivity are expressed in terms of the hiring rate n/N for the Žxed quota and the mixed quota/cut-off conditions, or the corresponding standard normal deviate for the Žxed cut-off situation. Thus, given the numbers N and n, the corresponding predictor cut-off xc is determined as F± 1(1 ± n/N), and the Žxed quota selection of n employees from N applicants is compared to the Žxed cut-off scenario with predictor cut-off value xc and to the mixed quota/cut-off scenario where at most n applicants are hired provided that theyscore at least xc on the predictor. In terms of the hiring rate, the three levels of selectivity are 0.5, 0.2 and 0.1, with corresponding predictor cut-off values of 0.0, 0.842 and 1.282, respectively. Also, in all calculations a newly hired employee is considered successful when he/she has a criterion score yc at least equal to 0.5, which means that the performance of a new employee must be at least half a standard deviation above the mean criterion performance in the entire applicant population in order to be successful.

Table 1 summarizes the results of the calculations for the expected value and the standard error of the success ratio. The reported values conŽrm that the expected success ratio of a Žxed quota selection is always smaller than that obtained for the corresponding Žxed cut-off decision. This is also true for the two cases (where N = 100, n = 10 and N = 250, n = 50, respectively) where the values of the expected success ratio are reported as equal because in both cases the equality merely expresses the fact that the difference between the corresponding estimates is less than 0.001. In addition, the results show that the Žxed cut-off scenario performs less well than the mixed quota/ cut-off scenario. As expected, the difference between the scenarios becomes smaller with a growing number of new employees, indicating that the three scenarios become more similar in that the expected success ratio converges to that of the Žxed cut-off decision when n tends to inŽnity.

Alternatively, when the number to be hired is small, the standard errors of the success ratio reported in Table 1 indicate that the success ratio obtained may vary considerably from one application to the other. This is further exempliŽed by the plots of the sampling distribution functions in Fig. 1, where each plot corresponds to the indicated combination of values for the number of applicants, N, and the number of selected candidates, n.

102 Wilfried De Corte

Table 1. Expected value (E) and standard error (SE) of the success ratio for the Ž xed quota, Ž xed cutoff and mixed quota/cut-off selection scenarios when r = 0.35 and yc = 0.50

 

 

 

 

 

 

 

Selection scenario

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Fixed quota

 

Fixed cut-off

Mixed quota/cut-off

Hiring

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

rate

N

n

 

E

SE

 

E

SE

E

SE

 

 

 

 

 

 

 

 

 

 

0.5

2

1

0.379

0.485

0.408

0.449

 

0.424

0.494

 

4

2

0.391

0.347

0.408

0.372

 

0.421

0.393

 

10

5

0.401

0.221

0.408

0.235

 

0.418

0.245

 

20

10

0.405

0.157

0.408

0.160

 

0.416

0.166

 

100

50

0.408

0.070

0.408

0.070

 

0.412

0.072

0.2

5

1

0.461

0.499

0.495

0.444

 

0.512

0.500

 

10

2

0.477

0.355

0.495

0.380

 

0.508

0.404

 

25

5

0.487

0.225

0.495

0.248

 

0.504

0.259

 

50

10

0.491

0.160

0.495

0.166

 

0.503

0.172

 

250

50

0.495

0.071

0.495

0.071

 

0.499

0.073

0.1

10

1

0.516

0.500

0.548

0.439

 

0.563

0.496

 

20

2

0.530

0.354

0.548

0.378

 

0.559

0.402

 

50

5

0.540

0.224

0.548

0.250

 

0.556

0.260

 

100

10

0.544

0.159

0.548

0.166

 

0.554

0.173

 

500

50

0.547

0.071

0.548

0.071

 

0.551

0.073

 

 

 

 

 

 

 

 

 

 

 

 

Within the plots, three different sampling distribution functions are drawn, one for each of the different selection scenarios. To illustrate the usage of these plots, consider the situation where four employees are selected from a pool of 40 candidates. In that case, the Žgure indicates that at least one in every six implementations of the selection may result in success ratio of no more than 0.25: P(S #0.25) = 0.259, 0.165 and 0.221 for the Žxed quota, the Žxed cut-off and the mixed quota/cut-off scenario, respectively. Also, when only a few individuals are selected, the plots show that there is a substantial chance that none of them will be successful.

6. Multiple independent cohort selections

When an institution adopts a new predictor (or predictor composite) to improve selection, it usually intends to implement the predictor on more than one occasion. As a consequence, the overall number of people selected on the basis of the predictor may become quite large, and it may therefore be surmised that the gains predicted by the present and the traditional formulae will converge. This conclusion is not justiŽed, however.

Consider, for example, the situation where over a period of ten years an organization hires two new employees each year using the same predictor and an identical hiring rate (i.e., ten identical quota decisions). In that case, the expected success ratio over all 20 employees selected will be exactly equal to the expected success ratio associated with any one of the ten selections. More generally, with independent cohort selections, the