
British Journal of Mathematical and Statistical Psychology (2002), 55, 159–168
© 2002 The British Psychological Society
www.bps.org.uk
Estimation of the Wing–Kristofferson model for discrete motor responses
Jarl K. Kampen1 * and Tom A. B. Snijders2
1Katholieke Universiteit Leuven, Belgium
2Rijksuniversiteit Groningen, The Netherlands
A number of estimation methods for the variance components in the Wing–Kristofferson model for inter-response times are examined and compared by means of a simulation study. The estimation methods studied are the method of moments, maximum likelihood, and an alternative approach in which the Wing–Kristofferson model is recognized as a moving average model.
1. Introduction
The Wing–Kristofferson (WK) model (Wing & Kristofferson, 1973) was developed to account for two sources of variation in simple repeated motor responses, such as rhythmic tapping with a pencil on a table: a timing component and a component associated with the motor implementation of the response. Such data concerning repeated motor responses have been used to investigate whether respondents have problems in timing their movements rather than in implementing them. For example, Kooistra, Snijders, Schellekens, Kalverboer, and Geuze (1997) investigated whether motor problems in children with early-treated congenital hypothyroidism, in which the thyroid gland is missing or defective, could be explained in terms of a timing deficit. For this purpose, they analysed the results of the so-called tapping task, in which children are asked to perform rhythmic tapping; the times between taps are recorded and the WK model then fitted. It was concluded that the variance of the motor implementation was significantly higher for children with congenital hypothyroidism than for controls. In this paper, the estimation methods for the WK model are re-examined and some new methods are proposed. The methods are compared by means of a simulation study. We also provide an applied example of the WK model.
Requests for reprints should be addressed to Jarl Kampen, Department Politieke Wetenschappen, Katholieke Universiteit Leuven, E. Van Evenstraat 2A, B-3000 Leuven, Belgium. (e-mail: io@soc.kuleuven.ac.be).

2. The Wing–Kristofferson model for discrete motor responses
Let us call the time elapsing between two taps of a respondent the response interval. Wing and Kristofferson (1973) assume that the taps are paced internally by the moment at which the implementation of each tap is started, and that a motor implementation delay intervenes between the implementation start and the observed tap. The time elapsing from the beginning of the implementation of the first tap until the beginning of the implementation of the second tap is the timing interval of the respondent. Associated with each response interval are the motor delays for the first and second tap. Hence, Wing and Kristofferson (1973) propose the model
$$r_t = c_t + d_t - d_{t-1}, \qquad (1)$$
where r_t is the observed tth response interval, c_t is the timing interval associated with the tth response interval, and d_{t-1} and d_t are the motor delays associated with the preceding and present tap of the tth response interval, respectively. Let T represent the total number of response intervals r_t, t ∈ {1, ..., T}, to be analysed. It is assumed that the c_t are independent and identically distributed (i.i.d.) with a normal distribution N(m, v). The d_t are i.i.d. with a normal distribution N(h, f); the parameter h is unidentifiable and for all purposes may be assumed to be 0 (or any other constant). It is further assumed that the timing intervals and the response delays are independent, which implies Var(r_t) = v + 2f and Cov(r_t, r_{t+1}) = -f.
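To make the implied covariance structure concrete, the following minimal sketch (not part of the paper; it assumes NumPy and uses illustrative parameter values) simulates response intervals from model (1) and checks the two moments numerically.

```python
# Minimal sketch (not from the paper): simulate the WK model (1) and check
# the implied moments Var(r_t) = v + 2f and Cov(r_t, r_{t+1}) = -f.
# Parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
m, v, f, T, n_rep = 1000.0, 50.0, 25.0, 20, 200_000

c = rng.normal(m, np.sqrt(v), size=(n_rep, T))        # timing intervals c_t
d = rng.normal(0.0, np.sqrt(f), size=(n_rep, T + 1))  # motor delays d_0..d_T
r = c + d[:, 1:] - d[:, :-1]                          # r_t = c_t + d_t - d_{t-1}

print("Var(r_t):", r[:, 0].var(), " theory:", v + 2 * f)
print("Cov(r_t, r_{t+1}):", np.cov(r[:, 0], r[:, 1])[0, 1], " theory:", -f)
```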
3. Estimation of the Wing–Kristofferson model
3.1. Moment estimators
Wing and Kristofferson (1973) suggested estimating f by minus the observed lag-one autocovariance, and estimating v by the observed variance of the response intervals minus twice the estimate of f. This follows from equating sample moments to population moments. Kooistra et al. (1997, p. 65) propose a modified estimator in which sample moments are equated to their expected values. The resulting estimators are then given by
$$\hat{v} = \frac{(T-1)^2\big(T\hat{\sigma}^2 + (2T+1)\hat{\rho}\big)}{T^3 - 4T^2 + 4T + 2}, \qquad (2)$$

$$\hat{f} = \frac{-T\big((T-2)\hat{\sigma}^2 + (T-1)^2\hat{\rho}\big)}{T^3 - 4T^2 + 4T + 2}, \qquad (3)$$

where the sample variance σ̂² and the lag-one sample autocovariance ρ̂ are given by

$$\hat{\sigma}^2 = \frac{1}{T-1}\sum_{t=1}^{T}(r_t - \hat{m})^2, \qquad (4)$$

$$\hat{\rho} = \frac{1}{T-1}\sum_{t=1}^{T-1}(r_t - \hat{m}_1)(r_{t+1} - \hat{m}_2), \qquad \hat{m}_1 = \frac{1}{T-1}\sum_{t=1}^{T-1} r_t, \qquad \hat{m}_2 = \frac{1}{T-1}\sum_{t=2}^{T} r_t,$$

and finally

$$\hat{m} = \frac{1}{T}\sum_{t=1}^{T} r_t.$$
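As an illustration (this is not the authors' code, and the precise divisors used in the sample moments are an assumption of the sketch), the two sets of moment estimates can be computed as follows.

```python
# Sketch of the moment estimators; the exact moment estimators follow (2)-(4).
import numpy as np

def wk_moment(r):
    """Wing & Kristofferson (1973): f is minus the lag-one autocovariance,
    v is the sample variance of r minus twice the estimate of f."""
    r = np.asarray(r, dtype=float)
    T = len(r)
    gamma1 = np.sum((r[:-1] - r[:-1].mean()) * (r[1:] - r[1:].mean())) / (T - 1)
    f_hat = -gamma1
    v_hat = r.var(ddof=1) - 2 * f_hat
    return v_hat, f_hat

def exact_moment(r):
    """Exact moment estimators (2) and (3) of Kooistra et al. (1997)."""
    r = np.asarray(r, dtype=float)
    T = len(r)
    m_hat = r.mean()
    s2 = np.sum((r - m_hat) ** 2) / (T - 1)                             # (4)
    rho = np.sum((r[:-1] - r[:-1].mean()) * (r[1:] - r[1:].mean())) / (T - 1)
    den = T**3 - 4 * T**2 + 4 * T + 2
    v_hat = (T - 1) ** 2 * (T * s2 + (2 * T + 1) * rho) / den           # (2)
    f_hat = -T * ((T - 2) * s2 + (T - 1) ** 2 * rho) / den              # (3)
    return v_hat, f_hat
```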

For large T, the exact moment estimators converge to the WK estimates. For small T, both the procedure suggested by Wing and Kristofferson and the method of moments can lead to the estimation of negative variances. Evidence suggests that in practical cases negative variances are computed for up to 30% of the respondents (Ivry & Keele, 1989; Lundy-Ekman, Ivry, Keele, & Woolacott, 1991). This may be regarded as an undesirable feature of these estimators.
3.2. Maximum likelihood estimators
An advantage of maximum likelihood estimators is that negative variances will not occur. Collect the response intervals in the T-vector r = (r_1, ..., r_T)', the timing intervals in the T-vector c = (c_1, ..., c_T)', and the motor implementation delays in the (T+1)-vector d = (d_0, d_1, ..., d_T)'. Define the elements of the T × (T+1) matrix A by a_{ij} = -1 if i = j, a_{ij} = 1 if i = j - 1, and a_{ij} = 0 for other i, j. Then we obtain the matrix representation of the WK model,
$$r = c + Ad.$$
The distributional assumptions for c and d imply that r has a T-variate normal distribution N(m1, S), where 1 is the unit T-vector and S = vI_T + fAA', with I_T denoting the identity matrix of order T. The log-likelihood function of the parameters conditional on the response intervals is given by
$$\ell(v, f; r) \sim -\tfrac{1}{2}\ln|vI_T + fAA'| - \tfrac{1}{2}(r - m)'(vI_T + fAA')^{-1}(r - m).$$
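As a quick numerical illustration (assuming NumPy/SciPy; these are not the computations used in the paper), this log-likelihood can be maximized directly by forming S explicitly and handing the resulting function to a general-purpose optimizer; the approaches discussed next exploit more of the structure of S.

```python
# Illustrative direct maximization of the log-likelihood (assumes NumPy/SciPy);
# S = v*I_T + f*AA' is built explicitly instead of using the tridiagonal
# expressions of Miller (1987), and m is estimated by the sample mean.
import numpy as np
from scipy.optimize import minimize

def make_A(T):
    """T x (T+1) matrix with a_ij = -1 if i = j, +1 if i = j - 1, 0 otherwise."""
    A = np.zeros((T, T + 1))
    idx = np.arange(T)
    A[idx, idx] = -1.0
    A[idx, idx + 1] = 1.0
    return A

def ml_estimates(r):
    r = np.asarray(r, dtype=float)
    T = len(r)
    A = make_A(T)
    AAt = A @ A.T
    e = r - r.mean()

    def negloglik(theta):
        v, f = np.exp(theta)                       # log scale keeps v, f positive
        S = v * np.eye(T) + f * AAt
        _, logdet = np.linalg.slogdet(S)
        return 0.5 * logdet + 0.5 * e @ np.linalg.solve(S, e)

    start = np.log([r.var() / 2, r.var() / 4])     # starting values of Section 4.1
    res = minimize(negloglik, start, method="Nelder-Mead")
    v_hat, f_hat = np.exp(res.x)
    return v_hat, f_hat
```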
Several algorithms can be used to maximize this log-likelihood as a function of v and f. One possibility is based on recognizing that S = vI_T + fAA' is a tridiagonal matrix. Expressions for determinants and inverses of such matrices are given by Miller (1987, p. 65). Using these expressions, direct numerical maximization is possible using any standard numerical algorithm. Another possibility is to regard the vector d as missing data, and use the EM algorithm (Dempster, Laird, & Rubin, 1977) to produce maximum likelihood estimators. Using straightforward but tedious algebra, the iterative estimators of the variance components by EM can be shown (see the Appendix for a sketch of the proof) to be defined by the iteration steps
$$\hat{v}^{(i+1)} = \frac{1}{T}\Big((r - f^{(i)}AA'z^{(i)} - m)'(r - f^{(i)}AA'z^{(i)} - m) + f^{(i)}\,\mathrm{tr}\big(A(I_{T+1} - f^{(i)}A'(S^{(i)})^{-1}A)A'\big)\Big), \qquad (5)$$

$$\hat{f}^{(i+1)} = \frac{1}{T+1}\,f^{(i)}\Big(\mathrm{tr}\big(I_{T+1} - f^{(i)}A'(S^{(i)})^{-1}A\big) + f^{(i)}z^{(i)\prime}AA'z^{(i)}\Big), \qquad (6)$$

where z^(i) = (S^(i))^{-1}(r - m), S^(i) = v^(i)I_T + f^(i)AA', and starting values v^(0) and f^(0) can be chosen to be one of the moment estimates if they are positive, and arbitrary positive numbers otherwise, e.g. a proportion of the observed variance of r.
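A minimal sketch of this EM iteration (assuming NumPy, estimating m by the sample mean as in the Appendix, using the starting values of Section 4.1, and running a fixed number of iterations instead of a formal convergence check) is given below; it illustrates the update rules (5) and (6) and is not the GAUSS implementation used for the simulations.

```python
# Sketch of the EM iteration (5)-(6); not the authors' implementation.
import numpy as np

def em_wk(r, n_iter=200):
    r = np.asarray(r, dtype=float)
    T = len(r)
    A = np.zeros((T, T + 1))
    A[np.arange(T), np.arange(T)] = -1.0
    A[np.arange(T), np.arange(T) + 1] = 1.0
    AAt = A @ A.T
    e = r - r.mean()                               # r - m, with m the sample mean
    v, f = r.var() / 2, r.var() / 4                # starting values (Section 4.1)
    for _ in range(n_iter):
        S = v * np.eye(T) + f * AAt
        Sinv = np.linalg.inv(S)
        z = Sinv @ e                               # z = S^{-1}(r - m)
        M = np.eye(T + 1) - f * A.T @ Sinv @ A     # I_{T+1} - f A'S^{-1}A
        resid = e - f * AAt @ z                    # r - f AA'z - m
        v_new = (resid @ resid + f * np.trace(A @ M @ A.T)) / T       # (5)
        f_new = f * (np.trace(M) + f * z @ AAt @ z) / (T + 1)         # (6)
        v, f = v_new, f_new
    return v, f
```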
3.3. The moving average approach
Another possibility for estimating the variance components of the WK model is by recognizing it as a first-order moving average or MA(1) process. Write the MA(1) process

$$y_t = u_t + b u_{t-1},$$
where y_t = r_t - m and it is assumed that the u_t are i.i.d. N(0, n) for all t. Well-known properties of the MA(1) process are Var(y_t) = n(1 + b²) and Cov(y_t, y_{t+1}) = bn = -f, whence, from Var(y_t) = v + 2f, we have v = n(1 + b)². Because of the restrictions v ≥ 0 and n ≥ 0, the model constrains b ≤ 0.
In the extensive literature on the MA(1) process, several procedures for producing maximum likelihood estimators of the parameters n and b are proposed; see, for instance, Box and Jenkins (1971, p. 187). However, maximum likelihood estimators of the variance components of first-order moving average processes are known to be unreliable: the maximizing values do not always correspond to a plausible estimate due to a ‘pile-up effect’ (Anderson & Mentz, 1993). Also, maximum likelihood estimates are reported to be unstable (Godolphin & De Gooijer, 1982). These problems may also apply to the maximum likelihood estimators proposed in Section 3.2. In the literature, several alternative estimators have been proposed for the MA(1) process. For instance, Galbraith and Zinde-Walsh (1994) consider a non-iterative estimator by autoregressive approximation. That is, the MA(1) process is approximated by an autoregressive process AR(p), and the parameters of the model are estimated by ordinary least squares. Upon
defining the (T-p) × p matrix X_p with elements given by x_{ij} = r_{i+j-1}, i = 1, ..., T-p, j = 1, ..., p, and the (T-p)-vector y_p = (r_{p+1}, ..., r_T)', the coefficients of the autoregression are given by the p-vector

$$\hat{b}_p = (X_p'X_p)^{-1}X_p'y_p, \qquad (7)$$

of which the pth element is used as the estimate of b (Galbraith & Zinde-Walsh, 1994, p. 145). Using σ̂² as defined in (4), n is estimated as n̂ = σ̂²/(1 + b̂²), and the estimators of the variance components of the WK model are

$$\hat{v} = \hat{n}(1 + \hat{b})^2 \qquad (8)$$

and

$$\hat{f} = -\hat{b}\hat{n}. \qquad (9)$$
We refer to these estimators of the variance components of the WK model as the ARMA estimators. Regarding the choice of p in (7), Galbraith and Zinde-Walsh (1994, p. 149) suggest that for small T (say T ≤ 25) values of p ranging from 1 to 5 will provide good estimators.
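The sketch below illustrates the ARMA estimators (7)-(9) under assumptions the paper leaves open: the series is centred before the autoregression, and b̂ is not truncated at zero, so negative variance estimates can still occur; NumPy is assumed.

```python
# Sketch of the ARMA estimators (7)-(9); centring and the treatment of b_hat
# are assumptions of this sketch, not prescriptions of Galbraith & Zinde-Walsh.
import numpy as np

def arma_estimates(r, p=3):
    r = np.asarray(r, dtype=float)
    T = len(r)
    y = r - r.mean()                                    # y_t = r_t - m
    # AR(p) approximation: regress y_t on (y_{t-p}, ..., y_{t-1}) for t = p+1..T.
    X = np.column_stack([y[j:T - p + j] for j in range(p)])   # (T-p) x p design
    yp = y[p:]
    b_p = np.linalg.lstsq(X, yp, rcond=None)[0]               # (7)
    b_hat = b_p[-1]                                           # p-th element estimates b
    s2 = np.sum(y ** 2) / (T - 1)                             # sigma^2 as in (4)
    n_hat = s2 / (1 + b_hat ** 2)
    v_hat = n_hat * (1 + b_hat) ** 2                          # (8)
    f_hat = -b_hat * n_hat                                    # (9)
    return v_hat, f_hat
```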
4. A comparison of estimation methods of the Wing–Kristofferson model
4.1. A simulation study
In this section, we compare the efficiency of the moment estimators as given by Wing and Kristofferson, the moment estimators given by Kooistra et al. ((2) and (3) above), the maximum likelihood estimators as defined by EM ((5) and (6)), and the ordinary least squares based estimators obtained in the ARMA approximation ((8) and (9)), in terms of their bias and mean squared error as obtained in a simulation study. The mean squared error of some estimator z of a parameter Z is defined by MSE(z) = E((z - Z)²). The MSEs of the estimators are a function of T, v and f. We set m = 1000. If we take v + 2f = c for some positive constant c, the behaviour of the estimators can be illustrated adequately by manipulating only the number of taps T and the ratio of the variance components q = f/v (because, up to a multiplicative term, the MSEs of the
estimated variance components are unaltered if the variance components are multiplied by a constant). We took c = 100, T ∈ {20, 200}, and v ∈ {0, 10, 20, ..., 100}, corresponding to q ∈ {∞, 4.5, 2.0, 1.2, 0.75, 0.50, 0.33, 0.21, 0.13, 0.06, 0}. In line with the suggestion of Galbraith and Zinde-Walsh (1994, p. 149), we set p = 3 if T = 20 and p = 24 if T = 200. (A pilot study where T = 20 suggested that in fact p = 3 produces more efficient estimates than p = 1, 2 or 4.)
The routines (available on request) for the simulations were written in GAUSS 3.2. Samples of c and d were drawn from normal distributions N(m, v) and N(0, f) respectively, and these vectors were linearly combined to form the vector r = c + Ad. Then the four different estimates of the variance components were computed and negative estimates were set to zero. This procedure was repeated for N = 10 000 runs per parameter setting when T = 20, and N = 100 runs for T = 200; the MSEs were estimated as the average over the N runs of the observed squared errors. It was found that the likelihood function can have a local maximum which is not the global maximum. This implies that the choice of starting values is important. We tried various reasonable starting values and present here only the results for the starting values yielding the uniformly smallest root mean squared errors (RMSEs). These starting values are defined by var(r)/2 for v and var(r)/4 for f.
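The following sketch (not the GAUSS routines used for the study) reproduces one parameter setting of this design for an arbitrary collection of estimator functions, each returning a pair (v̂, f̂); negative estimates are set to zero as described above.

```python
# Sketch of one parameter setting of the simulation design; the estimator
# functions are passed in (e.g. the sketches given in Section 3).
import numpy as np

def simulate_setting(estimators, v, f, T=20, m=1000.0, n_runs=1000, seed=0):
    rng = np.random.default_rng(seed)
    results = {name: [] for name in estimators}
    for _ in range(n_runs):
        c = rng.normal(m, np.sqrt(v), T)
        d = rng.normal(0.0, np.sqrt(f), T + 1)
        r = c + d[1:] - d[:-1]
        for name, estimate in estimators.items():
            v_hat, f_hat = estimate(r)
            results[name].append((max(v_hat, 0.0), max(f_hat, 0.0)))
    summary = {}
    for name, pairs in results.items():
        est = np.array(pairs)
        summary[name] = {
            "RMSE(v)": np.sqrt(np.mean((est[:, 0] - v) ** 2)),
            "bias(v)": est[:, 0].mean() - v,
            "RMSE(f)": np.sqrt(np.mean((est[:, 1] - f) ** 2)),
            "bias(f)": est[:, 1].mean() - f,
        }
    return summary
```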
The results of the simulations for T = 20 are shown in Tables 1–4, where the RMSE and bias of each of the four estimators are displayed as a function of v and f, respectively. The results of the simulations for T = 200 (not printed here) suggest that all estimators are consistent. As for T = 20, and considering the timing variance v, Wing and Kristofferson's moment estimator (abbreviated WK Mom. in the tables) and the ARMA estimator perform better than the moment estimator of Kooistra et al. (K Mom. in the tables) in terms of RMSE values. Thus, the simulations do not support the suggestion of Kooistra et al. (1997) that their proposed estimator is better than the WK estimator. Comparatively speaking, ARMA tends to perform better with high v, and the WK moment estimator with low v. Nevertheless, the maximum likelihood estimator outperforms the three other approaches. All estimators except the moment estimator of Kooistra et al. have considerable bias outside the middle of the range of v-values, and for high v, the bias of the maximum likelihood estimator contributes substantially to its RMSE. For the motor variance f, on the other hand, the maximum likelihood estimator has a larger RMSE than the two moment estimators for small values of f and a smaller RMSE for high values of f. The RMSE of the ARMA estimator is uniformly lower than that of the other approaches. Accordingly, we propose a hybrid procedure for estimating the variance components of the WK model: estimate v by the maximum likelihood estimator and f by the ARMA estimator.
4.2. Example
For illustrative purposes, we include an applied example of the WK model, reanalysing the data gathered by Kooistra et al. (1997) on timing variability in children with early-treated congenital hypothyroidism (see Introduction). The objective is to show whether or not the different estimation methods lead to similar conclusions. In the present study, a group of children suffering from thyroid agenesis (n = 21) was compared to a group of children suffering from thyroid dysgenesis (n = 25) and a group of controls (n = 34). The main conclusions drawn in the study were that the timing variance v does not differ across groups, and that the estimate of the motor delay variance f of children with early-treated congenital hypothyroidism is significantly higher than that of controls.

Table 1. RMSE(v̂) as a function of the estimators (T = 20, N = 10 000)

v      WK Mom.   K Mom.    ML        ARMA
0      21.64     25.54     7.103     22.00
10     22.21     26.62     11.49     25.40
20     25.07     29.82     17.77     29.84
30     28.03     32.85     23.04     32.94
40     31.91     36.76     28.08     35.67
50     35.22     40.05     32.31     38.13
60     39.24     44.16     36.43     40.48
70     42.59     47.66     39.61     41.90
80     45.86     50.95     43.20     43.34
90     49.44     54.71     46.71     44.66
100    54.02     59.66     50.20     45.19

Table 2. Bias(v̂) as a function of the estimators (T = 20, N = 10 000)

v      WK Mom.   K Mom.    ML        ARMA
0      12.35     14.88     3.812     13.03
10     7.301     10.72     -0.5486   10.51
20     3.443     7.902     -3.947    8.833
30     -0.0661   5.500     -6.635    7.127
40     -3.151    3.464     -9.633    4.836
50     -5.571    2.214     -13.15    2.956
60     -7.932    0.9958    -16.32    0.1433
70     -9.482    0.6916    -19.65    -2.603
80     -11.72    -0.3894   -23.65    -4.896
90     -13.33    -0.7963   -28.17    -7.640
100    -14.86    -1.141    -32.90    -9.722

Table 3. RMSE(f̂) as a function of the estimators (T = 20, N = 10 000)

f      WK Mom.   K Mom.    ML        ARMA
0      18.41     17.38     22.80     18.40
5      18.98     18.27     22.49     17.69
10     19.91     19.55     22.56     17.16
15     21.27     21.31     22.70     16.74
20     22.84     23.27     22.47     16.60
25     23.41     24.16     22.07     16.07
30     25.58     26.62     21.63     15.50
35     26.01     27.20     20.84     14.68
40     28.07     29.49     19.65     13.44
45     27.97     29.44     17.95     11.58
50     30.11     31.69     17.27     9.700

Table 4. Bias(f̂) as a function of the estimators (T = 20, N = 10 000)

f      WK Mom.   K Mom.    ML        ARMA
0      10.96     9.511     14.78     11.94
5      8.903     7.313     12.76     9.550
10     7.280     5.627     10.88     7.271
15     5.878     4.255     9.284     5.311
20     4.768     3.274     7.662     3.426
25     3.234     1.878     5.935     1.428
30     2.792     1.735     4.312     -0.0086
35     1.653     0.8672    2.608     -1.347
40     1.509     1.133     1.151     -2.106
45     0.0690    0.0913    -1.887    -6.413
50     0.1030    0.5905    -5.750    -4.213
Differences were tested by means of a t test comparing the means of the estimated variance components (as obtained by the exact method of moments) of the three groups in a pairwise fashion.
Means as well as standard errors of the means of the variance components as estimated by the four methods discussed in the previous sections are displayed in Table 5. If a negative variance was estimated (this happens in the moment approaches and the ARMA approach), the negative estimates were set to zero. Concerning the timing variance v, the results of Kooistra et al. (1997, p. 69) are corroborated: no significant differences between the congenital hypothyroidism groups combined (n = 46) and the controls were found with any of the estimation methods. Concerning the motor delay variance f, significant differences between groups were found for the exact moment estimates and for those obtained from the ARMA approach, but no significant group differences were detected for the WK moment estimates or for the maximum likelihood estimates of f. However, if we take into consideration that, in general, the motor variance f is estimated better by the moving average approach than by any of the other methods, the conclusion that children with congenital hypothyroidism have higher motor variability is justified.
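For illustration, a pairwise comparison of this kind can be carried out as follows; the arrays are hypothetical stand-ins for per-child estimates of f, not the study data, and SciPy is assumed.

```python
# Illustrative pairwise comparison (hypothetical stand-in data, not the study
# data): a t test on per-child estimates of the motor variance f in two groups.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
f_agenesis = rng.gamma(shape=2.0, scale=250.0, size=21)   # hypothetical group 1
f_controls = rng.gamma(shape=2.0, scale=100.0, size=34)   # hypothetical controls

t_stat, p_value = ttest_ind(f_agenesis, f_controls)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```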
Table 5. Means (and estimated standard errors of the mean) of estimated variance components by group and by method
Group    Estimate of v                                  Estimate of f
         WK Mom.    K Mom.     ML        ARMA          WK Mom.    K Mom.     ML        ARMA
1        817.1      971.0      765.2     839.0         499.5      485.6      422.1     495.3
         (223.9)    (257.1)    (213.6)   (190.4)       (144.7)    (152.7)    (85.3)    (108.1)
2        1039.2     1212.8     936.4     1238.5        436.4      438.8      366.5     371.5
         (265.7)    (309.0)    (205.8)   (280.3)       (151.7)    (153.1)    (126.7)   (125.4)
3        997.7      1188.0     769.5     1108.6        208.8      182.2      262.8     218.2
         (212.7)    (250.3)    (164.6)   (265.8)       (29.8)     (28.0)     (38.6)    (37.4)
5. Closing comments
In some applications of the WK model, researchers may be interested in the standard errors of the parameters at the individual rather than aggregate level (as in the example in Section 4.2). The EM approach can be modified to yield standard errors (e.g., the SEM algorithm; see Meng & Rubin, 1991). For the other estimation methods, the derivation of analytical expressions for the standard errors is cumbersome, and for the moment estimators such standard errors would rely on fourth moments of the data and would therefore be very unstable. In these cases, we recommend that standard errors of the parameters involved are produced by means of the parametric bootstrap (Efron & Tibshirani, 1993), which is a simple and reliable procedure.
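A sketch of such a parametric bootstrap for a single respondent is given below; it assumes NumPy and accepts any estimator function mapping a series r to a pair (v̂, f̂), for example one of the sketches in Section 3.

```python
# Sketch of a parametric bootstrap for individual-level standard errors;
# `estimate` is any function r -> (v_hat, f_hat).
import numpy as np

def parametric_bootstrap_se(r, estimate, n_boot=999, seed=0):
    rng = np.random.default_rng(seed)
    r = np.asarray(r, dtype=float)
    T = len(r)
    m_hat = r.mean()
    v_hat, f_hat = estimate(r)
    v_hat, f_hat = max(v_hat, 0.0), max(f_hat, 0.0)   # negative estimates set to zero
    boot = np.empty((n_boot, 2))
    for b in range(n_boot):
        # Resample a series from the fitted WK model and re-estimate.
        c = rng.normal(m_hat, np.sqrt(v_hat), T)
        d = rng.normal(0.0, np.sqrt(f_hat), T + 1)
        r_star = c + d[1:] - d[:-1]
        boot[b] = estimate(r_star)
    return boot.std(axis=0, ddof=1)                   # bootstrap SEs of (v_hat, f_hat)
```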
Finally, we have assumed that the response intervals are free from a so-called drift in the timing intervals. The WK model adjusting for a linear drift is given by
$$r_t = c_t + d_t - d_{t-1} + qt$$
and adjusts the observed response intervals for a systematic acceleration or delay (conceptually, q corresponds to processes of the cerebrum rather than the cerebellum). A study of the estimation of the drift parameter is beyond the scope of this paper. Note, however, that moment estimators of the drift parameter q were given by Kooistra et al. (1997), and that they found no evidence of a drift in their data.
Acknowledgement
The authors are grateful to Libbe Kooistra for permission to use his data.
References
Anderson, T. W., & Mentz, R. P. (1993). A note on maximum likelihood estimation in the first-order Gaussian moving average model. Statistics and Probability Letters, 16, 205–211.
Box, G. E. P., & Jenkins, G. M. (1971). Time series analysis: forecasting and control. London: Holden-Day.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society B, 39, 1–38.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Galbraith, J. W., & Zinde-Walsh, V. (1994). A simple non-iterative estimator for moving average models. Biometrika, 81, 143–155.
Godolphin, E. J., & De Gooijer, J. G. (1982). On the maximum likelihood estimation of parameters of a Gaussian moving average process. Biometrika, 69, 443–451.
Ivry, R. B., & Keele, S. W. (1989). Time functions of the cerebellum. Journal of Cognitive Neurosciences, 6, 167–180.
Kooistra, L., Snijders, T. A. B., Schellekens, J. M. H., Kalverboer, A. F., & Geuze, R. H. (1997). Timing variability in children with early-treated congenital hypothyroidism. Acta Paediatrica, 96, 61–73.
Lundy-Ekman, L., Ivry, R. B., Keele, S. W., & Woolacott, M. H. (1991). Timing and force control deŽcits in clumsy children. Journal of Cognitive Neurosciences, 3, 367–376.
Meng, X. L., & Rubin, D. B. (1991). Using EM to obtain asymptotic variance–covariance matrices: the SEM algorithm. Journal of the American Statistical Association, 86, 899–909.

Miller, K. S. (1987). Some eclectic matrix theory. Malabar, FL: Krieger.
Wing, A. M., & Kristofferson, A. B. (1973). Response delays and the timing of discrete motor responses. Perception & Psychophysics, 14, 5–12.
Received 30 April 1999; final version received 4 April 2001
Appendix
We give a brief derivation of the iterative maximum likelihood estimators as found by EM. First the expectation step (E-step) and then the maximization step (M-step) are presented. To apply the EM algorithm, we define the complete data vector y = (r', d')', and the missing data by d. Note that y is normally distributed with mean vector m_Y = (m', 0')' and covariance matrix
$$S_Y = \begin{pmatrix} S & fA \\ fA' & fI_{T+1} \end{pmatrix}.$$
The vector y is a linear combination of the vectors c and d. This implies that, up to a multiplicative constant, the likelihood based on y is identical to the joint likelihood of the vectors c and d. The distributional assumptions for the latter two vectors generate the complete data log-likelihood
$$\ell(v, f, m; y) \sim -\tfrac{1}{2}T\ln v - \tfrac{1}{2}v^{-1}(c - m)'(c - m) - \tfrac{1}{2}(T+1)\ln f - \tfrac{1}{2}f^{-1}d'd.$$
The E-step of the EM algorithm consists of finding an expression for

$$E\big(\ell(v, f; y) \mid r, v^{(i)}, f^{(i)}, m^{(i)}\big),$$
i.e. the expected log-likelihood of the complete data given the observed data and given the current values v^(i), f^(i), m^(i) of the parameters. The maximum likelihood estimate of m turns out to be very close to the sample mean of r, as Σ_t r_t = Σ_t c_t + d_T - d_0, and E(d_T - d_0) = 0. Therefore the estimation can be simplified slightly by using
$$\hat{m} = \frac{1}{T}\sum_{t=1}^{T} r_t$$
right from the start. From multivariate normal distribution theory, the conditional expectation of d given r is E(d | r) = fA'z, where z = S^{-1}(r - m), and its conditional covariance matrix is Var(d | r) = f(I_{T+1} - fA'S^{-1}A). Finally,
$$E(d'd;\, r, v^{(i)}, f^{(i)}) = f^{(i)}\Big(\mathrm{tr}\big(I_{T+1} - f^{(i)}A'(S^{(i)})^{-1}A\big) + f^{(i)} z^{(i)\prime}AA'z^{(i)}\Big),$$
where S^(i) = v^(i)I_T + f^(i)AA' and z^(i) = (S^(i))^{-1}(r - m). Upon substituting c = r - Ad, we find

$$E\big((c - m)'(c - m);\, r, v^{(i)}, f^{(i)}\big) = f^{(i)}\,\mathrm{tr}\big(A(I_{T+1} - f^{(i)}A'(S^{(i)})^{-1}A)A'\big) + (r - f^{(i)}AA'z^{(i)} - m)'(r - f^{(i)}AA'z^{(i)} - m).$$

It follows that the expected value of the conditional log-likelihood is given by
$$E\big(\ell(v, f \mid y);\, r, v^{(i)}, f^{(i)}\big) = -\tfrac{1}{2}T\ln v - \tfrac{1}{2}(T+1)\ln f - \tfrac{1}{2}v^{-1}\Big(f^{(i)}\,\mathrm{tr}\big(A(I_{T+1} - f^{(i)}A'(S^{(i)})^{-1}A)A'\big) + (r - f^{(i)}AA'z^{(i)} - m)'(r - f^{(i)}AA'z^{(i)} - m)\Big) - \tfrac{1}{2}f^{-1}\Big(f^{(i)}\big(\mathrm{tr}(I_{T+1} - f^{(i)}A'(S^{(i)})^{-1}A) + f^{(i)}z^{(i)\prime}AA'z^{(i)}\big)\Big).$$
Because this log-likelihood is a sum of two mutually independent parts, one depending on v only and one depending on f only, the M-step can be divided into two separate parts: one part maximizing with respect to f and another maximizing with respect to v. This yields the two iteration steps reported in Section 3.2.