resample

.htm

Скачиваний:

Добавлен:

23.03.2015

Размер:

10.32 Кб

Скачать

☆

Resampling Resampling methods Bootstrapping the sample median (bootmed1.prg, bootmed2.prg) Two-sample permutation test (cardkrug1.prg, cardkrug2.prg) Bootstrapping the sample median

The following program generates an artificial sample of size 50 from the standard normal and bootstraps the distribution of the sample mean and median. Note that the theoretical standard error of the sample median for the standard normal of size n is Ö (0.5p / n) which is approximately 0.177 for n = 50. By comparison, the standard error of the sample mean is Ö(1 / n) » 0.141. ' bootstrap sample mean and median ( 11/2/99 h) ' series proc version (slower than matrix version) ' last revised 3/28/2004 ' set size of sample !n = 50 ' create workfile wfcreate bootmed1 u 1 !n ' set seed of random number generator rndseed 456 ' generate pseudo-draws from a standard normal series x = nrnd ' set number of bootstrap replications !reps = 10000 ' store means and medians in matrix matrix(!reps,2) out ' bootstrap loop tic for !i = 1 to !reps ' generate bootstrap sample x.resample x_b ' store bootstrap median out(!i,1) = @median(x_b) ' store bootstrap mean out(!i,2) = @mean(x_b) next scalar elp = @toc ' display descriptive statistics show out.stats ' theoretical standard error of median from normal ' se(med) = 1 / ( 2*m*(obs)^0.5 ) ' where m is the median value ' (Kendall-Stuart, 1977, 4th ed, vol 1, p.252) !pi = @acos(-1) scalar se_med = @sqrt(0.5*!pi/!n) scalar se_mea = @sqrt(1/!n) ' convert bootstrap sample into series expand 1 !reps smpl 1 !reps series out_medi series out_mean group gout out_medi out_mean mtos(out, gout) ' display results in table table tab setcolwidth(tab,1,20) tab(1,2) = "median" tab(1,3) = "mean" setline(tab,2) tab(3,1) = "bootstrap mean:" tab(3,2) = @mean(out_medi) tab(3,3) = @mean(out_mean) tab(4,1) = "bootstrap s.d.:" tab(4,2) = @stdev(out_medi) tab(4,3) = @stdev(out_mean) tab(5,1) = "theoretical s.d.:" tab(5,2) = se_med tab(5,3) = se_mea setline(tab,6) tab(7,1) = "sample size = " tab(7,2) = @str(!n) tab(8,1) = "bootstrap replications = " tab(8,2) = @str(!reps) tab(9,1) = "elapsed time = " tab(9,2) = @str(elp) + " seconds" show tab 'get kernel density estimates !n=100 'points to evaluate do out_mean.kdensity(!n,o=dist_mea,b=0.04) do out_medi.kdensity(!n,o=dist_med,b=0.2) smpl 1 !n 'convert to series series x_mean series f_mean group g1 x_mean f_mean mtos(dist_mea, g1) series x_medi series f_medi group g2 x_medi f_medi mtos(dist_med, g2) 'compare bootstrap distribution group g3 g1 g2 ' f_mean x_mean f_medi x_medi freeze(gra1) g3.xyline(b) 'gra1.setelem(1) legend(mean) 'gra1.setelem(2) legend(median) gra1.addtext(t) Kernel density estimates of bootstrap distribution gra1.legend -display gra1.addtext(1.3,0.2) Mean gra1.addtext(0.2,2.2) Median show gra1 The following is the matrix function version of the above program that produces identical results, but takes less time to process. ' bootstrap sample mean and median ' matrix function version (faster than series version) ' 11/2/99 ' checked 3/25/2004 ' create workfile wfcreate bootmedian2 u 1 1 ' set seed of random number generator rndseed 456 ' set size of sample !n = 50 ' generate pseudo-draws from a standard normal matrix(!n,1) x nrnd(x) ' set number of bootstrap replications !reps = 10000 ' store means and medians in vectors vector(!reps,1) out_medi vector(!reps,1) out_mean ' declare matrix to hold bootstrap sample matrix x_b ' bootstrap loop ' reset timer tic for !i = 1 to !reps ' generate bootstrap sample x_b = @resample(x) ' store bootstrap median out_medi(!i) = @median(x_b) ' store bootstrap mean out_mean(!i) = @mean(x_b) next ' store elapsed time in seconds scalar elp = @toc ' theoretical standard error of median from normal ' se(med) = 1 / ( 2*m*(obs)^0.5 ) ' where m is the median value ' (Kendall-Stuart, 1977, 4th ed, vol 1, p.252) !pi = @acos(-1) scalar se_med = @sqrt(0.5*!pi/!n) scalar se_mea = @sqrt(1/!n) ' display results in table table tab setcolwidth(tab,1,20) tab(1,2) = "median" tab(1,3) = "mean" setline(tab,2) tab(3,1) = "bootstrap mean:" tab(3,2) = @mean(out_medi) tab(3,3) = @mean(out_mean) tab(4,1) = "bootstrap s.d.:" tab(4,2) = @stdev(out_medi) tab(4,3) = @stdev(out_mean) tab(5,1) = "theoretical s.d.:" tab(5,2) = se_med tab(5,3) = se_mea setline(tab,6) tab(7,1) = "sample size = " tab(7,2) = @str(!n) tab(8,1) = "bootstrap replications = " tab(8,2) = @str(!reps) tab(9,1) = "elapsed time = " tab(9,2) = @str(elp) + " seconds" show tab ^Top

Two-sample permutation test The following program replicates the permutation test in Johnston-DiNardo (1997, 11.3). The hypothesis under test is whether there is a significant difference in the sample of employment changes between two states. The program constructs a nonparameteric 95% confidence interval of the difference in means using 1000 permutation samples. ' two-sample permutation test ( 11/3/99 ) ' (Johnston-DiNardo 11.2) ' matrix permutation function (fast) ' checked 3/25/2004 ' create workfile wfcreate cardkrug u 1 1 ' input data in matrix (first 33 rows from N.J.) matrix(40) data data.fill -20, -17.5, -13, -12.5, -4.5, -4, -3.5, -2, -1.5, -1, -0.5, -0.5, 0, 0, 0.5, 0.5, 1.5, 2, 2, 2.25, 3, 4.5, 4.5, 5.5, 6, 6.25, 8.25, 9, 10, 10.5, 12, 14.75, 34, -7, -6, -2.5, -0.5, 4, 4.5, 4.5 ' extract submatrices for each sample matrix sub0 = @subextract(data,1,1,33,1) matrix sub1 = @subextract(data,34,1) ' compute difference in means for actual sample scalar mdiff = @mean(sub0) - @mean(sub1) ' set number of replications !reps = 1000 ' set seed of random number generator rndseed 123456 ' declare storage matrices matrix pdata ' permuted data vector(!reps) pdiff ' difference in means from pdata ' permutation loop for !i=1 to !reps ' permute data pdata = @permute(data) ' extract submatrices from permuted data sub0 = @subextract(pdata,1,1,33,1) sub1 = @subextract(pdata,34,1) ' store difference in means pdiff(!i) = @mean(sub0) - @mean(sub1) next ' 95% interval (percentile method) scalar lower = @quantile(pdiff,0.025) scalar upper = @quantile(pdiff,0.975) ' display output in table table out setcolwidth(out,1,30) out(1,1) = "Sample difference in means:" out(1,2) = mdiff out(2,1) = "Permutation confidence interval:" out(2,2) = "[" + @str(lower) + "," + @str(upper) + "]" out(3,1) = "Number of permutations:" out(3,2) = @str(!reps) show out For reference, we also provide a program that uses the group resample procedure. This program also computes the t-statistic for testing the difference of means. ' two-sample t-test ( 11/3/99 ) ' (Johnston-DiNardo 11.2) ' group resample function (slow) ' checked 3/25/2004 ' create workfile wfcreate cardkrug u 1 40 ' input data in matrix ' first 33 from N.J. series x x.fill -20, -17.5, -13, -12.5, -4.5, -4, -3.5, -2, -1.5, -1, -0.5, -0.5, 0, 0, 0.5, 0.5, 1.5, 2, 2, 2.25, 3, 4.5, 4.5, 5.5, 6, 6.25, 8.25, 9, 10, 10.5, 12, 14.75, 34, -7, -6, -2.5, -0.5, 4, 4.5, 4.5 ' create dummy for each subsample series dum = 0 smpl 1 33 dum = 1 smpl @all '---------------------------------------------------------------------------- ' parametric t-test '---------------------------------------------------------------------------- freeze(ttest) x.testby(mean) dum show ttest '---------------------------------------------------------------------------- ' non-parametric permutation test '---------------------------------------------------------------------------- ' compute difference in means for actual sample series mdiff = @mean(x,"@all if dum=1") - @mean(x,"@all if dum=0") scalar dmean = @elem(mdiff,"1") ' set number of replications !reps = 1000 ' set seed of random number generator rndseed 123456 ' declare storage matrices matrix pdata ' permuted data vector(!reps) pdiff ' difference in means from pdata ' permutation loop for !i=1 to !reps ' permute data x.resample(permute) x_b ' compute difference in means mdiff = @mean(x_b,"@all if dum=1") - @mean(x_b,"@all if dum=0") ' store difference in means pdiff(!i) = @elem(mdiff,"1") next ' 95% interval (percentile method) scalar lower = @quantile(pdiff,0.025) scalar upper = @quantile(pdiff,0.975) ' display output in table table out setcolwidth(out,1,30) out(1,1) = "Sample difference in means:" out(1,2) = dmean out(2,1) = "Permutation confidence interval:" out(2,2) = "[" + @str(lower) + "," + @str(upper) + "]" out(3,1) = "Number of permutations:" out(3,2) = @str(!reps) show out ^Top

Соседние файлы в папке docs

#
23.03.2015934 б45mcarlo.htm
#
23.03.201510.66 Кб45mcintro.htm
#
23.03.20152.77 Кб45model.htm
#
23.03.20155.75 Кб46Pindyck_data.htm
#
23.03.20155.25 Кб47Pool.htm
#
23.03.201510.32 Кб45resample.htm
#
23.03.201511.08 Кб45roll.htm
#
23.03.20158.51 Кб45series.htm
#
23.03.20159.98 Кб45sspace.htm
#
23.03.201548.58 Кб45stats.htm
#
23.03.201519.68 Кб46svar.htm