## Suppose there are two boxes, each containing a mix of red and blue
## balls.  We do not know how many balls of each color are contained
## in each box.
##
## Our goal is to estimate the proportion of red balls in the box that
## has more red balls.
##
## First we do a pilot sample by randomly selecting 5 balls from each
## box (with replacement).
##
## Then we sample 10 balls (with replacement) from whichever box
## produced more red balls in the pilot sample.  The proportion of red
## balls in this sample of 10 is our estimate.
##
## Use simulation to calculate the bias, variance, and MSE of this
## estimator.
##
## For the simulation, assume the following:
##   Box 1 contains 20 red balls and 10 blue balls.
##   Box 2 contains 15 red balls and 15 blue balls.
##
## Box 1 has the larger proportion of red balls, so the target value
## of the estimator is 20/30 = 2/3.

## Number of simulation replications.
nrep = 1e4

## Storage for the estimate from each replication.
D = numeric(nrep)

for (k in 1:nrep) {

    ## The numbers of red balls in the two pilot samples.
    P1 = sum(runif(5) < 2/3)
    P2 = sum(runif(5) < 1/2)

    ## The proportion of red balls in the actual sample.  A tie in
    ## the pilot samples is resolved in favor of box 1.
    if (P1 >= P2) {
        D[k] = mean(runif(10) < 2/3)
    } else {
        D[k] = mean(runif(10) < 1/2)
    }
}

## Bias and MSE are measured against the target value 2/3.
bias = mean(D) - 2/3
Var = var(D)
MSE = mean( (D-2/3)^2 )
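
## Optional check: a minimal sketch of the exact bias, variance, and
## MSE for the same setup, for comparison with the simulated values.
## It assumes the same box proportions (2/3 and 1/2), the same sample
## sizes (5 and 10), and that pilot ties go to box 1.  The names
## p1, p2, n_pilot, n_main, joint, pick1, and exact_* are introduced
## here for illustration only.
p1 = 2/3
p2 = 1/2
n_pilot = 5
n_main = 10

## Joint distribution of the two pilot counts: joint[i, j] is the
## probability that box 1 gives i-1 red balls and box 2 gives j-1.
joint = outer(dbinom(0:n_pilot, n_pilot, p1),
              dbinom(0:n_pilot, n_pilot, p2))

## Probability that box 1 is chosen for the actual sample.
pick1 = sum(joint[outer(0:n_pilot, 0:n_pilot, ">=")])

## The estimator is a mixture of two binomial proportions.
exact_mean = pick1*p1 + (1-pick1)*p2
exact_bias = exact_mean - p1

## MSE about p1: the variance of each component plus its squared
## offset from p1 (zero for box 1), weighted by the selection
## probabilities.
exact_MSE = pick1*p1*(1-p1)/n_main +
    (1-pick1)*(p2*(1-p2)/n_main + (p2-p1)^2)
exact_Var = exact_MSE - exact_bias^2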