Optimal sample allocation in experiments

11 October 2014

One of the recurrent problems I face at work is about *sample allocation*. The
situation is as follows. Given a fixed sample size that is determined by an
external “budget,” I have to perform an experiment to test a standard treatment
against two alternatives to check which one is superior. For instance, we have a
feature in place that I need to test against two possible replacements to know
which one we should use moving forward. Because of that, I am not interested in
the value of each treatment, as much as on the difference between the effect of
two treatments. Although there are occassions in which we have more complex
structures, in most situations I can get away with complete randomization.

The intuitive solution in this case would be to divide $N$, the sample size I am allowed to use, into roughly equal parts. However, in most cases it makes sense for me to attempt to use the sample allocation to optimize over a reasonable quantity, such as the variance of the estimates in which I am interested.

Following Morris (2010), say I am estimating a model such as

where $X$ is a $n\times t$ matrix, in which $X_{ij}$ is an indicator of whether case $i$ has received treatment $j$. The rest of the notation is exactly what you expect. As I said, I am usually less interested in $\tau$ than I am in $C\tau$, where $C$ is a linear transformation. For instance, with two treatments, if I am interested in testing $\tau_1 - \tau_2$ and $\tau_1-\tau_3$, the matrix $C$ would be:

Given that, the variance of the quantities of interest for me is given by

which is the estimate of $\var(\hat \tau)$ that we know from least squares but transformed by $C$.

One reasonable quantity to minimize is thus $ \trace(C (X’ X)^{-1} C’) $, the
*average variance* (after removing constants), where it is convenient to note
that $X’X$ is a square diagonal matrix $t\times t$ with the size
allocated to each treatment (the vector $n$) in the main diagonal. Minimizing
over $n$, subject to the constraint $N=n’ \mathbf{1}$, we can write the
Lagrangian as

to get that the (continuous) number of cases for treatment $i$ is

where $s_i = \{C’C\}_{ii}$, the $i$-th element in the main diagonal of $C’C$.

Now let’s put some `R`

code. Consider, for example, a case with three
alternative treatments aside from the standard treament, in which we want to
test $\tau_1 - \tau_2$, $\tau_1 - \tau_3$, and $\tau_1 - \tau_4$. Let’s first
build the matrix $C$

With a $N=100,$ the optimal sample allocation will be given then by:

which we can round to $n = (37, 21, 21, 21)$. We can now test the average variance produced by this design relative to $n = (25, 25, 25, 25):$

In the simulation, a thousand replications of the experiment are performed and, for each replication, we extract the covariance matrix associated with $C\tau$. We can now calculate the mean in each of the two to check that

where we see that the diagonal elements (all elements, as a matter of fact) of
the design using `N2`

are smaller than with `N1`

.