5.9.1 Control Variates
Variance reduction techniques reduce the standard error of a Monte Carlo estimator by making the estimator more deterministic. The method of control variates accomplishes this with a simple but informative approximation. Consider the crude Monte Carlo estimator
$$\frac{1}{m}\sum_{k=1}^{m} f\!\left(U^{[k]}\right) \qquad [5.55]$$

for some quantity ψ = E[f(U)], U ~ U_n((0,1)^n). Let ξ be a function from ℝ^n to ℝ for which the mean

$$\mu_\xi = E[\xi(U)] \qquad [5.56]$$
is known. We shall refer to the random variable ξ(U) as a control variate. Consider the random variable f *(U) based upon this control variate,
$$f^*(U) = f(U) + c\,\big[\mu_\xi - \xi(U)\big] \qquad [5.57]$$
for some constant c. This is an unbiased estimator for ψ because
$$E[f^*(U)] = E\big[f(U) + c\,[\mu_\xi - \xi(U)]\big] \qquad [5.58]$$

$$= E[f(U)] + c\big(\mu_\xi - E[\xi(U)]\big) \qquad [5.59]$$

$$= \psi + c(\mu_\xi - \mu_\xi) \qquad [5.60]$$

$$= \psi \qquad [5.61]$$
Accordingly, we can estimate ψ with the Monte Carlo estimator
$$\frac{1}{m}\sum_{k=1}^{m} f^*\!\left(U^{[k]}\right) \qquad [5.62]$$
This will have a lower standard error than [5.55] if the standard deviation σ* of f*(U) is smaller than the standard deviation σ of f(U). That will happen if ξ(U) has a high correlation ρ with the random variable f(U), in which case the random variables cξ(U) and f(U) will tend to offset each other in [5.62]. We formalize this observation by calculating
$$\sigma^{*\,2} = \operatorname{Var}\!\big(f(U) + c\,[\mu_\xi - \xi(U)]\big) \qquad [5.63]$$

$$= \operatorname{Var}\!\big(f(U)\big) + c^2\operatorname{Var}\!\big(\xi(U)\big) - 2c\operatorname{Cov}\!\big(f(U),\,\xi(U)\big) \qquad [5.64]$$

$$= \sigma^2 + c^2\sigma_\xi^2 - 2c\rho\,\sigma\,\sigma_\xi \qquad [5.65]$$

where σ_ξ is the standard deviation of ξ(U). Accordingly, σ* will be smaller than σ if

$$c^2\sigma_\xi^2 - 2c\rho\,\sigma\,\sigma_\xi < 0 \qquad [5.66]$$
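The decomposition [5.65] is an algebraic identity that holds for sample moments as well as for population moments. A quick numerical check in Python, using a hypothetical f(u) = u² with control variate ξ(u) = u and an arbitrary fixed c (these choices are ours, purely for illustration):

```python
import math
import random

# Sanity check of [5.65]: Var(f*) = sigma^2 + c^2*sigma_xi^2 - 2*c*rho*sigma*sigma_xi.
# f(u) = u^2 and xi(u) = u are hypothetical stand-ins, not the text's example.
rng = random.Random(1)
u = [rng.random() for _ in range(1000)]
f_vals = [x * x for x in u]   # f(U)
xi_vals = list(u)             # xi(U)
c = 0.7                       # an arbitrary fixed coefficient
n = len(u)

mean_f = sum(f_vals) / n
mean_xi = sum(xi_vals) / n
var = lambda v, mv: sum((x - mv) ** 2 for x in v) / (n - 1)
sigma = math.sqrt(var(f_vals, mean_f))
sigma_xi = math.sqrt(var(xi_vals, mean_xi))
cov = sum((a - mean_f) * (b - mean_xi) for a, b in zip(f_vals, xi_vals)) / (n - 1)
rho = cov / (sigma * sigma_xi)

# Left side: sample variance of f*(U) = f(U) + c*(mu_xi - xi(U)),
# using the sample mean of xi(U) as a stand-in for mu_xi.
fstar = [a + c * (mean_xi - b) for a, b in zip(f_vals, xi_vals)]
lhs = var(fstar, sum(fstar) / n)
rhs = sigma ** 2 + c ** 2 * sigma_xi ** 2 - 2 * c * rho * sigma * sigma_xi
print(abs(lhs - rhs) < 1e-12)
```

Because the identity is exact, the two sides agree to floating-point precision for any sample and any c.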
It can be shown that σ* is minimized by setting

$$c = \frac{\rho\,\sigma}{\sigma_\xi} \qquad [5.67]$$

in which case, from [5.65],

$$\sigma^{*\,2} = \sigma^2 + \rho^2\sigma^2 - 2\rho^2\sigma^2 = \sigma^2\big(1 - \rho^2\big) \qquad [5.68]$$
Often, ρ and σ_ξ are unknown, which makes determining the optimal value [5.67] for c problematic. We can estimate ρ and σ_ξ with a separate Monte Carlo analysis. Alternatively, if ξ closely approximates f, c might simply be set equal to 1.
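Putting the pieces together, the procedure might be sketched as follows in Python. The function name is our own, and the integrand f(u) = e^u with its second-order Taylor polynomial ξ(u) = 1 + u + u²/2 (so μ_ξ = 1 + 1/2 + 1/6 = 5/3) is a stand-in chosen purely for illustration; here the optimal coefficient [5.67] is estimated from the sample itself rather than set to 1:

```python
import math
import random

def control_variate_estimate(f, xi, mu_xi, m, seed=0):
    """Estimate psi = E[f(U)], U uniform on (0,1), using the control
    variate xi(U) with known mean mu_xi.  The coefficient c is the
    sample analogue of the optimal value [5.67],
    c = rho*sigma/sigma_xi = Cov(f(U), xi(U)) / Var(xi(U))."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(m)]
    fu = [f(x) for x in u]
    xu = [xi(x) for x in u]
    mf, mx = sum(fu) / m, sum(xu) / m
    cov = sum((a - mf) * (b - mx) for a, b in zip(fu, xu)) / (m - 1)
    var_xi = sum((b - mx) ** 2 for b in xu) / (m - 1)
    c = cov / var_xi
    # Control-variate observations f*(U) = f(U) + c*(mu_xi - xi(U))  [5.57]
    fstar = [a + c * (mu_xi - b) for a, b in zip(fu, xu)]
    mean = sum(fstar) / m
    sd = math.sqrt(sum((x - mean) ** 2 for x in fstar) / (m - 1))
    return mean, sd / math.sqrt(m), c   # estimate, standard error, c used

# Illustration: f(u) = e^u, so psi = e - 1, with mu_xi = 5/3.
est, se, c = control_variate_estimate(
    math.exp, lambda u: 1 + u + u * u / 2, 5 / 3, m=10_000)
```

Estimating c from the same sample introduces a small bias, which is usually negligible next to the variance reduction; a cleaner variant estimates c in a separate pilot run, as the text suggests.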
5.9.2 Example: control variates
Suppose
[5.69]
and consider the definite integral
$$\psi = \int_0^1 f(u)\,du \qquad [5.70]$$
This has crude Monte Carlo estimator
$$\frac{1}{m}\sum_{k=1}^{m} f\!\left(U^{[k]}\right) \qquad [5.71]$$
We may reduce its standard error with the control variate
[5.72]
Here, ξ is a second-order Taylor approximation for f. We easily obtain the mean μ_ξ of ξ(U):
[5.73]
[5.74]
[5.75]
[5.76]
We set c = 1 and obtain
[5.77]
[5.78]
Our control-variate Monte Carlo estimator is
$$\frac{1}{m}\sum_{k=1}^{m} f^*\!\left(U^{[k]}\right) \qquad [5.79]$$
To compare the performance of this estimator with that of the crude estimator [5.71], we set m = 100 and select a realization for {U^{[1]}, U^{[2]}, …, U^{[100]}}, which is presented with a histogram in Exhibit 5.13.

We evaluate the functions f and f * at each point. Results are presented as histograms in Exhibit 5.14.

Sample means are 1.427 and 1.430, respectively, so estimators [5.71] and [5.79] have yielded similar estimates for ψ. However, the sample standard deviations are 0.338 and 0.122, respectively, so the values f(U^{[k]}) have a substantially higher sample standard deviation than the values f*(U^{[k]}). We can use these sample standard deviations to estimate the standard errors of our two estimators. The sample standard deviation 0.338 is an estimate for the standard deviation σ of f(U); the sample standard deviation 0.122 is an estimate for the standard deviation σ* of f*(U). Substituting these into [5.35], we estimate the standard errors of our Monte Carlo estimators as 0.0338 and 0.0122, respectively. The control variate has reduced the standard error by a factor of almost 3.
To place this variance reduction in perspective, consider how much we would need to increase the sample size m for crude Monte Carlo estimator [5.71] to achieve the same reduction in standard error. We set [5.35] equal to 0.0122 and substitute our sample standard deviation 0.338 for σ. We obtain
$$\frac{0.338}{\sqrt{m}} = 0.0122 \quad\Longrightarrow\quad m = \left(\frac{0.338}{0.0122}\right)^{2} \approx 768 \qquad [5.80]$$
In this particular example, our control variate estimator [5.79] accomplishes with a sample of size 100 what the crude estimator [5.71] would accomplish with a sample of size 768.
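The equivalent-sample-size arithmetic can be checked directly from the reported sample standard deviations, using nothing beyond the standard error formula of [5.35]:

```python
import math

sigma = 0.338       # sample standard deviation of the f(U[k])
sigma_star = 0.122  # sample standard deviation of the f*(U[k])
m = 100

se_crude = sigma / math.sqrt(m)    # standard error of the crude estimator [5.71]
se_cv = sigma_star / math.sqrt(m)  # standard error of the control variate estimator [5.79]

# Sample size at which the crude estimator would match se_cv:
m_equiv = (sigma / se_cv) ** 2
print(round(se_crude, 4), round(se_cv, 4), round(m_equiv))  # 0.0338 0.0122 768
```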
In any application, the variance reduction achieved with a particular control variate ξ(U) depends critically upon the correlation ρ between ξ(U) and f(U). Suppose we employ the optimal value for c given by [5.67], and consider the ratio of the standard error of the crude estimator [5.55] to that of the control variate estimator [5.62]. By [5.68], this is
$$\frac{\sigma/\sqrt{m}}{\sigma^*/\sqrt{m}} = \frac{\sigma}{\sigma^*} = \frac{1}{\sqrt{1-\rho^2}} \qquad [5.81]$$
If ξ(U) and f(U) have a correlation of ρ = .9, the standard error is reduced by a factor of 1/2.3. If the correlation is .99, the reduction improves to a factor of 1/7.1. In the example we just presented, the correlation was actually .992, but we reduced the standard error by only a factor of 1/2.8 because we set c = 1 instead of using the optimal value.
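The quoted factors follow directly from [5.68]; a two-line check:

```python
import math

def se_reduction(rho):
    """sigma*/sigma when the optimal coefficient [5.67] is used; see [5.68]."""
    return math.sqrt(1 - rho ** 2)

print(round(1 / se_reduction(0.90), 1))  # 2.3
print(round(1 / se_reduction(0.99), 1))  # 7.1
```

The payoff grows sharply as ρ approaches 1, which is why control variates that closely approximate f are so valuable.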