5.7 Breaking the Curse of Dimensionality

5.7  Breaking the Curse of Dimensionality

The Monte Carlo method is an application of statistical inference. As such, it entails the familiar standard error associated with all statistical inferences. So far, we have discussed the Monte Carlo method in a largely intuitive manner. If we are to assess its standard error or discuss techniques of reducing this error, we need to formalize the Monte Carlo method.

In Section 5.2, we considered two examples of the Monte Carlo method. Estimating π is a deterministic problem. Estimating the variance of a random variable Y is probabilistic. Although statistical in nature, the Monte Carlo method was just as applicable to the deterministic problem as it was to the probabilistic one. Let’s take another look at those two examples. Observe that both may be represented as definite integrals. Fox’s needle dropping was an elaborate means of estimating the integral

[5.27]

Our estimation of the variance σ2 of the random variable Y was actually an estimate of the definite integral

[5.28]

5.7.1 Crude Monte Carlo Estimator

All Monte Carlo analyses entail selecting a realization {x[1], x[2], … , x[m]} of some sample {X[1], X[2], … , X[m]} and applying some function to estimate some quantity ψ. Accordingly, a Monte Carlo analysis represents a statistical estimator:

[5.29]

which we call a Monte Carlo estimator. Hammersley and Handscomb (1964)18 provide an interesting proof that, at least in a trivial sense, all Monte Carlo computations can be viewed as approximations to definite integrals. This result is too general for practical work, but it motivates us to explore the Monte Carlo method from the standpoint of numerical integration. Suppose we need to evaluate a multidimensional definite integral of the form

[5.30]

where f is bounded on the region of integration. In practical applications, integrals can usually be converted to this form through a suitable change of variables.

This is a nonrandom problem, but the Monte Carlo method approximates a solution by introducing a random vector U that is Un((0,1)n). Applying the function f to U, we obtain a random variable f(U). By [3.16] its expectation is

[5.31]

Comparing this expression for E[f(U)] with our original integral [5.30], we obtain a probabilistic expression for that non-probabilistic integral,

[5.32]

so random variable f(U) has mean ψ and some standard deviation σ. We define

[5.33]

as an unbiased estimator for ψ with standard error σ. This is a little unconventional, since [5.33] is an estimator that depends upon a sample {U[1]} of size 1, but it is a valid estimator nonetheless.

To estimate ψ with a standard error lower than σ, let’s generalize our estimator to accommodate a larger sample {U[1], U [2], … , U [m]}. Applying the function f to each of these yields m IID random variables, f(U [1]), f(U [2]), … , f(U [m]), each with expectation ψ and standard deviation σ. By [3.27] and [3.28], the generalization of [5.33]

[5.34]

is an unbiased estimator for ψ with standard error

[5.35]

If we have a realization {u[1], u[2], … , u[m]} for our sample, we may estimate ψ as

[5.36]

We call [5.34] the crude Monte Carlo estimator for ψ.

5.7.2 Implications of standard error

It is important to distinguish between the error

[5.37]

and standard error

[5.38]

of the crude Monte Carlo estimator H. The former is a random variable. The latter is a parameter of that random variable. Since

[5.39]

standard error is the standard deviation of error.

Formula [5.38] for standard error is important because neither its numerator σ nor its denominator  depends upon the dimensionality n of the integral ψ. The crude Monte Carlo estimator is a technique of numerical integration that is not subject to the curse of dimensionality.

The formula for standard error places no bounds on error. It only describes error probabilistically with a standard deviation. For a large sample size m, the central limit theorem applies, so error will be approximately normally distributed with mean 0 and standard deviation equal to the standard error. The absolute value of error will exceed standard error approximately 32% of the time. It will exceed twice the standard error approximately 4.5% of the time.

Exercises
5.2

A crude Monte Carlo estimator of some quantity ψ has a standard error of 7.5 when a sample size m = 1000 is used. What sample size should be used to achieve a standard error of 2?
Solution

5.3

Consider the definite integral

[5.40]

  1. Evaluate the integral analytically.
  2. Use change of variables formula [2.166] and change of variables

    [5.41]

    to convert the integral to form [5.30].
  3. Estimate the integral from part (b) with a crude Monte Carlo estimator four times, using samples of size 1, 10, 100, and 1000.
  4. Estimate the standard errors of your four estimators in parts (c).

Solution

5.4

Consider the definite integral

[5.42]

  1. Evaluate the integral analytically.
  2. Implement a change of variables u for x to convert the integral to form [5.30].
  3. Estimate the integral from part (b) with a crude Monte Carlo estimator four times, using samples of size 1, 10, 100, and 1000.
  4. Estimate the standard errors of your four estimators in parts (c).

Solution