11.5 Flawed Arguments for Historical Simulation

Wilson (1994b) offers the sort of argument that was made—and continues to be made—for historical simulation:

The method makes very few assumptions about the market price process generating the portfolio’s returns; it simply assumes that market price changes in the future are drawn from the same empirical distribution as the market price changes generated by the historical data. In fact, by using the empirical distribution, you avoid many of the problems inherent in explicitly modeling the evolution of market prices—for example, the fact that market prices tend to have “fatter tails” and to be slightly more skewed than predicted by the normal distribution, that correlations and volatilities can vary over time and so on. The main advantage of this method is therefore that model risk is not inadvertently introduced into the calculation of capital at risk.

Here there are actually two arguments:

  1. That making “assumptions” as part of an analysis detracts from or otherwise renders that analysis suspect.
  2. That historical data forms an “empirical distribution” from which the realization 1r of 1R can be assumed to be drawn.

Variants of the first argument arise in different contexts. It is more a debating trick than an argument: If people don’t understand a concept, they may dismiss the concept rather than acknowledge their ignorance. In this case, the concept is probabilistic modeling and the assumptions it entails. Historical simulation is theoretically flimsy precisely because it discards probabilistic modeling in favor of ad hoc, “assumption-free” calculations.

Here’s an example:

Suppose we have ten realizations of some random variable X, and we wish to estimate the .90-quantile of X. Without making any “assumptions”, we might estimate the .90-quantile of X as the value of the second largest of the ten data points.

But suppose further that we know the distribution of X is “bell shaped”. In this case, we might assume X is normal and estimate the .90-quantile of X as the .90-quantile of a normal random variable with mean and standard deviation equal to the sample mean and sample standard deviation of our raw data. Even if the underlying distribution is not exactly normal, modeling it as normal allows us to reflect its bell shape.

In this example, the “assumption” of normality allows us to incorporate into our analysis additional information not contained in the raw data. The result is an improved estimate.
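The two estimates in this example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names and the sample values are hypothetical, and the direct estimate here averages the two largest points, which is only one of several reasonable order-statistic conventions for ten data points.

```python
from statistics import NormalDist, mean, stdev


def direct_estimate(data):
    """'Assumption-free' estimate: mean of the two largest data points."""
    two_largest = sorted(data, reverse=True)[:2]
    return mean(two_largest)


def normal_fit_estimate(data, q=0.90):
    """Fit a normal distribution by sample mean and sample standard
    deviation, then return its q-quantile."""
    fitted = NormalDist(mean(data), stdev(data))
    return fitted.inv_cdf(q)


# Hypothetical sample of ten realizations (made up for illustration only).
sample = [9.1, 12.4, 7.8, 15.2, 10.6, 13.9, 8.5, 11.7, 18.3, 10.0]

print("direct estimate:    ", direct_estimate(sample))
print("normal-fit estimate:", normal_fit_estimate(sample))
```

The direct estimate depends on only two of the ten points, while the normal fit uses all ten through the sample mean and sample standard deviation.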

The second argument, that historical data forms an “empirical distribution” from which the realization 1r of 1R can be assumed to be drawn, is also false. That “empirical distribution” would be discrete, with positive probabilities associated only with realizations from {1r[1], 1r[2], … , 1r[m]}. But we know 1R can, and very likely will, take on a realization 1r that is not contained in {1r[1], 1r[2], … , 1r[m]}. So an assumption that 1r will be drawn from the “empirical distribution” is clearly false.
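The discreteness point can be checked numerically. The sketch below is illustrative only: it assumes a standard normal as a stand-in for the unknown continuous distribution, and the sample size m = 250 and the seed are arbitrary choices. It counts how many fresh draws coincide with a value already in the historical sample.

```python
import random

random.seed(0)  # arbitrary seed, used only for reproducibility

m = 250  # hypothetical number of historical realizations
history = [random.gauss(0.0, 1.0) for _ in range(m)]

# The "empirical distribution" puts probability 1/m on each of these
# values and zero probability everywhere else.
support = set(history)

# Draw fresh realizations from the same continuous distribution and
# count how many land exactly on a historical value.
fresh = [random.gauss(0.0, 1.0) for _ in range(10_000)]
hits = sum(x in support for x in fresh)

print("fresh draws equal to some historical value:", hits)
```

For a continuous distribution, one would expect essentially none of the fresh draws to coincide with a historical value, which is the sense in which a new realization is not "drawn from" the empirical distribution.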

More generally, the argument is undermined by issues we raised in Section 7.4: For calculating value-at-risk, we are interested in the distribution of 1R conditional on information available at time 0. Treated as some “empirical distribution”, historical data (and especially the leptokurtosis it tends to exhibit) is more reflective of the unconditional distribution of 1R. This is a serious problem for every inference procedure, and historical simulation offers no particular advantage over the rest. Any claim that this “empirical distribution” is a superior model for the conditional distribution of 1R is nonsense.


Exhibit 11.5 presents 10 pseudorandom IID realizations of a χ²(12, 0) distribution. In this exercise, you will assume you don’t know what distribution the realizations were drawn from. You will compare two approaches for estimating the .90-quantile of the unknown distribution.

Exhibit 11.5: Pseudorandom realizations for use in Exercise 11.1.
  1. Estimate the .90-quantile directly from the data by arranging the realizations in descending order of magnitude and estimating the .90-quantile as the mean of the largest two.
  2. Suppose that, despite not knowing the underlying distribution, you know it is more-or-less bell-shaped. Based on this information, fit a normal distribution to the data using the sample mean and sample standard deviation. Estimate the .90-quantile of the underlying distribution as the .90-quantile of the normal distribution.
  3. The actual .90-quantile of a χ²(12, 0) distribution is 18.55. Compare your results from parts 1 and 2 with this value. Which approach produced the better estimate?
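A single sample of ten says little about which approach tends to do better. As a hedged sketch, the comparison can be repeated over many simulated samples. The code below simulates its own data rather than using the values of Exhibit 11.5, and it relies on the fact that a chi-squared distribution with 12 degrees of freedom is a gamma distribution with shape 6 and scale 2. The function names, trial count, and seed are assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

TRUE_Q90 = 18.55  # .90-quantile of chi-squared(12), as given in the exercise


def chi2_12(rng):
    """One draw from chi-squared with 12 degrees of freedom,
    generated as gamma(shape=6, scale=2)."""
    return rng.gammavariate(6.0, 2.0)


def rmse_of_estimators(n_trials=2000, n=10, seed=1):
    """Return (rmse_direct, rmse_normal_fit) over repeated samples of size n."""
    rng = random.Random(seed)
    sq_direct = 0.0
    sq_normal = 0.0
    for _ in range(n_trials):
        data = [chi2_12(rng) for _ in range(n)]
        # Approach 1: mean of the two largest realizations.
        direct = mean(sorted(data)[-2:])
        # Approach 2: .90-quantile of a normal fitted by sample moments.
        fitted = NormalDist(mean(data), stdev(data)).inv_cdf(0.90)
        sq_direct += (direct - TRUE_Q90) ** 2
        sq_normal += (fitted - TRUE_Q90) ** 2
    return (sq_direct / n_trials) ** 0.5, (sq_normal / n_trials) ** 0.5


rmse_direct, rmse_normal = rmse_of_estimators()
print("RMSE, direct estimate:    ", rmse_direct)
print("RMSE, normal-fit estimate:", rmse_normal)
```

The root-mean-square errors summarize how far each estimator typically falls from 18.55. The figures depend on the seed and trial count, so they are best read as indicative rather than exact.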