# 11.6 Shortcomings of Historical Simulation

In value-at-risk measures that employ a standard Monte Carlo transformation procedure, there is an inference procedure, perhaps using UWMA or EWMA. There is also a procedure that generates a realization {1r[1], 1r[2], … , 1r[m]} of a sample for use in the Monte Carlo analysis. That procedure uses a random number generator.

Historical simulation replaces both of those functions with raw historical data. Doing so introduces two problems, which we describe below. Those problems go hand in hand: addressing one tends to exacerbate the other.

###### 11.6.1 Large Standard Errors

Historical simulation is a form of Monte Carlo analysis. As such, it entails standard error. Because a realization {1r[1], 1r[2], … , 1r[m]} of a sample for 1R is constructed directly from historical data, sample sizes m tend to be small. This produces large standard errors.

Historical simulation is routinely performed with historical samples of size m = 100. On the other extreme, two years of data might be the most that could reasonably be used. With approximately 250 trading days in a year, that translates into a sample size m of 500. With mirror values, that becomes 1000. In Exhibit 10.5, we assessed the standard error of a Monte Carlo analysis of value-at-risk for several hypothetical portfolios, assuming a sample size of m = 1000. The standard errors vary, but they are routinely around 5% or more. If the sample size drops to the more common m = 100, that 5% rises to almost 16%!
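The effect of sample size on standard error is easy to illustrate with a small experiment. The sketch below is hypothetical (it does not reproduce Exhibit 10.5): it repeatedly estimates a .95-quantile of loss from standard normal data and compares the variability of the estimates for m = 100 and m = 1000.

```python
import random
import statistics

random.seed(12345)

def quantile_estimate(m, q=0.95):
    """Crude Monte Carlo estimate of the q-quantile of loss from m draws."""
    losses = sorted(random.gauss(0.0, 1.0) for _ in range(m))
    return losses[int(q * m) - 1]  # order-statistic estimator

def relative_standard_error(m, trials=500):
    """Standard error of the estimator, expressed relative to its mean."""
    estimates = [quantile_estimate(m) for _ in range(trials)]
    return statistics.stdev(estimates) / statistics.mean(estimates)

for m in (100, 1000):
    print(m, round(relative_standard_error(m), 3))
```

With only 100 data points, the relative standard error is several times larger than with 1000, mirroring the pattern of the figures quoted above.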

With a standard Monte Carlo transformation procedure, value-at-risk can be calculated with a small standard error. This is because the sample size m can be made as large as desired. The only limiting factor is the processing time required to perform the Monte Carlo analysis. Processing time used to be a serious limitation, but as the cost of processing power drops, it is becoming less so. Using variance reduction, or the similar technique of selective valuation of realizations, standard errors can be further cut, usually dramatically. None of these techniques are compatible with historical simulation.

###### 11.6.2 Stale Historical Data

Markets change, sometimes gradually, and sometimes suddenly. Factors that cause them to change include:

• new technology,
• regulatory changes,
• altered perceptions in the wake of a scandal or crisis, or
• economic expansion or decline.

In rare cases, market data that is just a few months old may not be reflective of the same market today. This places historical simulations that use a year or more of historical data at a significant disadvantage compared to other value-at-risk measures: the data is often stale. A solution is to calculate historical value-at-risk using only the most recent data, but doing so exacerbates standard error.

# 11.5 Flawed Arguments for Historical Simulation

Wilson (1994b) offers the sort of argument that was made—and continues to be made—for historical simulation:

The method makes very few assumptions about the market price process generating the portfolio’s returns; it simply assumes that market price changes in the future are drawn from the same empirical distribution as the market price changes generated by the historical data. In fact, by using the empirical distribution, you avoid many of the problems inherent in explicitly modeling the evolution of market prices—for example, the fact that market prices tend to have “fatter tails” and to be slightly more skewed than predicted by the normal distribution, that correlations and volatilities can vary over time and so on. The main advantage of this method is therefore that model risk is not inadvertently introduced into the calculation of capital at risk.

Here there are actually two arguments:

1. That making “assumptions” as part of an analysis detracts from or otherwise renders that analysis suspect.
2. That historical data forms an “empirical distribution” from which the realization 1r of 1R can be assumed to be drawn.

Variants of the first argument arise in different contexts. It is more a debating trick than an argument: if people don’t understand a concept, they may dismiss the concept rather than acknowledge their ignorance. In this case, the concept is probabilistic modeling and the assumptions it entails. Historical simulation is theoretically flimsy precisely because it discards probabilistic modeling in favor of ad hoc, “assumption-free” calculations.

Here’s an example:

Suppose we have ten realizations of some random variable X, and we wish to estimate the .90-quantile of X. Without making any “assumptions”, we might estimate the .90-quantile of X as the second-highest of the ten data points.

But suppose further that we know the distribution of X is “bell shaped”. In this case, we might assume X is normal and estimate the .90-quantile of X as the .90-quantile of a normal random variable with mean and standard deviation equal to the sample mean and sample standard deviation of our raw data. Even if the underlying distribution is not exactly normal, modeling it as normal allows us to reflect its bell shape.

In this example, the “assumption” of normality allows us to incorporate into our analysis additional information not contained in the raw data. The result is an improved estimate.
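The point can be made concrete with a short sketch. The ten data values below are made up; `statistics.NormalDist` plays the role of the assumed bell-shaped model.

```python
import statistics

# Hypothetical data: ten draws from some bell-shaped distribution.
data = [10.2, 11.7, 9.8, 13.1, 10.9, 12.4, 9.1, 11.0, 12.9, 10.5]

# "Assumption-free" estimate: an order statistic, here the second-highest value.
empirical = sorted(data)[-2]

# Normal-model estimate: fit by sample mean and sample standard deviation,
# then read off the .90-quantile of the fitted distribution.
fitted = statistics.NormalDist(statistics.mean(data), statistics.stdev(data))
normal_estimate = fitted.inv_cdf(0.90)

print(empirical, round(normal_estimate, 3))
```

The normal-model estimate draws on all ten data points rather than a single order statistic, which is how the “assumption” adds information.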

The second argument—that historical data forms an “empirical distribution” from which the realization 1r of 1R can be assumed to be drawn—is also false. That “empirical distribution” would be discrete, with positive probabilities associated only with realizations from {1r[1], 1r[2], … , 1r[m]}. But we know 1R can—and very likely will—take on a realization 1r that is not contained in {1r[1], 1r[2], … , 1r[m]}. So an assumption that 1r will be drawn from the “empirical distribution” is clearly false.

More generally, the argument is undermined by issues we raised in Section 7.4: For calculating value-at-risk, we are interested in the distribution of 1R conditional on information available at time 0. Treated as some “empirical distribution”, historical data—and especially the leptokurtosis it tends to exhibit—is more reflective of the unconditional distribution of 1R. This is a serious problem for all inference procedures, but historical simulation offers no particular advantage over the rest. Any claim that this “empirical distribution” is a superior model for the conditional distribution of 1R is nonsense.

###### Exercises
11.1

Exhibit 11.5 presents 10 pseudorandom IID realizations of a χ2(12, 0) distribution. In this exercise, you will assume you don’t know what distribution the realizations were drawn from. You will compare two approaches for estimating the .90-quantile of the unknown distribution.

realizations
13.0409
11.3813
21.9963
9.5558
9.2049
18.7076
12.4828
13.4726
8.8054
17.2051
Exhibit 11.5: Pseudorandom realizations for use in Exercise 11.1.
a. Estimate the .90-quantile directly from the data by arranging the realizations in descending order of magnitude and estimating the .90-quantile as the mean of the largest two.
b. Suppose that, despite not knowing the underlying distribution, you know it is more-or-less bell-shaped. Based on this information, fit a normal distribution to the data using the sample mean and sample standard deviation. Estimate the .90-quantile of the underlying distribution as the .90-quantile of the normal distribution.
c. The actual .90-quantile of a χ2(12, 0) distribution is 18.55. Compare your results from parts (a) and (b). Which approach produced the better estimate?
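A sketch in Python for computing the two estimates of parts (a) and (b); the comparison in part (c) is left to the reader. `statistics.NormalDist` is used for the normal quantile.

```python
import statistics

# Data of Exhibit 11.5.
data = [13.0409, 11.3813, 21.9963, 9.5558, 9.2049,
        18.7076, 12.4828, 13.4726, 8.8054, 17.2051]

# Part (a): mean of the largest two realizations.
estimate_a = statistics.mean(sorted(data)[-2:])

# Part (b): .90-quantile of a normal distribution fitted by the sample
# mean and sample standard deviation.
fitted = statistics.NormalDist(statistics.mean(data), statistics.stdev(data))
estimate_b = fitted.inv_cdf(0.90)

print(round(estimate_a, 4), round(estimate_b, 4))
```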

# 11.4 Origins of Historical Simulation

Finger (2006) observes

When [bank] risk managers are asked why they opt for historical simulation, they usually respond with one or more of the following:

1. It is easy to explain.
2. It is conservative.
3. It is “assumption-free”.
4. It captures fat tails.
5. It gives me insight into what could go wrong.

As a result of the first two of these, and perhaps the third as well, there is another reason: “my regulator likes it.”

Noticeably absent from the list of reasons is the statement

1. It produces good risk forecasts.

The methodology of historical simulation was already widely familiar when J.P. Morgan publicly launched RiskMetrics in November 1994. Bank regulators had already developed a preference for the methodology. To understand why, some historical perspective may be helpful.

Within banks, there is a stark distinction between “profit centers” and “cost centers”. Profit centers earn money for a bank, so they command resources and the best employees. Cost centers don’t. During the 1990s, physics Ph.D.s flocked to Wall Street. But they were not put to work developing value-at-risk measures. The over-the-counter derivatives market was growing explosively, and the physicists’ math skills were needed to devise pricing and hedging strategies. Derivatives trading was a profit center. Financial risk management was not. Work on value-at-risk was—not always, but often—assigned to junior analysts or managers whose careers had been sidetracked. Mostly these people lacked quantitative skills. They struggled with concepts such as random vectors, statistical estimators, standard error or variance reduction. But historical simulation was different. It used no sophisticated mathematics. Anyone, so it seemed, could understand and implement the methodology.

A broad literature developed around value-at-risk. This included some outstanding articles—see references cited in this book—but these were the exception. Top academics mostly avoided value-at-risk as a subject for research, so articles and books tended to be written by less capable finance professors or practitioners with limited theoretical grounding. The website gloria-mundi.com is a bibliography for value-at-risk. It lists a staggering volume of items. Among its earlier entries, few are worth reading. A substantial number endorsed historical simulation.

In this context, bank regulators had to approve analytics for banks calculating value-at-risk under the Basel Accords. The regulators tended to have legal or accounting backgrounds, so they too lacked the quantitative skills to understand most value-at-risk measures or to sift through the turgid literature. For them, the “transparency” of historical simulation was appealing. They didn’t rule out other methodologies, but by the mid-1990s, they had wholeheartedly embraced historical simulation.

At the same time, banks were forming risk advisory groups to offer corporate clients free or inexpensive risk management consulting services. The business model was to sell over-the-counter derivatives or other financial services through consultative selling. Several groups offered value-at-risk analytics to complement their consulting. J.P. Morgan had RiskMetrics. CS First Boston offered a package called PrimeRisk. Bankers Trust had RAROC 2020 (risk-adjusted return on capital 2020). Chase Manhattan Bank had CHARISMA (Chase Risk Management Analyzer).

Chase’s CHARISMA calculated value-at-risk with a crude historical simulation employing just 100 days of historical data. Aggressive marketing of the system closely associated Chase with historical simulation and served to promote the methodology further.

These developments helped spread adoption of historical simulation. More importantly, they contributed to the ongoing acceptance of the methodology by bank regulators. More than anything else, that regulatory acceptance explains the widespread adoption of historical simulation by banks for regulatory purposes.

# 11.2 Generating Realizations Directly From Historical Market Data

Historical simulation requires that historical data {–αr, … , –2r, –1r, 0r} be converted into a realization {1r[1], 1r[2], … , 1r[m]} of 1R for use in a crude Monte Carlo analysis. Simplistically, historical values tr might be used directly as values 1r[k] of the realization. Let m = α + 1, and set

[11.1]

for all k.

But this would lead to a host of problems. The most obvious is that the resulting realization {1r[1], 1r[2], … , 1r[m]} may be inconsistent with current market prices, interest rates, spreads, implied volatilities, etc.

For example, suppose a key factor 1R1 represents a stock price whose current value 0r1 is EUR 120. If most of our historical data for the stock is from a period when it was trading near EUR 70, values for the realization would cluster around that EUR 70 price rather than the current EUR 120 price.

Our goal is to construct a realization {1r[1], 1r[2], … , 1r[m]} from historical data {αr, …, –2r, –1r, 0r} in a manner that captures the market dynamics reflected in the individual historical data points, but calibrated to current values 0ri of key factors.

Various approaches are possible. In practice, we employ a two-step process:

1. Apply a linear polynomial to convert the time series of historical data {–αr, … , –2r, –1r, 0r} into a new time series consistent with a realization of a white noise process. Intuitively, this retains the market dynamics of the original data, abstracted from the actual values tri of key factors when the data was captured.
2. Use another linear polynomial to convert that white noise time series into a realization {1r[1], 1r[2], … , 1r[m]}, adjusting each data point to be consistent with current values 0ri of key factors.

Construction of the linear polynomial for the first step parallels that of the white noise risk factor mapping we constructed in Section 7.3.3. The linear polynomial takes the form

[11.2]

where tb is a diagonal n × n matrix, and ta is a vector. We construct [11.2] component by component, depending upon what each component Ri of R represents.

If a component Ri represents a return with 0 conditional mean, we might consider it already white noise and set

[11.3]

so

[11.4]

tbi,i = 1

[11.5]

tai = 0

If Ri represents a return or spread that can be assumed to have a nonzero conditional mean, it is often reasonable to treat the value t–1ri from the previous period as the conditional mean and subtract it:

[11.6]

so

[11.7]

tbi,i = 1

[11.8]

tai = t –1ri

If Ri represents a price, interest rate, exchange rate, or implied volatility, it is most often reasonable to set

[11.9]

which is a simple return, so

[11.10]

tbi,i = 1 / t–1ri

[11.11]

tai = 1

In the vast majority of cases, all components of R can be addressed in one of the above three manners, so we may consider mapping [11.2] specified.

The polynomial for the second step is the inverse of [11.2] but with one change: for each element k, it is applied as of time 0.

[11.12]

This gives us a realization {1r[1], 1r[2], … , 1r[m]} constructed directly from historical data, but consistent with current market conditions.

By composing the two steps [11.2] and [11.12], we combine them into a single step:

[11.13]
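For a single price-type key factor, the composed transformation [11.13] amounts to computing each period's simple return and applying it to the current value 0r. A minimal sketch with hypothetical prices:

```python
# Hypothetical price history {-5r, ..., -1r, 0r}; the last value is current.
history = [119.2, 120.5, 119.8, 121.3, 120.6, 120.0]
current = history[-1]  # 0r

# Step 1: white noise values tw = tr / (t-1)r - 1 (the case of [11.9]).
white_noise = [history[t] / history[t - 1] - 1 for t in range(1, len(history))]

# Step 2: invert the transform as of time 0, so each point of the
# realization is consistent with the current value: 1r[k] = 0r (1 + w[k]).
realization = [current * (1 + w) for w in white_noise]

print([round(r, 2) for r in realization])
```

Each value of the realization clusters around the current price while preserving the day-to-day dynamics of the historical data.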

###### 11.2.1 Example

Today is January 11, 2002. Let R1 represent the value of 3-month CHF Libor. Exhibit 7.7 indicates 30 days of data, which you used for Exercise 7.2. Let’s use the data to construct an historical realization of a sample for 1R1.

It would be unreasonable to treat R1 as white noise. Its unconditional mean is not 0. Assume t|t–1μ1 = t–1r1 for all t, and transform R1 to white noise with

[11.14]

Applying the transformation to the data of Exhibit 7.7, we obtain white noise data shown in Exhibit 11.1.

Exhibit 11.1: White noise data calculated from the data of Exhibit 7.7 with transformation [11.6].

We apply the inverse transform for time t = 0,

[11.15]

to obtain the historical realization indicated in Exhibit 11.2. A histogram of the historical realization is indicated in Exhibit 11.3.

Exhibit 11.2: Historical realization of a sample for 1R1 for use in a historical simulation. Interest rates are expressed as percentages.
Exhibit 11.3: Histogram of the historical realization. The current CHF Libor rate of 1.673% is indicated in the exhibit with a dark blue triangle.
###### 11.2.2 Mirror Values

The histogram of Exhibit 11.3 highlights a common problem with historical simulation. In the example, we assumed that the expected value of 1R1 equals the current value 0r1, which is 1.673%. However, this is not the sample mean of the historical realization. In the histogram, most values fall to the left of 1.673%. This results from the fact that the historical data from which we derived the realization was from a period during which CHF Libor fell. It started the period at 2.130% and ended the period at the current value of 1.673%. On an average day during that period, Libor fell, and this is reflected in our historical realization.

Holton (1998) describes a solution to this problem whereby “mirror” values are added to the historical realization. For each value of the realization, add another value:

[11.16]

This doubles the size of the sample and ensures a sample mean of 1|0μ1. The solution assumes that the conditional distribution of 1R1 is symmetric.

We add mirror values to the historical realization of our example. A histogram of the new realization is indicated in Exhibit 11.4.

Exhibit 11.4: Histogram of the new historical realization obtained by adding mirror values to the historical realization depicted in the histogram of Exhibit 11.3.
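Adding mirror values is a one-line reflection about the assumed conditional mean. A sketch with hypothetical realization values (the 1.673 matches the example's current rate):

```python
current = 1.673                                    # assumed conditional mean
realization = [1.641, 1.598, 1.685, 1.612, 1.655]  # hypothetical values

# Reflect each value about the mean, doubling the sample size.
mirrored = realization + [2 * current - r for r in realization]

sample_mean = sum(mirrored) / len(mirrored)
print(round(sample_mean, 3))  # equals the assumed mean by construction
```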

# 11.3 Calculating Value-at-Risk With Historical Simulation

Historical simulation dispenses with an inference procedure. Or you could say that construction of a realization {1r[1], 1r[2], … , 1r[m]} from historical data is the inference procedure—it characterizes a distribution for 1R, not with some standard joint distribution—or perhaps a mean vector and covariance matrix—but with a realization of a sample. As with any value-at-risk measure, a mapping procedure is required. This may include one or more remappings. The Monte Carlo analysis is performed as in Section 10.2. The only difference is that the realization {1r[1], 1r[2], … , 1r[m]} is constructed as in Section 11.2 above, rather than with standard methods for constructing pseudorandom vectors, as described in Section 5.8.3. A crude Monte Carlo estimator is used because techniques of variance reduction and the method of selective valuation of realizations—both described in Section 10.5—are incompatible with historical simulation.
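Under heavy simplifications (a one-asset portfolio, a single price-type key factor, and a hypothetical price history), the whole procedure can be sketched in a few lines: build the historical realization as in Section 11.2, value the portfolio at each point, and take an empirical quantile of loss.

```python
history = [119.2, 120.5, 119.8, 121.3, 120.6, 120.0]  # prices; last is 0r
holding = 100                                          # units held

current_price = history[-1]
p0 = holding * current_price                           # 0p

# Historical realization of the key factor, consistent with the current price.
returns = [history[t] / history[t - 1] - 1 for t in range(1, len(history))]
realization = [current_price * (1 + w) for w in returns]

# Crude Monte Carlo over the historical realization: value the portfolio at
# each point, form losses, and estimate the .90-quantile of loss.
losses = sorted(p0 - holding * r for r in realization)
var_estimate = losses[int(0.90 * len(losses)) - 1]
print(round(var_estimate, 2))
```

With only five data points, the standard error is enormous, which is precisely the first shortcoming discussed in Section 11.6.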

# 10.6  Further Reading – Transformation Procedures

The mathematics of quadratic transformation procedures is well established except with regard to calculating quantiles of 1P. Two approaches are discussed in this book—the Cornish-Fisher expansion and the trapezoidal rule—but there are others. See, for example, Mina and Ulmer (1999). Jaschke (2001) provides an in-depth discussion of the Cornish-Fisher expansion and its use with quadratic transformations. Jaschke and Mathé (2004) compare two alternative methods, Fourier transforms and a method they propose based on the Monte Carlo method. They find the latter to be superior. Pichler and Selitsch (2000) compare several methods using simulated return data. See also Fuglsbjerg (2000).

Glasserman, Heidelberger and Shahabuddin (2000) describe alternative techniques of variance reduction for value-at-risk. In side-by-side testing, your author found these to be inferior to the methods of Cárdenas et al. (1999) described in Section 10.5. Glasserman (2003) describes other methods of variance reduction, which your author has not tested.

For variance reduction techniques that do not require key factors to be joint normal, see Pupashenko (2014) and Korn and Pupashenko (2015).

# 11.1  Motivation

One of the three “methods” early authors identified for calculating value-at-risk was called historical simulation or historical value-at-risk. A contemporaneous description of historical simulation is provided by Linsmeier and Pearson (1996). Updated to reflect our terminology and notation, it reads:

The distribution of profits and losses is constructed by taking the current portfolio, and subjecting it to the actual changes in the key factors experienced during each of the last α periods … Once the hypothetical mark-to-market profit or loss for each of the last α periods has been calculated, the distribution of profits and losses and the value-at-risk can then be determined.

Stated more formally, historical simulation employs the Monte Carlo method to calculate value-at-risk. But rather than construct a pseudorandom realization {1r[1], 1r[2], … , 1r[m]} of a sample for 1R, it constructs {1r[1], 1r[2], … , 1r[m]} directly from historical data for 1R.

In Section 10.4, we discussed transformation procedures that employ the Monte Carlo method with pseudorandom realizations. If a transformation procedure employs the Monte Carlo method with historical realizations, we call it an historical transformation procedure. Historical simulation is then use of an historical transformation procedure to calculate value-at-risk.

Historical simulation is controversial because it is ad hoc. In any situation where it might be applied, a better result can be obtained using a pseudorandom realization {1r[1], 1r[2], … , 1r[m]}, especially if one employs variance reduction or selective valuation of realizations (both discussed in Section 10.5).

So why mention historical simulation? The unfortunate truth is that historical simulation is popular, at least among banks. Pérignon and Smith (2010) report that, of banks that disclosed their methodology for calculating value-at-risk in 2005, 73% used historical simulation. Most of the rest—14%—used value-at-risk measures with Monte Carlo transformation procedures.

In this chapter, we describe how to construct a realization {1r[1], 1r[2], … , 1r[m]} from historical data—and how to use it to calculate value-at-risk. We then provide context with a brief history of historical simulation. We review arguments that have been made to support the methodology, and we explain why, not surprisingly, historical simulation is inferior to more standard approaches for calculating value-at-risk.


###### 10.5.5  Selective Valuation of Realizations

The computationally most expensive task in estimating value-at-risk with the Monte Carlo method is performing m valuations 1p[k] = θ(1r[k]). As we have seen, variance reduction can dramatically reduce the number of valuations that must be performed. Cárdenas et al. (1999) propose a complementary technique.

For estimating a standard deviation of 1P, the precise value of every realization 1p[k] is important. Every one contributes to sample standard deviation. For estimating a quantile of 1P, the precise value of only one realization 1p[k] is important—the one equal to the quantile being estimated. Unfortunately, we only find out which one that is after we have valued all the 1p[k]! We can avoid having to value every 1p[k] by employing a quadratic remapping = (1R) to identify realizations 1r[k] for which 1p[k] clearly exceeds the quantile. Since those values 1p[k] are unimportant, we may approximate them with values  = (1r[k]). We now formalize the technique.

Consider a portfolio (0p, 1P) with portfolio mapping 1P = θ(1R), where 1R ~ Nn(1|0μ, 1|0Σ). We construct a quadratic remapping  = (1R). Set  = 0p. We wish to estimate the portfolio’s q-quantile of loss, which we denote ψ. The corresponding q-quantile of loss for the remapped portfolio is denoted . It is calculated using the methods of Section 10.3.

We stratify ℝ into w disjoint subintervals ϑj based upon the conditional PDF of  as follows:

[10.76]

[10.77]

[10.78]

This is illustrated for a hypothetical conditional PDF for  in Exhibit 10.16.

Exhibit 10.16: A stratification of the real numbers into w subintervals as described in the text.

Based upon stratification

[10.79]

define a stratification

[10.80]

where

[10.81]

The 1 – q quantile of  is in Ω1 by construction. Based upon the approximation  ≈ 1P, we expect the 1 – q quantile of 1P to be in Ω1 as well; but we cannot be sure. Generate a realization {1r[1], 1r[2], … , 1r[m]} and calculate corresponding values  = (1r[k]). Based upon these, sort the 1r[k] into regions Ωj. For only those in regions Ω1 and Ω2, calculate 1p[k] = θ(1r[k]). For the rest, approximate 1p[k] with . Based upon the (exact or approximate) values 1p[k], estimate the value-at-risk of (0p, 1P).

The purpose of the region Ω2 is to play a buffer role, protecting against the possibility that the approximation of 1P is poor. If the approximation is good, all realized losses 1l[k] = 0p – 1p[k] for that region should be less than the estimated value-at-risk. If this is the case, then you are done. If not, improve your value-at-risk estimate as follows.

For all realizations 1r[k] in region Ω3, calculate exact portfolio values 1p[k] = θ(1r[k]). Based upon all values 1p[k] (which are now exact in regions Ω1, Ω2, and Ω3 but approximate in the other regions) estimate value-at-risk again. Letting Ω3 play a buffer role, apply the same test as before. If all realized losses for Ω3 are less than the new value-at-risk estimate, you are done. Otherwise, repeat the same procedure again, but with Ω4 playing the buffer role. Continue in this manner until an acceptable value-at-risk estimate is obtained.

Technically, this is not a variance reduction technique, but by dramatically reducing the number of portfolio valuations 1p[k] = θ(1r[k]) that must be performed, it has the same effect.
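The technique can be sketched under heavy simplifications: one key factor, a made-up "expensive" valuation θ, and a quadratic proxy standing in for the remapping. Only realizations whose proxy values fall in the lowest-value regions are valued exactly; the rest are approximated. All names and parameters are hypothetical.

```python
import random

random.seed(7)

def theta(r):
    """Stand-in for an expensive exact portfolio valuation."""
    return 100.0 + 10.0 * r - 2.0 * r * r

def theta_tilde(r):
    """Cheap quadratic proxy, deliberately close to theta."""
    return 100.0 + 10.0 * r - 2.0 * r * r + 0.01

realizations = [random.gauss(0.0, 1.0) for _ in range(1000)]
proxy = sorted((theta_tilde(r), r) for r in realizations)  # low values first

# Value exactly only the realizations in the lowest-value regions (here the
# lowest 10%, playing the role of regions Omega_1 and Omega_2); approximate
# the rest with their proxy values.
cutoff = len(proxy) // 10
values = [theta(r) if k < cutoff else p for k, (p, r) in enumerate(proxy)]

# Estimate the .95-quantile of loss from the exact-or-approximate values.
p0 = 100.0
losses = sorted(p0 - v for v in values)
var_estimate = losses[int(0.95 * len(losses)) - 1]
print(round(var_estimate, 3), "with", cutoff, "exact valuations")
```

In this sketch the quantile of loss falls among the exactly valued realizations, so the buffer test described in the text would pass on the first iteration.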


###### Exercises
10.6

Consider a portfolio (89,700, 1P) with physical and options positions in two underliers whose values are represented by key vector 1R ~ N2(1|0μ, 1|0Σ) where:

[10.82]

Active holdings ω = (800  –300  –100  250) are in four assets, where

[10.83]

All options expire at time 1.

a. Specify a primary mapping 1P = θ(1R).
b. Value 1P at the following nine realizations for 1R (the first equals 1|0μ; the rest are arranged about an ellipse centered at 1|0μ and were constructed as described in Section 9.3.8):

[10.84]

c. Apply the method of least squares to your results from item (b) to construct a quadratic remapping

[10.85]

Weight the realization (1.200, 1.600) five times as heavily as the rest.

d. Construct a scatter plot to assess how well  approximates 1P.
e. Specify a crude Monte Carlo estimator for 0std(1P). Use sample size m = 1000. Estimate 0std(1P).
f. Specify a control variate Monte Carlo estimator for 0std(1P). Use sample size m = 1000 and the fact that 0std() = 38,150. Estimate 0std(1P).
g. Specify a stratified Monte Carlo estimator for 0std(1P). Use sample size m = 1000 and stratification size w = 16. Use the values shown in Exhibit 10.17 for the conditional CDF of  (calculated using the methods of Section 10.3) to construct your stratification. Estimate 0std(1P).
Exhibit 10.17: Selected values of the conditional CDF of .
h. Estimate 0std(1P) 10 times using each of your estimators from items (e), (f), and (g). Based upon the results, construct a (very crude) estimate of the standard error of each estimator.
i. Based upon the estimated standard errors from item (h), estimate for each estimator the sample size required to achieve a 1% standard error.
j. Specify a crude Monte Carlo estimator for the 95% value-at-risk of portfolio (89,700, 1P). Use sample size m = 1000. Estimate the value-at-risk.
k. Specify a control variate Monte Carlo estimator for the 95% value-at-risk of portfolio (89,700, 1P). Use sample size m = 1000 and the fact that the .05 quantile of  is 21,770. Estimate the value-at-risk.
l. Specify a stratified Monte Carlo estimator for the 95% value-at-risk of portfolio (89,700, 1P). Use sample size m = 1000 and the fact that the .05 quantile of  is 21,770. Estimate the value-at-risk.
m. Estimate the 95% value-at-risk of portfolio (89,700, 1P) 10 times using each of your estimators from items (j), (k), and (l). Based upon the results, construct a (very approximate) estimate of the standard error of each estimator.
n. Based upon the estimated standard errors from item (m), estimate for each estimator the sample size required to achieve a 1% standard error.


###### 10.5.4  Stratified Sampling to Calculate Value-at-Risk

Cárdenas et al. (1999) propose the following method of stratified sampling to calculate value-at-risk. Consider a portfolio (0p, 1P) with portfolio mapping 1P = θ(1R), where 1R ~ Nn(1|0μ, 1|0Σ). We construct a quadratic remapping = (1R) and set  = 0p. We wish to estimate the portfolio’s q-quantile of loss, which we denote ψ. The corresponding q-quantile of loss for the remapped portfolio is denoted . It is calculated using the methods of Section 10.3.

To estimate ψ, we stratify ℝn into two regions, Ω1 and Ω2. Optimally, realizations 1r for which portfolio losses exceed the value-at-risk ψ should fall into one region, with the rest falling into the other region:

• Ω1 = {1r : 0p – θ(1r) ≤ ψ};
• Ω2 = {1r : 0p – θ(1r) > ψ}.

We will explain why this is optimal shortly. For now, we observe that the optimal stratification is impractical. Its definitions of Ω1 and Ω2 depend upon the portfolio’s value-at-risk ψ, which is what we are trying to estimate. As an alternative, we approximate the optimal stratification with one based upon the known value-at-risk of (,):

• Ω1 = {1r :  – (1r) ≤ };
• Ω2 = {1r :  – (1r) > }.

Let’s elaborate. To estimate a quantile of loss (q), it is sufficient to estimate the corresponding quantile (1 – q) of 1P, since

[10.63]

Directly specifying a stratified sampling estimator for values of  is difficult. We focus instead on devising a stratified sampling estimator for values of . If we can estimate (1p) for suitable values 1p based upon a single Monte Carlo analysis, we can estimate the quantile (1 – q) based upon that same analysis.

Define the indicator function

[10.64]

For example, I(x > 3) equals 1 if x = 5 but it equals 0 if x = 2. A crude Monte Carlo estimator for (1p) is

[10.65]

Since I( θ(1R) ≤ 1p) can only take on values 0 or 1, it makes sense to stratify with just two subregions, Ω1 and Ω2, of ℝn such that Ω1 primarily contains values 1r for which the indicator function equals 1, and Ω2 primarily contains values 1r for which it equals 0.

With such a stratification, define 1R1 = 1R | 1R ∈ Ω1 and 1R2 = 1R | 1R ∈ Ω2. Our estimator becomes

[10.66]

We equally weight realizations by setting

[10.67]

[10.68]

for a suitable value m. The estimator becomes

[10.69]

This has standard error

[10.70]

If I( θ(1R1) ≤ 1p) always equals 1 and I( θ(1R2) ≤ 1p) always equals 0, the standard error is 0. It is this observation that motivated the optimal stratification described earlier. Because that stratification is impractical, we resort to the related stratification based upon the remapped portfolio. Formally, we stratify ℝ into two unbounded intervals:

[10.71]

[10.72]

This is illustrated based upon a hypothetical conditional PDF for  in Exhibit 10.15.

Exhibit 10.15: A stratification of ℝ into two intervals.

Based upon stratification

[10.73]

define a stratification

[10.74]

where

[10.75]

To estimate the quantile (1 – q), generate realizations  and . Apply θ to all points  to obtain m = m1 + m2 realizations 1p[k] of 1P. Estimate (1 – q) as that value 1p[k] such that (1 – q)m of the values are less than or equal to it.
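A toy sketch of the scheme, under simplifying assumptions: a single standard normal key factor and a linear θ, so the remapped portfolio is exact and the stratification boundary is known in closed form. Sample sizes are chosen proportional to each region's probability, so pooled realizations are equally weighted. All names and sizes are hypothetical.

```python
import random
import statistics

random.seed(11)
nd = statistics.NormalDist()

def theta(r):
    """Portfolio mapping; linear, so the quantile sits at the boundary."""
    return 100.0 + 10.0 * r

q = 0.95
m = 1000

# Boundary in CDF space: the (1 - q)-quantile of the key factor, mimicking
# the boundary that would come from the remapped portfolio's value-at-risk.
boundary_u = 1 - q

# Equal weighting: sample sizes proportional to each region's probability.
m1 = int(boundary_u * m)   # region Omega_1 (the loss tail), probability 1 - q
m2 = m - m1                # region Omega_2, probability q

def draw(lo, hi):
    """Draw 1R conditional on its CDF value lying in (lo, hi)."""
    return nd.inv_cdf(random.uniform(lo, hi))

values = [theta(draw(0.0, boundary_u)) for _ in range(m1)] + \
         [theta(draw(boundary_u, 1.0)) for _ in range(m2)]

# Estimate the (1 - q)-quantile of 1P as the value with (1 - q)m of the
# values at or below it; value-at-risk is then 0p minus this quantile.
values.sort()
quantile = values[int((1 - q) * m) - 1]
print(round(quantile, 3))
```

Because every tail draw contributes to resolving the quantile, far fewer realizations are wasted on the uninformative center of the distribution than with crude Monte Carlo.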


###### 10.5.3  Stratified Sampling to Estimate Standard Deviation of Loss

Stratified sampling can also dramatically reduce the standard error of a Monte Carlo estimator for various PMMRs. A quadratic remapping guides us both in specifying a stratification and in selecting sample sizes mj for each subregion of the stratification. Our approach depends upon the PMMR to be estimated. Below, we consider standard deviation of loss and then value-at-risk.

Consider portfolio (0p, 1P) with 1P = θ(1R) and 1R ~ Nn(1|0μ, 1|0Σ). Construct a quadratic remapping = (1R), and set  = 0p. To estimate 0std(1L), we note

[10.50]

so it is sufficient to estimate 0var(1P). We stratify ℝ into w disjoint subintervals ϑj based upon the conditional PDF of  as follows. Since  is a quadratic polynomial of a joint-normal random vector, we may apply the methods of Section 10.3 to calculate its .01 and .99 quantiles, (.01) and (.99). Set

[10.51]

[10.52]

Define intervening subintervals ϑj, each of length

[10.53]

so

[10.54]

A stratification of size w = 8 is illustrated based upon a hypothetical conditional PDF for  in Exhibit 10.14. In practice, stratification sizes of between 15 and 30 may be appropriate.

Exhibit 10.14: A stratification of ℝ of size w = 8 constructed as described in the text.

Based upon stratification

[10.55]

define a stratification

[10.56]

where

[10.57]

Specifically, Ω j is the set of realizations 1r for 1R such that corresponding portfolio values 1p = (1r) are in ϑj. In mathematical parlance, each set Ω j is the preimage under  of the set ϑj.

Define w random vectors 1R j = 1R | 1R ∈ Ωj. That is, 1R j equals 1R conditional on 1R being in Ωj. Define 1P j = θ(1R j) for all j. Then 1P is a mixture, in the sense of Section 3.11, of the 1P j. Applying [3.128] and [3.129],

[10.58]

[10.59]

Given samples {, , … , } for the 1R j of respective sizes mj, we define an estimator for 0std(1P):

[10.60]

where

[10.61]

Since  is conditionally a quadratic polynomial of a joint-normal random vector, we can apply the methods of Section 10.3 to evaluate its conditional CDF, so the values pj can be calculated. With the exception of m1 and mw, all mj are set equal to each other. Specifically,

[10.62]

where the mj sum to m. The formula for m1 and mw is reasonable based upon empirical analyses.

Generate realizations  simultaneously for all j by generating realizations 1r[k] for 1R, and allocating each to one of the 1R j according to which set ϑj the corresponding realization (1r[k]) falls in. Stop when you have sufficient realizations for each 1R j. For some 1R j, you will have more than enough realizations, but extras can be discarded.
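The full estimator can be sketched for a toy case, under simplifying assumptions: a single standard normal key factor stratified into w = 8 equal-probability intervals (so each pj = 1/w and the mj are equal), and a hypothetical quadratic portfolio mapping. The mixture formulas correspond to [10.58] and [10.59].

```python
import random
import statistics

random.seed(3)
nd = statistics.NormalDist()

w = 8        # number of subintervals
m_per = 250  # realizations per subinterval

def theta(r):
    """Hypothetical quadratic portfolio mapping."""
    return 100.0 + 10.0 * r + 2.0 * r * r

# Stratify the key factor into equal-probability intervals, p_j = 1/w, and
# sample each conditional distribution by inverse-CDF on its sub-interval.
means, second_moments = [], []
for j in range(w):
    sample = [theta(nd.inv_cdf(random.uniform(j / w, (j + 1) / w)))
              for _ in range(m_per)]
    means.append(statistics.mean(sample))
    second_moments.append(statistics.mean(x * x for x in sample))

# Mixture moments: E[1P] = sum p_j E[1P_j]; E[1P^2] = sum p_j E[1P_j^2].
p = 1.0 / w
mean = sum(p * mj for mj in means)
std = (sum(p * sj for sj in second_moments) - mean * mean) ** 0.5
print(round(mean, 2), round(std, 2))
```

For this mapping the exact values are E[1P] = 102 and 0std(1P) = √108 ≈ 10.39; with only 2,000 valuations, the stratified estimate should land near both.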