# 14.4 Backtesting With Distribution Tests

As part of the process of calculating a portfolio’s value-at-risk, VaR measures—explicitly or implicitly—characterize a distribution for ^{1}*P* or ^{1}*L*. That characterization takes various forms. A linear VaR measure might specify the distribution for ^{1}*P* with a mean, standard deviation and an assumption that the distribution is normal. A Monte Carlo VaR measure simulates a large number of values for ^{1}*P*. Any histogram of those values can be treated as a discrete approximation to the distribution of ^{1}*P*.

Distribution tests are goodness-of-fit tests that go beyond the specific quantile-of-loss a VaR measure purports to calculate and more fully assess the quality of the ^{1}*P* or ^{1}*L* distributions the VaR measure characterizes.

For example, a crude distribution test can be implemented by performing multiple coverage tests for different quantiles of ^{1}*L*. Suppose a one-day 95% VaR measure is to be backtested. Our basic coverage test is applied to assess how well the VaR measure estimates the 0.95 quantile of ^{1}*L*, but we don’t stop there. We apply the same coverage test to also assess how well the VaR measure estimates the 0.99, 0.975, 0.90, 0.80, 0.70, 0.50 and 0.25 quantiles of ^{1}*L*. Collectively, these analyses provide a rudimentary goodness-of-fit test for how well the VaR measure characterized the overall distribution of ^{1}*L*.
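As a rough sketch of this idea, the multiple-quantile coverage check might be implemented as follows. The helper names and the exact two-sided binomial test are illustrative choices, not a procedure prescribed in the text; the quantile estimates would come from the VaR measure being backtested.

```python
# Crude distribution test: repeat a coverage (exceedance-count) test at
# several quantiles of loss, not just the quantile the VaR measure reports.
# Helper names and the exact two-sided binomial test are illustrative.
from math import comb

def binomial_two_sided_pvalue(x, n, q):
    """Two-sided p-value for x exceedances in n days, where each day
    exceeds the q-quantile with probability 1 - q under the null."""
    p = 1.0 - q
    pmf = [comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(n + 1)]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return sum(v for v in pmf if v <= pmf[x] + 1e-15)

def coverage_test(losses, quantile_estimates, q, significance=0.05):
    """True (reject) if the exceedance count is inconsistent with Binomial(n, 1 - q)."""
    x = sum(l > est for l, est in zip(losses, quantile_estimates))
    return binomial_two_sided_pvalue(x, len(losses), q) < significance

def crude_distribution_test(losses, estimates_by_quantile, significance=0.05):
    """Apply the coverage test at each quantile of interest."""
    return {q: coverage_test(losses, est, q, significance)
            for q, est in estimates_by_quantile.items()}
```

Applied at, say, the 0.95, 0.90 and 0.75 quantiles simultaneously, rejections at quantiles other than 0.95 flag a VaR measure whose reported quantile may look fine but whose overall loss distribution is mischaracterized.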

Various distribution tests have been proposed in the literature. Most employ the framework we describe below.

###### 14.4.1 Framework for Distribution Tests

While coverage tests assess a VaR measure's exceedance data ^{–α}*i*, ^{–α+1}*i*, … , ^{0}*i*, which is a series of 0's and 1's, most distribution tests consider loss data ^{–α}*l*, ^{–α+1}*l*, … , ^{0}*l*. Although it is convenient to assume the exceedance random variables ^{t}*I* are IID, that assumption is unreasonable for the losses ^{t}*L*.

A VaR measure characterizes a CDF for each ^{t}*L*. Treating probabilities as objective for pedagogical purposes, that CDF, which we denote ^{t}*F̂*, is a forecast distribution we use to model the "true" CDF for each ^{t}*L*, which we denote ^{t}*F*. Our null hypothesis is then ^{t}*F̂* = ^{t}*F* for all *t*.

Testing this hypothesis poses a problem: we are not dealing with a single forecast distribution modeling some single "true" distribution. The distribution changes from one day to the next, so each data point ^{t}*l* is drawn from a different probability distribution. This renders direct statistical analysis futile. We circumvent the problem by introducing a random variable ^{t}*U* for the quantile at which ^{t}*L* occurs:

[14.9]  ^{t}*U* = ^{t}*F̂*(^{t}*L*)

Assuming our null hypothesis ^{t}*F̂* = ^{t}*F*, the ^{t}*U* are all uniformly distributed, ^{t}*U* ~ *U*(0,1). We assume the ^{t}*U* are independent. Applying [14.9], we transform our loss data ^{–α}*l*, ^{–α+1}*l*, … , ^{0}*l* into loss quantile data ^{–α}*u*, ^{–α+1}*u*, … , ^{0}*u*, which we treat as a realization *u*^{[0]}, … , *u*^{[α–1]}, *u*^{[α]} of a sample. This we can test for consistency with a *U*(0,1) distribution. Crnkovic and Drachman's (1996) distribution test applied Kuiper's statistic for this purpose.
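A minimal sketch of this transform-and-test step, assuming the VaR measure supplies each day's forecast CDF as a Python callable (the helper names are hypothetical):

```python
# Transform each loss through its day's forecast CDF ([14.9]) and test the
# resulting sample against U(0,1) with Kuiper's statistic. The forecast
# CDFs are hypothetical callables supplied by the VaR measure.

def loss_quantiles(losses, forecast_cdfs):
    """u_t = F_t(l_t): map each loss to its forecast quantile."""
    return [cdf(l) for cdf, l in zip(forecast_cdfs, losses)]

def kuiper_statistic(u):
    """Kuiper's V = D+ + D-, comparing the empirical CDF of u with U(0,1)."""
    n = len(u)
    s = sorted(u)
    d_plus = max((j + 1) / n - s[j] for j in range(n))
    d_minus = max(s[j] - j / n for j in range(n))
    return d_plus + d_minus
```

Unlike the Kolmogorov–Smirnov statistic, Kuiper's statistic sums the largest deviations in both directions, giving it equal sensitivity in the tails and the center of the distribution; critical values for V would still need to be taken from tables or simulation.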

Some distribution tests—see Berkowitz (2001)—further transform the data ^{– α}*u*, ^{– α +1}*u*, … , ^{0}*u* by applying the inverse standard normal CDF Φ^{–1}:

[14.10]  ^{t}*N* = Φ^{–1}(^{t}*U*)

Assuming our null hypothesis holds, the ^{t}*N* are identically standard normal, ^{t}*N* ~ *N*(0,1), so the transformed data ^{–α}*n*, ^{–α+1}*n*, … , ^{0}*n* can be tested for consistency with a standard normal distribution.
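The [14.10] transform is a one-liner with the standard library's `NormalDist`; the helper name `to_normal_scores` is illustrative:

```python
# Berkowitz-style second transform ([14.10]): map each loss quantile through
# the inverse standard normal CDF. Under the null hypothesis the results
# should look like draws from N(0,1). Helper name is illustrative.
from statistics import NormalDist

def to_normal_scores(u):
    """n_t = Phi^{-1}(u_t); each u_t must lie strictly between 0 and 1."""
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(u_t) for u_t in u]
```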

Below, we introduce a simple graphical test of normality. This will motivate our recommended standard test, which is based on Filliben's (1975) correlation test for normality, one of the most powerful normality tests available.

###### 14.4.2 Graphical Distribution Test

Construct the ^{t}*n* as described above, and arrange them in ascending order. We adjust our notation, denoting *n*_{1} the lowest and *n*_{α+1} the highest, so *n*_{1} ≤ *n*_{2} ≤ … ≤ *n*_{α+1}. Next, define

[14.11]  *m*_{j} = Φ^{–1}((*j* – 0.5)/(α + 1))

for *j* = 1, 2, … , α + 1, where Φ is the standard normal CDF. The *m*_{j} are quantiles of the standard normal distribution, with a fixed 1/(α + 1) probability between consecutive quantiles. If our null hypothesis holds, and the *n*_{j} are drawn from a standard normal distribution, each *n*_{j} should fall near the corresponding *m*_{j}. We can test this by plotting all points (*n*_{j}, *m*_{j}) in a Cartesian plane. If the points tend to fall near a line with slope one, passing through the origin, this provides visual evidence for our null hypothesis.
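The ordered pairs for such a plot can be computed as below, assuming evenly spaced quantiles *m*_{j} with 1/(α + 1) probability between consecutive quantiles; the actual plotting call is omitted:

```python
# Pairs for the graphical test: sort the n_t and pair each order statistic
# n_j with a standard normal quantile m_j, spaced 1/(alpha + 1) apart in
# probability. Helper name is illustrative.
from statistics import NormalDist

def qq_pairs(n_scores):
    """Return the (n_j, m_j) pairs to scatter-plot; plotting itself is omitted."""
    inv_cdf = NormalDist().inv_cdf
    ordered = sorted(n_scores)
    k = len(ordered)                      # k = alpha + 1 observations
    return [(ordered[j - 1], inv_cdf((j - 0.5) / k)) for j in range(1, k + 1)]
```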

###### 14.4.3 A Recommended Standard Distribution Test

We now introduce a recommended standard distribution test based on Filliben's correlation test for normality. Construct the pairs (*n*_{j}, *m*_{j}) as described above, and take the sample correlation of the *n*_{j} and *m*_{j}. Sample correlation values close to one tend to support the null hypothesis.

Using the Monte Carlo method, we can determine non-rejection values for the sample correlation at various levels of significance. If the sample correlation falls below a non-rejection value, we reject the null hypothesis at the indicated level of significance. Non-rejection values for the .05 and .01 significance levels are indicated in Exhibit 14.6.

Suppose we are backtesting a one-day 99% VaR measure based on α + 1 = 250 days of data. We calculate the *n*_{j} and *m*_{j} and find their sample correlation to be 0.993. Based on the values in Exhibit 14.6, we reject the VaR measure at the .05 significance level but do not reject it at the .01 significance level.

###### Exercises

Why is it unreasonable to assume losses ^{– α}*L*, ^{– α +1}*L*, … , ^{–1}*L* are IID?

Solution

In applying our recommended standard distribution test with 750 days of data, the sample correlation of the *n*_{j} and *m*_{j} is found to be 0.995. Do we reject the VaR measure at the .05 significance level?

Solution