# 14.4 Backtesting With Distribution Tests

As part of the process of calculating a portfolio’s value-at-risk, value-at-risk measures—explicitly or implicitly—characterize a distribution for ^{1}*P* or ^{1}*L*. That characterization takes various forms. A linear value-at-risk measure might specify the distribution for ^{1}*P* with a mean, standard deviation and an assumption that the distribution is normal. A Monte Carlo value-at-risk measure simulates a large number of values for ^{1}*P*. Any histogram of those values can be treated as a discrete approximation to the distribution of ^{1}*P*.

Distribution tests are goodness-of-fit tests that go beyond the specific quantile-of-loss a value-at-risk measure purports to calculate and more fully assess the quality of the ^{1}*P* or ^{1}*L* distributions the value-at-risk measure characterizes.

For example, a crude distribution test can be implemented by performing multiple coverage tests for different quantiles of ^{1}*L*. Suppose a one-day 95% value-at-risk measure is to be backtested. Our basic coverage test is applied to assess how well the value-at-risk measure estimates the 0.95 quantile of ^{1}*L*, but we don’t stop there. We apply the same coverage test to also assess how well the value-at-risk measure estimates the 0.99, 0.975, 0.90, 0.80, 0.70, 0.50 and 0.25 quantiles of ^{1}*L*. Collectively, these analyses provide a rudimentary goodness-of-fit test for how well the value-at-risk measure characterized the overall distribution of ^{1}*L*.
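The multi-quantile idea above can be sketched in a few lines of Python. This is an illustration only: the function names and the input layout (a dictionary mapping each quantile level to that level's daily forecast quantiles of loss) are assumptions, and the doubled-tail binomial p-value is a simple stand-in for the basic coverage test, not a reproduction of its non-rejection intervals.

```python
from math import comb

def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

def coverage_p_value(exceedances, n, q):
    """Two-sided p-value (twice the smaller tail) for the exceedance count
    of a q-quantile forecast over n days. Assumed stand-in test."""
    p = 1.0 - q  # exceedance probability under the null hypothesis
    lower = binom_cdf(exceedances, n, p)
    upper = 1.0 - binom_cdf(exceedances - 1, n, p)
    return max(0.0, min(1.0, 2.0 * min(lower, upper)))

def multi_quantile_coverage(losses, quantile_forecasts):
    """For each quantile level q, return a tuple
    (observed exceedances, expected exceedances, p-value).

    quantile_forecasts: hypothetical layout {q: [forecast q-quantile per day]}.
    """
    n = len(losses)
    report = {}
    for q, forecasts in quantile_forecasts.items():
        exceedances = sum(1 for loss, f in zip(losses, forecasts) if loss > f)
        report[q] = (exceedances, (1.0 - q) * n,
                     coverage_p_value(exceedances, n, q))
    return report
```

Calling `multi_quantile_coverage` with forecasts for the 0.99, 0.975, 0.95, 0.90, 0.80, 0.70, 0.50 and 0.25 quantiles yields one coverage result per level. Because the same loss data drive every level, the results are not independent tests.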

Various distribution tests have been proposed in the literature. Most employ the framework we describe below.

###### 14.4.1 Framework for Distribution Tests

While coverage tests assess a value-at-risk measure’s exceedances data ^{–α}*i*, ^{–α+1}*i*, … , ^{0}*i*, which is a series of 0’s and 1’s, most distribution tests consider loss data ^{–α}*l*, ^{–α+1}*l*, … , ^{0}*l*. Although it is convenient to assume exceedance random variables ^{t}*I* are IID, that assumption is unreasonable for losses ^{t}*L*.

A value-at-risk measure characterizes a CDF $^{t}\hat{F}$ for each ^{t}*L*. Treating probabilities as objective for pedagogical purposes, $^{t}\hat{F}$ is a forecast distribution we use to model the “true” CDF for each ^{t}*L*, which we denote $^{t}F$. Our null hypothesis is then $^{t}\hat{F} = {}^{t}F$ for all *t*.

Testing this hypothesis poses a problem: We are not dealing with a single forecast distribution modeling some single “true” distribution. The distribution changes from one day to the next, so each data point ^{t}*l* is drawn from a different probability distribution. This renders statistical analysis futile. We circumvent this problem by introducing a random variable ^{t}*U* for the quantile at which ^{t}*L* occurs:

$$^{t}U = {}^{t}\hat{F}\left(^{t}L\right) \qquad \text{[14.9]}$$

Here $^{t}\hat{F}$ is the forecast CDF the value-at-risk measure characterizes for ^{t}*L*. Assuming our null hypothesis, the ^{t}*U* are all uniformly distributed, ^{t}*U* ~ *U*(0,1). We assume the ^{t}*U* are independent. Applying [14.9], we transform our loss data ^{–α}*l*, ^{–α+1}*l*, … , ^{0}*l* into loss quantile data ^{–α}*u*, ^{–α+1}*u*, … , ^{0}*u*, which we treat as a realization *u*^{[0]}, … , *u*^{[α–1]}, *u*^{[α]} of a sample. This we can test for consistency with a *U*(0,1) distribution. Crnkovic and Drachman’s (1996) distribution test applied Kuiper’s statistic for this purpose.
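As a minimal sketch of the transformation to loss quantile data, suppose each day's forecast distribution is normal with a forecast mean and standard deviation, as a linear value-at-risk measure might specify. The function names and argument layout below are assumptions for illustration. The second function computes Kuiper's statistic in its standard textbook form, V = D⁺ + D⁻, against the *U*(0,1) CDF; it does not reproduce the critical values or any other details of Crnkovic and Drachman's test.

```python
from statistics import NormalDist

def pit_normal(losses, means, std_devs):
    """Loss quantile data u_t = F_hat_t(l_t), assuming each day's
    forecast distribution is normal with the given mean and std dev."""
    return [NormalDist(mu, sigma).cdf(loss)
            for loss, mu, sigma in zip(losses, means, std_devs)]

def kuiper_statistic(u_values):
    """Kuiper's V = D+ + D- for a sample tested against U(0,1)."""
    u = sorted(u_values)
    m = len(u)
    # D+: largest amount by which the empirical CDF exceeds the uniform CDF
    d_plus = max((j + 1) / m - x for j, x in enumerate(u))
    # D-: largest amount by which the uniform CDF exceeds the empirical CDF
    d_minus = max(x - j / m for j, x in enumerate(u))
    return d_plus + d_minus
```

Larger values of the statistic indicate loss quantile data that are less consistent with a uniform distribution.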

Some distribution tests—see Berkowitz (2001)—further transform the data ^{– α}*u*, ^{– α +1}*u*, … , ^{0}*u* by applying the inverse standard normal CDF Φ^{–1}:

$$^{t}N = \Phi^{-1}\left(^{t}U\right) \qquad \text{[14.10]}$$

Assuming our null hypothesis, the ^{t}*N* are identically standard normal, ^{t}*N* ~ *N*(0,1), so transformed data ^{–α}*n*, ^{–α+1}*n*, … , ^{0}*n* can be tested for consistency with a standard normal distribution.
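The inverse-normal transformation can be sketched with Python's standard library. The function name and the clipping parameter `eps` are assumptions added for illustration: clipping keeps quantiles of exactly 0 or 1, which can arise from a discrete forecast distribution, from mapping to infinite values.

```python
from statistics import NormalDist

def to_normal_scores(u_values, eps=1e-12):
    """Map loss quantile data u_t in [0, 1] to n_t = Phi^{-1}(u_t).

    Values are clipped to [eps, 1 - eps] before inversion (an assumed
    safeguard, not part of the transformation itself).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(min(max(u, eps), 1.0 - eps))
            for u in u_values]
```

For example, `to_normal_scores([0.5])` maps the median of the forecast distribution to a value of zero.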

Below, we introduce a simple graphical test of normality that can be applied. This will motivate a recommended standard test based on Filliben’s (1975) correlation test for normality. That is one of the most powerful tests for normality available.

###### 14.4.2 Graphical Distribution Test

Construct the ^{t}*n* as described above, and arrange them in ascending order. We adjust our notation, denoting *n*_{1} the lowest and *n*_{α+1} the highest, so *n*_{1} ≤ *n*_{2} ≤ … ≤ *n*_{α+1}. Next, define

$$\hat{n}_j = \Phi^{-1}\left(\frac{j - 0.5}{\alpha + 1}\right) \qquad \text{[14.11]}$$

for *j* = 1, 2, … , α + 1, where Φ is the standard normal CDF. The $\hat{n}_j$ are quantiles of the standard normal distribution, with a fixed 1/(α + 1) probability between consecutive quantiles. If our null hypothesis holds, and the *n*_{j} are drawn from a standard normal distribution, each *n*_{j} should fall near the corresponding $\hat{n}_j$. We can test this by plotting all points (*n*_{j}, $\hat{n}_j$) in a Cartesian plane. If the points tend to fall near a line with slope one, passing through the origin, this provides visual evidence for our null hypothesis.
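The points for such a plot can be prepared as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the standard normal quantiles use plotting positions (*j* – 0.5)/(α + 1), which is one choice consistent with a fixed 1/(α + 1) probability between consecutive quantiles. In place of a chart, the second function fits a least-squares line through the points so its slope and intercept can be compared with one and zero.

```python
from statistics import NormalDist

def qq_points(n_values):
    """Return the pairs (n_j, n_hat_j) for the ordered data, where n_hat_j
    is the standard normal quantile at (j - 0.5) / m (assumed positions)."""
    ordered = sorted(n_values)
    m = len(ordered)
    std_normal = NormalDist()
    return [(n_j, std_normal.inv_cdf((j - 0.5) / m))
            for j, n_j in enumerate(ordered, start=1)]

def fitted_line(points):
    """Least-squares slope and intercept of n_hat regressed on n."""
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

The pairs returned by `qq_points` can be passed to any plotting tool. A fitted slope far from one or an intercept far from zero corresponds to points that visibly depart from the reference line.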

###### 14.4.3 A Recommended Standard Distribution Test

We now introduce a recommended standard distribution test based on Filliben’s correlation test for normality. Construct pairs (*n*_{j}, $\hat{n}_j$) as described above, where the $\hat{n}_j$ are the standard normal quantiles of [14.11], and take the sample correlation of the *n*_{j} and $\hat{n}_j$. Sample correlation values close to one tend to support the null hypothesis.

Using the Monte Carlo method, we can determine non-rejection values for the sample correlation at various levels of significance. If the sample correlation falls below a non-rejection value, we reject the null hypothesis at the indicated level of significance. Non-rejection values for the .05 and .01 significance levels are indicated in Exhibit 14.6.

Suppose we are backtesting a one-day 99% value-at-risk measure based on α + 1 = 250 days of data. We calculate the *n*_{j} and $\hat{n}_j$ and find their sample correlation to be 0.993. Based on the values in Exhibit 14.6, we reject the value-at-risk measure at the .05 significance level but do not reject it at the .01 significance level.
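The correlation statistic and a Monte Carlo estimate of its non-rejection values might be sketched as below. All names are hypothetical, the plotting positions (*j* – 0.5)/(α + 1) are an assumption, and the simulation settings (`trials`, `seed`) are arbitrary, so values produced this way will not exactly reproduce Exhibit 14.6.

```python
import random
from statistics import NormalDist

def normal_quantiles(m):
    """Standard normal quantiles at (j - 0.5) / m for j = 1..m (assumed)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((j - 0.5) / m) for j in range(1, m + 1)]

def pearson(x, y):
    """Sample (Pearson) correlation of two equal-length sequences."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def qq_correlation(n_values, quantiles=None):
    """Correlation between the ordered data and the normal quantiles."""
    ordered = sorted(n_values)
    if quantiles is None:
        quantiles = normal_quantiles(len(ordered))
    return pearson(ordered, quantiles)

def non_rejection_value(m, significance, trials=2000, seed=0):
    """Estimate the correlation below which the null hypothesis is
    rejected, by simulating standard normal samples of size m."""
    rng = random.Random(seed)
    quantiles = normal_quantiles(m)
    sims = sorted(
        qq_correlation([rng.gauss(0.0, 1.0) for _ in range(m)], quantiles)
        for _ in range(trials)
    )
    # Empirical lower quantile of the simulated correlations
    return sims[int(significance * trials)]
```

A backtest would compare `qq_correlation` of the transformed loss data with `non_rejection_value(len(data), 0.05)` and reject the value-at-risk measure if the correlation is lower. Increasing `trials` reduces the simulation noise in the estimated non-rejection value.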

###### Exercises

Why is it unreasonable to assume losses ^{– α}*L*, ^{– α +1}*L*, … , ^{–1}*L* are IID?

Solution

In applying our recommended standard distribution test with 750 days of data, the sample correlation of the *n*_{j} and $\hat{n}_j$ is found to be 0.995. Do we reject the value-at-risk measure at the .05 significance level?

Solution