14.5 Backtesting With Independence Tests

Independence tests are a form of backtest that assesses some form of independence in a value-at-risk measure’s performance from one period to the next. Independence of exceedances ${}^{t}I$ and independence of loss quantiles ${}^{t}U$ are separate forms of independence that might be tested for. We have already seen that coverage tests assume the former and most distribution tests assume the latter. If a value-at-risk measure fails an independence test, that can cast doubt on coverage or distribution backtest results obtained for that value-at-risk measure.

There is no way to directly test for independence, so null hypotheses address specific properties of independence—say exceedances not clustering or loss quantiles not being autocorrelated. Accordingly, backtests for independence can be judged, among other things, based on how broad their null hypotheses are.

14.5.1 Christoffersen’s 1998 Exceedance Independence Test

Christoffersen’s (1998) independence test is a likelihood ratio test that looks for unusually frequent consecutive exceedances, i.e. instances when both ${}^{t-1}i = 1$ and ${}^{t}i = 1$ for some t. The test is well known, since it was first proposed in an often-cited endorsement of testing for independence of exceedances.

Extending our earlier notation q* for the coverage of a value-at-risk measure, we define

$$q^*_0 = \Pr\left({}^{t}I = 0 \mid {}^{t-1}I = 0\right)$$

$$q^*_1 = \Pr\left({}^{t}I = 0 \mid {}^{t-1}I = 1\right)$$

These are the value-at-risk measure’s conditional coverages: its actual probabilities of not experiencing an exceedance given that it did not (in the case of $q^*_0$) or did (in the case of $q^*_1$) experience an exceedance in the previous period. Our null hypothesis $\mathcal{H}_0$ is that $q^*_0 = q^*_1 = q^*$.

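If a value-at-risk measure is observed for α + 1 periods, there will be α pairs of consecutive observations (${}^{t-1}i$, ${}^{t}i$). Disaggregate these as

$$\alpha = \alpha_{00} + \alpha_{01} + \alpha_{10} + \alpha_{11}$$

where $\alpha_{00}$ is the number of pairs (${}^{t-1}i$, ${}^{t}i$) of the form (0, 0); $\alpha_{01}$ is the number of the form (0, 1); etc. We want to test if

$$\frac{\alpha_{00}}{\alpha_{00} + \alpha_{01}} \approx \frac{\alpha_{10}}{\alpha_{10} + \alpha_{11}}$$
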
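which would support our null hypothesis. We apply a likelihood ratio test as follows. Assuming $\mathcal{H}_0$ doesn’t hold, we estimate the conditional coverages $q^*_0$ and $q^*_1$ with

$$\hat{q}^*_0 = \frac{\alpha_{00}}{\alpha_{00} + \alpha_{01}}$$

$$\hat{q}^*_1 = \frac{\alpha_{10}}{\alpha_{10} + \alpha_{11}}$$

Assuming $\mathcal{H}_0$ does hold, we estimate q* with

$$\hat{q}^* = \frac{\alpha_{00} + \alpha_{10}}{\alpha}$$

Our likelihood ratio is

$$\Lambda = \frac{(\hat{q}^*)^{\alpha_{00} + \alpha_{10}} \, (1 - \hat{q}^*)^{\alpha_{01} + \alpha_{11}}}{(\hat{q}^*_0)^{\alpha_{00}} \, (1 - \hat{q}^*_0)^{\alpha_{01}} \, (\hat{q}^*_1)^{\alpha_{10}} \, (1 - \hat{q}^*_1)^{\alpha_{11}}}$$

and $-2\log(\Lambda)$ is approximately centrally chi-squared with one degree of freedom, that is $-2\log(\Lambda) \sim \chi^2(1,0)$, assuming $\mathcal{H}_0$. The 0.95 quantile of the $\chi^2(1,0)$ distribution is 3.841, so we reject $\mathcal{H}_0$ at the .05 significance level if $-2\log(\Lambda) \geq 3.841$. Similarly, we reject it at the .01 significance level if $-2\log(\Lambda) \geq 6.635$.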
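
The calculation above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function names `count_pairs` and `independence_statistic` are hypothetical, the pair counts in the usage lines are made-up numbers, and the choice to raise an error whenever any pair count is zero is a conservative assumption made here, not something the test's author prescribes.

```python
import math


def count_pairs(exceedances):
    """Count consecutive pairs (previous, current) in a list of 0/1 exceedance indicators.

    Returns the counts in the order (a00, a01, a10, a11).
    """
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, curr in zip(exceedances, exceedances[1:]):
        counts[(prev, curr)] += 1
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def independence_statistic(a00, a01, a10, a11):
    """Return -2*log(Lambda) for the four pair counts."""
    # Assumption of this sketch: treat the statistic as undefined if any
    # count is zero, since a logarithm of zero would otherwise be needed.
    if min(a00, a01, a10, a11) <= 0:
        raise ValueError("statistic treated as undefined when a pair count is zero")
    alpha = a00 + a01 + a10 + a11
    q0 = a00 / (a00 + a01)   # estimated coverage after a non-exceedance
    q1 = a10 / (a10 + a11)   # estimated coverage after an exceedance
    q = (a00 + a10) / alpha  # pooled estimate under the null hypothesis
    log_null = (a00 + a10) * math.log(q) + (a01 + a11) * math.log(1 - q)
    log_alt = (a00 * math.log(q0) + a01 * math.log(1 - q0)
               + a10 * math.log(q1) + a11 * math.log(1 - q1))
    return -2.0 * (log_null - log_alt)


# Hypothetical pair counts, chosen only to illustrate the call.
stat = independence_statistic(85, 5, 5, 5)
print("statistic:", stat)
print("reject at .05:", stat >= 3.841)
print("reject at .01:", stat >= 6.635)
```

Working with log-likelihoods instead of forming Λ directly avoids underflow when the counts are large.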

The test largely depends on the frequency with which consecutive exceedances are experienced. As these are inherently rare events, the test has limited power. Also, the test isn’t defined when there are no consecutive exceedances at all, which is common. Christoffersen doesn’t address this situation. In some cases it may be reasonable to simply accept the null hypothesis when there are no consecutive exceedances, but not always. For example, if you backtest a one-day 90% value-at-risk measure with 1,000 days of data, there should be about 10 instances of consecutive exceedances. If there are none, it might be inappropriate to accept the null hypothesis.
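
The figure of about 10 in this example can be checked directly, assuming exceedances are independent and the measure’s coverage is exactly 0.90, so that each of the 999 pairs of consecutive days is a double exceedance with probability 0.10 × 0.10:

$$E[\alpha_{11}] = 999 \times (1 - 0.90)^2 = 9.99 \approx 10$$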

14.5.2 A Recommended Standard Loss-Quantile Independence Test

For a recommended standard test, we assess the independence of the values ${}^{t}N$ obtained by applying the inverse standard normal CDF to the loss quantiles ${}^{t}U$:

$${}^{t}N = \Phi^{-1}\left({}^{t}U\right) \qquad [14.21]$$

Note that this is the same transformation we made with [14.10]. As before, given loss quantile data ${}^{m}u, {}^{m+1}u, \ldots, {}^{-1}u$, we apply [14.21] to obtain values ${}^{m}n, {}^{m+1}n, \ldots, {}^{-1}n$.

We adopt the null hypothesis that the autocorrelations

$$\rho_k = \operatorname{corr}\left({}^{t}N, {}^{t-k}N\right)$$

are all 0 for lags k = 1, 2, 3, 4 and 5. We test this hypothesis by calculating the sample autocorrelations of our data ${}^{m}n, {}^{m+1}n, \ldots, {}^{-1}n$ for those same five lags. Our test statistic is the maximum of the absolute values of the five sample autocorrelations. We reject the null hypothesis at the .05 significance level if the test statistic exceeds the non-rejection value indicated for sample size α + 1 in Exhibit 14.7.
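
As a rough illustration of the test statistic, the following Python sketch transforms loss quantiles with the inverse standard normal CDF and takes the largest absolute sample autocorrelation over lags 1 to 5. The function names are hypothetical, and the particular sample autocorrelation estimator used (lagged products of mean-centered values divided by the full sum of squares) is an assumption, since the text does not specify one. The comparison against a non-rejection value is left to the reader, because those values come from Exhibit 14.7.

```python
from statistics import NormalDist


def sample_autocorrelation(x, k):
    """Sample autocorrelation of the list x at lag k (one common estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return numer / denom


def max_abs_autocorrelation(loss_quantiles, lags=(1, 2, 3, 4, 5)):
    """Test statistic: largest absolute sample autocorrelation over the given lags."""
    # inv_cdf raises an error unless every quantile lies strictly between 0 and 1.
    inv_cdf = NormalDist().inv_cdf
    n_values = [inv_cdf(u) for u in loss_quantiles]
    return max(abs(sample_autocorrelation(n_values, k)) for k in lags)


# Hypothetical loss quantiles that alternate between low and high values,
# which produces strong negative autocorrelation at lag 1.
statistic = max_abs_autocorrelation([0.2, 0.8] * 10)
print("test statistic:", statistic)
```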

Exhibit 14.7: Non-rejection values for the recommended standard independence test at the .05 and .01 significance levels. If the test statistic exceeds the non-rejection value, the null hypothesis is rejected at the indicated significance level.

Non-rejection values were calculated for each sample size α + 1 with a Monte Carlo analysis that found the 0.95 (for the .05 significance level) or 0.99 (for the .01 significance level) quantile for the test statistic assuming the null hypothesis.
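
A Monte Carlo analysis of this kind might look like the sketch below. It is not the procedure used to produce Exhibit 14.7: the number of trials, the random number generator, the seed, the simple empirical quantile rule, and the autocorrelation estimator are all assumptions made for illustration, and the null hypothesis is simulated here as independent standard normal values. The function names are hypothetical.

```python
import random


def max_abs_autocorrelation(x, lags=(1, 2, 3, 4, 5)):
    """Largest absolute sample autocorrelation of x over the given lags."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    return max(
        abs(sum(centered[t] * centered[t - k] for t in range(k, n)) / denom)
        for k in lags
    )


def non_rejection_value(sample_size, level=0.05, trials=1000, seed=0):
    """Estimate the (1 - level) quantile of the test statistic under the null.

    Assumption: under the null, the transformed values are simulated as
    independent standard normal draws.
    """
    rng = random.Random(seed)
    stats = sorted(
        max_abs_autocorrelation([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
        for _ in range(trials)
    )
    # Simple empirical quantile: the order statistic at position (1 - level) * trials.
    index = min(trials - 1, int((1.0 - level) * trials))
    return stats[index]


# Illustrative call for a hypothetical sample size of 250.
print("estimated .05 non-rejection value:", non_rejection_value(250))
```

More trials give a more stable estimate at the cost of run time; the default of 1,000 is kept small only so the sketch runs quickly.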


In Christoffersen’s 1998 independence test, $\alpha_{01}$ routinely equals $\alpha_{10}$. Why is this, and what would cause them to differ?


A value-at-risk measure is to be backtested using Christoffersen’s 1998 independence test. Based on 250 days of exceedance data, $\alpha_{00} = 237$, $\alpha_{01} = \alpha_{10} = 5$, and $\alpha_{11} = 2$. Do we reject the value-at-risk measure at the .10 significance level?


A value-at-risk measure is to be backtested using our recommended standard independence test and 500 days of data. Values ${}^{t}n$ are calculated, and their sample autocorrelations are determined to be 0.034, –0.078, –0.124, 0.107 and 0.029 for lags 1 through 5, respectively. Do we reject the value-at-risk measure at the .05 significance level?