Chapter 1

  1. Dowd (2005) discusses ETL metrics.
  2. Recall that standard deviation is the square root of variance.
  3. Gradient approximations are discussed in Section 2.3.
  4. As obtained with a Monte Carlo transformation.
  5. Some value-at-risk measures make simplifying assumptions that render the value of 0p unnecessary—it drops out of the calculations. Others either accept 0p as an input or calculate it based on current values of key factors.
  6. See Dale (1996) pp. 60 – 61 and Molinari and Kibler (1983) footnote 41.
  7. See Dale (1996), p. 78.
  8. The Basel Committee on Banking Supervision is a standing committee comprising representatives from central banks and regulatory authorities. Over time, the focus of the committee has evolved, embracing initiatives designed to define roles of regulators in cross-jurisdictional situations; ensure that international banks or bank holding companies do not escape comprehensive supervision by some “home” regulatory authority; and promote uniform capital requirements so banks from different countries may compete with one another on a “level playing field.” Although the Basel Committee’s recommendations do not themselves have force of law, G-10 countries have often implemented those recommendations as statutes or regulations.
  9. Personal correspondence with the author.
  10. These value-at-risk measures are described by Chew (1993).
  11. Founded in 1978, the Group of 30 is a nonprofit organization of senior executives, regulators, and academics. Through meetings and publications, it seeks to deepen understanding of international economic and financial issues. Results of the Price Waterhouse study are reported in Group of 30 (1994).
  12. This incident is documented in Shirreff (1992). See Corrigan (1992) for a full text of the speech.
  13. The name “dollars-at-risk” appears as early as Mark (1991) and “capital-at-risk” as early as Wilson (1992).
  14. The above discussion of RiskMetrics is based upon Guldimann (2000), the author’s own recollections, and private correspondence with Till Guldimann.

Chapter 2

  1. It should be apparent from context when parentheses ( ) are being used to indicate an interval as opposed to an ordered pair.
  2. It should be clear from context whether a prime indicates differentiation of a function as opposed to transposition of a vector or matrix.
  3. Our name reflects the method’s similarity to the method of ordinary least squares. See Exercise 2.16.
  4. In [2.61] if b = 0, set b = |b| = 0.
  5. Consider the equation z² + 4z + 5 = 0. Factoring the left side, we obtain (z + 2 + i)(z + 2 − i) = 0, indicating the two solutions z = −2 − i and z = −2 + i. Now consider the equation z² − 4z + 10 = 6. Subtracting 6 from both sides and factoring, we obtain (z − 2)(z − 2) = 0. This has two solutions, but they coincide. We say that the equation has the repeated solution z = 2.
  6. If the vertical-bar notation of [2.131] is unfamiliar to you, it is read as “evaluated at”, so the left side of the approximation indicates a derivative evaluated at a specific point x[0].
  7. See Dennis and Schnabel (1983) for a more sophisticated solution.
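The two equations in note 5 can be checked numerically. The following is a minimal sketch using the quadratic formula with Python's standard cmath module; the helper name quadratic_roots is hypothetical and is not taken from the text.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*z**2 + b*z + c = 0 (a != 0) as complex numbers."""
    disc = cmath.sqrt(b * b - 4 * a * c)  # complex square root of the discriminant
    return (-b - disc) / (2 * a), (-b + disc) / (2 * a)

# z^2 + 4z + 5 = 0: two distinct complex solutions
roots_complex = quadratic_roots(1, 4, 5)

# z^2 - 4z + 10 = 6, rewritten as z^2 - 4z + 4 = 0: a repeated solution
roots_repeated = quadratic_roots(1, -4, 4)

print(roots_complex)
print(roots_repeated)
```

A zero discriminant is what makes the two roots of the second equation coincide.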

Chapter 3

  1. For technical reasons, we should qualify [3.2] and say that it may fail to hold on a set of values for X of probability 0.
  2. Technically, f must be measurable for f (X) to be a random variable.
  3. The use of subscripts in the notation η1 and η2 for skewness and kurtosis is unfortunate because it can lead to confusion if subscripts are also employed to distinguish between different random variables. We use the notation because it is well established.
  4. We could force uniqueness by defining the q-quantile as the supremum of all values satisfying the definition provided in the text.
  5. An alternative would be to derive a mean vector based upon interest rate parity.
  6. A set of vectors is orthonormal if they are orthogonal and normalized (i.e., of length 1).
  7. Treatment of the noncentrality parameter is not standardized in the literature. Some authors define the parameter as in [3.114] but denote it simply δ. Others define the parameter differently, for example, taking a square root in [3.114] or dividing the sum by 2.
  8. The gamma function is defined for any y > 0. It is related to the factorial function by Γ(y) = (y – 1)! for any positive integer y.
  9. See Stoyanov (1997) for more counterexamples relating to the joint-normal distribution.
  10. We discuss random variate generation in Chapter 5.
  11. We lose no generality by assuming Σ is positive-definite. If Σ were singular, we could perform dimensional reduction as described in Section 3.6.1 to obtain a positive-definite joint-normal random vector.
  12. This and the analysis of Exhibit 3.28 were performed with the Monte Carlo method, which we describe in Chapter 5.
  13. See Spanos (1999) for a detailed discussion including historical notes.
  14. I am indebted to Arcady Novosyolov for this simplification.
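The relation between the gamma and factorial functions in note 8 can be spot-checked for small positive integers. This is only an illustrative sketch using Python's standard math module; the range of values tested is an arbitrary choice.

```python
import math

# Compare gamma(y) with (y - 1)! for positive integers y = 1, ..., 10.
checks = {
    y: math.isclose(math.gamma(y), math.factorial(y - 1))
    for y in range(1, 11)
}

# The gamma function is also defined at non-integer y > 0, where the factorial is not.
gamma_half = math.gamma(0.5)  # equals the square root of pi

print(checks)
print(gamma_half)
```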

Chapter 4

  1. See Holton (2004) for an in-depth discussion of subjective vs. objective probabilities in the context of risk.
  2. Usage of the term “sample variance” is inconsistent. Some authors define estimator [4.27] as the sample variance.
  3. We consider only discrete processes. With continuous processes, t takes on real values.
  4. Usage of the term “white noise” is not uniform. Some authors use the term to mean Gaussian white noise.
  5. Such a realization can be constructed using techniques of random variate generation described in Chapter 5.

Chapter 5

  1. Ulam and Teller were fierce rivals during their tenure at Los Alamos. Rota (1987) indicates that Ulam’s significant contribution to the design of the hydrogen bomb resulted coincidentally from his efforts to prove Teller’s design infeasible.
  2. This incident is described in Eckhardt (1987).
  3. W. S. Gosset, who published under the pen name “Student,” randomly sampled from height and middle-finger measurements of 3000 criminals to simulate two correlated normal distributions. He discusses this methodology in both Student (1908a) and Student (1908b).
  4. Laplace had previously described the potential for statistical sampling to approximate solutions to nonrandom problems, including the valuation of definite integrals. See Chapter V of his Théorie Analytique des Probabilités and a 1781 memoir, both available in his collected works published between 1878 and 1912.
  5. See Eckhardt (1987) and Metropolis (1987) for historical accounts of this early work.
  6. Metropolis and Ulam (1949).
  7. Buffon communicated this problem to the Academy in 1733. See Todhunter (1865) for an historical account of Buffon’s work.
  8. See Chapter V of his Théorie Analytique des Probabilités, available in his collected works published between 1878 and 1912.
  9. Fox’s experiment is reported by Hall (1873).
  10. The approximation may be too good. Gridgeman (1960) documents a number of historical implementations of the needle dropping experiment. He includes a statistical analysis of the plausibility of Fox’s reported results.
  11. In constructing a realization of a sample of size m, we have m degrees of freedom. These allow us to simultaneously satisfy m independent conditions. However, the existence of infinitely many possible tests means there are infinitely many conditions to satisfy. With a continuous distribution, infinitely many of the tests will be independent.
  12. The particular generator used in this analysis was the so-called DRAND48 linear congruential generator, which has parameters η = 2⁴⁸, a = 25,214,903,917 and c = 0. We discuss linear congruential generators next.
  13. Lehmer (1951) considers the case c = 0. Obviously, if c = 0 and z[k–1] = 0, then all subsequent values z[k], z[k+1], z[k+2], … will equal 0. This can’t happen as long as 0 is not used as a seed and η is not divisible by a.
  14. This is the name given to the generator in IBM’s System/360 Scientific Subroutine Package, Version III, Programmer’s Manual, 1968.
  15. See, for example, Park and Miller (1988).
  16. More generally, all that is required is that b not be divisible by η.
  17. Knuth (1997, p. 103) indicates that the number of calculations for dimension n is on the order of 3ⁿ. Fincke and Pohst (1985) provide a more detailed complexity analysis.
  18. See J. M. Hammersley and D. C. Handscomb (1964). Monte Carlo Methods, New York: John Wiley & Sons, p. 50.
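The recurrence behind notes 12 and 13 can be sketched in a few lines. The code below assumes the usual linear congruential form z[k] = (a·z[k–1] + c) mod η and uses the parameter values quoted in note 12 as defaults; the function names lcg and first_values are hypothetical, and the seeds are arbitrary illustrations.

```python
def lcg(seed, a=25_214_903_917, c=0, eta=2**48):
    """Yield z[k] = (a * z[k-1] + c) mod eta, starting from z[0] = seed."""
    z = seed
    while True:
        z = (a * z + c) % eta
        yield z

def first_values(seed, count, **params):
    """Collect the first `count` values produced from the given seed."""
    gen = lcg(seed, **params)
    return [next(gen) for _ in range(count)]

# With c = 0, a seed of 0 is degenerate: every later value is also 0 (note 13).
zero_seed = first_values(0, 5)

# An odd seed avoids that collapse; dividing by eta maps values into (0, 1).
odd_seed = first_values(1, 5)
uniforms = [z / 2**48 for z in odd_seed]

print(zero_seed)
print(uniforms)
```

Because a is odd and η is a power of 2, an odd seed produces only odd values, so the sequence never reaches 0.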

Chapter 6

  1. Time differences reflect periods when daylight saving time is nowhere in effect.
  2. The two exchanges have an offset arrangement that allows an open contract on one exchange to be closed on the other, so the two exchanges’ contracts are truly fungible.
  3. Open outcry trading for both contracts ends at 2:00 PM each day.
  4. Trading closes in Singapore at 7:00 PM local time.
  5. Log returns are used. Data from days when any of the exchanges were closed is omitted from the calculation. Results are inferred using the method of uniformly weighted moving averages (UWMA) discussed in Chapter 7.
  6. The CSCE is a subsidiary of the New York Board of Trade (NYBOT).
  7. A normalized delta is an option’s delta divided by the option’s notional amount. For vanilla options, normalized deltas are between –1 and 1.

Chapter 7

  1. The effect is partially offset if short-term interest rates are more volatile than long-term interest rates.
  2. Exponentially weighted moving average estimation had been used in time series analysis for some time. Zangari’s contribution is to propose its use in value-at-risk analyses.
  3. A histogram of a time series can be treated as a realization of a sample from the unconditional distribution of the underlying stochastic process if the process is strictly stationary.

Chapter 8

  1. Classic papers include French (1980), Gibbons and Hess (1981) and French and Roll (1986).
  2. If there are no intervening nontrading days, an overnight loan is a loan that commences today and matures tomorrow. A tom-next (tomorrow-next) loan commences tomorrow and matures the next day. A spot-next loan commences in 2 days (spot) and matures the next day. Such loans are convenient for extending an existing loan by a day.

Chapter 9

  1. For simplicity, we assume the portfolio is due to receive no fixed cash flows from caplets whose rate-determination dates have already passed.
  2. Notation 0E( ) indicates an expected value conditional on information available at time 0. See Section 0.4. The vertical bar to the right of each partial derivative is read “evaluated at”, so both partial derivatives are “evaluated at 0E(1R)”. See Section 2.2.4.
  3. Any such portfolio would also have exposures to interest rates and implied volatilities. For this example, we treat these as constant.
  4. Formula [9.30] defines an ellipsoid as long as 1|0Σ (and hence 1|0Σ⁻¹) is positive definite.
  5. A trivial solution is to space the points at equal intervals about the equator of the sphere. This works in all cases and is perfectly symmetrical, but it is uninteresting for our purpose.
  6. It is not uniquely defined. Two stable arrangements of 16 electrons are possible. However, one of these has a lower potential energy as defined by [9.35].
  7. The algorithm is not intended to reproduce the exact motion of l – 1 electrons. To the precision dictated by our stopping condition, the result will be a locally minimum-energy configuration.
  8. Corresponding put deltas are –.75, –.50, and –.25.
  9. These may have negative or imaginary values due to roundoff error.
  10. I am indebted to Craig Dibble, formerly of Bankers Trust, for bringing Garbade’s paper to my attention.
  11. If the basis point covariances seem large, remember that they are based on data from the 1980s.
  12. Because all eigenvectors have length 1, it is meaningful to directly compare variances of corresponding principal components.
  13. In practice, we might not apply a principal-component remapping to eliminate just two dimensions. We apply the remapping here for practice.
  14. For expositional convenience, we change our units of measure from the earlier example.

Chapter 10

  1. In Chapter 3, we adopted the inverse CDF notation Φ⁻¹(q) for quantiles. This was because, if a random variable has unique quantiles, they equal corresponding values of the inverse CDF. A random variable has unique quantiles for q ∈ (0,1) if it is continuous with a PDF that is nonzero on some interval (which can be unbounded or all of the real numbers) and zero elsewhere. In essentially all value-at-risk applications, random variables 1L and 1P have conditional distributions that satisfy this criterion. Contrived exceptions are possible; consider, for example, a portfolio composed entirely of expiring digital options.
  2. Value-at-risk measures that employ these have sometimes been called delta-gamma value-at-risk measures, reflecting an assumption that the transformation procedure would be preceded by a quadratic remapping based on delta-gamma approximations. The name is unfortunate because, as explained in Sections 9.3.6 – 9.3.7, quadratic remappings should be based on less localized approximations, such as those obtained by interpolation or the method of least squares.
  3. To clarify notation, in Section 1.8.1 we mathematically defined a portfolio as an ordered pair (0p, 1P). Accordingly, notation (53600, 1P) tells us that 0p = 53600.
  4. This is worth emphasizing. For a given sample size and value-at-risk metric, standard error depends entirely upon the PDF of 1P. The Monte Carlo transformation procedure works by constructing a realization of a sample for 1P. The actual mechanics of how that realization is constructed are unimportant for standard error. Factors such as the composition of the portfolio, the number of key factors upon which it depends, or the portfolio mapping affect standard error only to the extent that they shape the PDF of 1P. If we know the PDF of 1P, we don’t need to consider these other factors. Understand this, and you will understand why the Monte Carlo method does not suffer from the curse of dimensionality.
  5. Each Monte Carlo analysis was performed with sample size m = 1000. Standard errors for a sample size of m = 20000 were estimated by taking the sample standard deviation of the results and dividing by the square root of 20.
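The rescaling described in note 5 can be illustrated with a small experiment. The sketch below makes several assumptions that are not in the text: it uses a standard normal distribution as a stand-in for the distribution of 1P, a 0.05-quantile as the quantity being estimated, 200 repeated analyses, and an arbitrary seed. Only the sample sizes and the division by the square root of 20 come from the note.

```python
import math
import random
import statistics

def quantile_estimate(rng, m, q):
    """Estimate the q-quantile from a pseudorandom standard normal sample of size m."""
    sample = sorted(rng.gauss(0.0, 1.0) for _ in range(m))
    return sample[int(q * m)]  # simple order-statistic estimator

rng = random.Random(12345)  # arbitrary seed, fixed for reproducibility

# Repeat the Monte Carlo analysis many times with sample size m = 1000.
results = [quantile_estimate(rng, 1000, 0.05) for _ in range(200)]

# The sample standard deviation of the results estimates the standard error at m = 1000.
se_1000 = statistics.stdev(results)

# Standard error scales with 1/sqrt(m), so divide by sqrt(20000 / 1000) = sqrt(20).
se_20000 = se_1000 / math.sqrt(20000 / 1000)

print(se_1000, se_20000)
```

The shortcut relies on standard error being inversely proportional to the square root of the sample size, which avoids running the larger analyses repeatedly.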

Chapter 11

  1. Allen (1994) and Wilson (1994b) had already described variants of the approach. Also, as part of the public rollout of RiskMetrics, J.P. Morgan distributed a document entitled RiskMetrics – Directory of Products. This listed third-party vendor products that were compatible with RiskMetrics. One of the vendors, Sailfish, was indicated as offering historical simulation in addition to a value-at-risk measure styled on RiskMetrics.
  2. Wilson (1994b).
  3. Heron and Irving (1996).

Chapter 12

  1. Such a crude value-at-risk measure would probably treat implied volatilities as constant. It could capture vega effects by modeling implied volatilities as key factors. In my work, I have come across a number of linear value-at-risk measures inappropriately applied to non-linear portfolios. All lacked the sophistication to model implied volatilities as key factors.

Chapter 14

  1. Source: 2008 phone interview with Till Guldimann.
  2. Values of n greater than 1 generally don’t come into play.
  3. Since a continuous distribution is being used to approximate a discrete one, a case could be made that rounding the lower solution up and the higher one down would be more consistent with [14.8], but we present the test as Kupiec specified it.
  4. They refer readers to Press et al. (1992) for a description of Kuiper’s statistic.
  5. See Berkowitz and O’Brien (2002) and Pérignon, Deng and Wang (2008).