# 7.2 Selecting Key Factors

A judicious choice of the financial variables to be represented with key factors can simplify the task of designing an inference procedure. Considerations are detailed below.

###### 7.2.1 Data availability

We seek a key vector for which there is historical data: $\{{}^{-\alpha}\boldsymbol{r}, \ldots, {}^{-2}\boldsymbol{r}, {}^{-1}\boldsymbol{r}, {}^{0}\boldsymbol{r}\}$. We also consider the quality of data that will be available. Issues include:

- frequency and nature of data errors;
- magnitude and nature of data biases;
- frequency of missing data;
- synchronicity of data;
- nature of prices (transaction, firm, indicative or settlement); and
- future availability of data.
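As a minimal sketch of how one of these issues, the frequency of missing data, might be quantified for a candidate key factor, the following Python summarizes gaps in a series. The function name `missing_data_summary`, the convention of marking a missing observation with `None`, and the sample values are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def missing_data_summary(series: Sequence[Optional[float]]) -> dict:
    """Summarize missing observations (marked as None) in a historical series.

    Returns the number of missing values, their fraction of the series,
    and the longest run of consecutive missing values.
    """
    missing = 0
    longest_gap = 0
    current_gap = 0
    for value in series:
        if value is None:
            missing += 1
            current_gap += 1
            longest_gap = max(longest_gap, current_gap)
        else:
            current_gap = 0
    fraction = missing / len(series) if series else 0.0
    return {"missing": missing, "fraction": fraction, "longest_gap": longest_gap}


# Hypothetical daily observations of an interest rate (percent); None marks a gap.
sample = [5.10, 5.12, None, None, 5.09, None, 5.11, 5.13]
summary = missing_data_summary(sample)
print(summary)
```

A long run of consecutive gaps is usually more troublesome than the same number of isolated gaps, which is why the sketch reports both the overall fraction and the longest run.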

###### 7.2.2 Stationarity and homoskedasticity

Time series modeling is facilitated if stochastic processes are covariance stationary and conditionally homoskedastic. We don’t require that key factors exhibit these properties. If they don’t, it is desirable that they be transformable into related risk factors that do. Given the nature of markets, conditional homoskedasticity may be unachievable.

We may reasonably insist that key factors not exhibit conditional heteroskedasticity that arises from structural causes. Two types of prices are susceptible to such structural conditional heteroskedasticity:

- prices of instruments that mature,
- prices of instruments that have optionality.

Bonds, futures, and options are examples of instruments that mature. As an instrument approaches maturity, its price behavior changes. The result is unconditional heteroskedasticity and corresponding conditional heteroskedasticity. For example, a bond’s duration declines as it approaches maturity. This causes the standard deviation of its price to diminish. It also affects the degree to which the price correlates with other financial variables. For this reason, we may model constant-maturity interest rates as key factors instead of bond prices.
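The duration effect can be illustrated with a rough Python sketch. It assumes a zero-coupon bond priced with continuous compounding, for which duration equals time to maturity, and it assumes the standard deviation of the yield stays constant as the bond ages. Those assumptions, along with the 5% yield and 10 basis point yield standard deviation, are illustrative only; the first-order approximation used is price standard deviation ≈ duration × price × yield standard deviation.

```python
import math


def zero_price(yield_rate: float, years: float) -> float:
    """Price per unit of face value of a zero-coupon bond (continuous compounding)."""
    return math.exp(-yield_rate * years)


def approx_price_std(yield_rate: float, years: float, yield_std: float) -> float:
    """First-order approximation: price std = duration * price * yield std."""
    duration = years  # for a zero under continuous compounding, duration = time to maturity
    return duration * zero_price(yield_rate, years) * yield_std


# Assumed inputs: 5% yield and a yield standard deviation of 10 basis points.
maturities = [10.0, 5.0, 2.0, 1.0, 0.25]
price_stds = [approx_price_std(0.05, t, 0.001) for t in maturities]

for t, s in zip(maturities, price_stds):
    print(f"{t:5.2f} years to maturity: approximate price std = {s:.6f}")
```

Under these assumptions, the approximate price standard deviation shrinks steadily as the time to maturity falls, even though the yield's own standard deviation never changes.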

Option prices experience structural conditional heteroskedasticity related to their nonlinearity. Exhibit 7.2 illustrates the price of a London Metal Exchange (LME) 3-month call option on copper as a function of the underlier price. Because of the “hockey stick” shape of the graph, we expect the option’s price to fluctuate more when it is in-the-money than when it is out-of-the-money. This is confirmed empirically with historical data in Exhibit 7.3.

In Exhibit 7.3, price data is provided for copper and a (constant maturity) 3-month call option on copper struck at USD 2250 per ton. In June 1996, the price of copper fell sharply. The conditional standard deviation of the option’s price diminished as the price of copper fell below the strike price. This is apparent in the option’s P&Ls.
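To see why the option's price reacts less once the underlier falls below the strike, consider the following Python sketch. It is not the calculation behind the exhibits: it assumes an undiscounted Black (1976) formula applied to the copper price, a 25% implied volatility, and underlier levels of USD 2600 and USD 1900 chosen to sit on either side of the USD 2250 strike. All of these are hypothetical inputs.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(forward: float, strike: float, vol: float, years: float) -> float:
    """Undiscounted Black (1976) value of a call on a forward price."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return forward * norm_cdf(d1) - strike * norm_cdf(d2)


STRIKE = 2250.0  # USD per ton, as in the text
VOL = 0.25       # assumed implied volatility
YEARS = 0.25     # constant 3-month maturity


def price_change(forward: float, move: float = 10.0) -> float:
    """Change in option value for a given move in the underlier price."""
    return (black_call(forward + move, STRIKE, VOL, YEARS)
            - black_call(forward, STRIKE, VOL, YEARS))


change_itm = price_change(2600.0)  # underlier above the strike
change_otm = price_change(1900.0)  # underlier below the strike

print(f"Option value change for a USD 10 move, in-the-money:     {change_itm:.2f}")
print(f"Option value change for a USD 10 move, out-of-the-money: {change_otm:.2f}")
```

The same USD 10 move in copper produces a much larger change in option value when the option is in-the-money, so identical underlier volatility translates into very different option price volatility depending on where the underlier sits relative to the strike.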

Because of structural conditional heteroskedasticity, option prices are generally not modeled as key factors. If a portfolio holds options, underlier prices and implied volatilities are better behaved as key factors.

###### 7.2.3 Structural relationships

Risk factors are more than disparate random variables. They may exhibit complex relationships that need to be captured in how we characterize their joint probability distribution. Our choice of key factors may facilitate this. Suppose a portfolio is exposed to both the 3-month US Treasury rate and 3-month USD LIBOR. We might model both as key factors, treating each as lognormally distributed. Doing so would not preclude the Treasury rate exceeding the LIBOR rate. An alternative approach that avoids a negative Treasury-Eurodollar (TED) spread is to model the Treasury rate and the TED spread as key factors. Now, if each is lognormally distributed, the TED spread cannot become negative, because a lognormal random variable is strictly positive. LIBOR, recovered as the sum of the Treasury rate and the TED spread, then always exceeds the Treasury rate.
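A small simulation can make the contrast between the two choices of key factors concrete. The Python sketch below is illustrative only: the medians, volatilities, and sample size are hypothetical, and in the first approach the two rates are assumed to be independent purely to keep the example short (in practice they would be strongly correlated, which would reduce but not eliminate negative spreads).

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
N = 10_000


def lognormal(median: float, sigma: float) -> float:
    """Draw a lognormal value with the given median and log-standard deviation."""
    return median * math.exp(sigma * rng.gauss(0.0, 1.0))


# Approach A: Treasury rate and LIBOR are both key factors (assumed independent here).
negative_spreads_a = 0
for _ in range(N):
    treasury = lognormal(0.050, 0.10)
    libor = lognormal(0.053, 0.10)
    if libor < treasury:
        negative_spreads_a += 1

# Approach B: Treasury rate and TED spread are the key factors; LIBOR is derived.
negative_spreads_b = 0
for _ in range(N):
    treasury = lognormal(0.050, 0.10)
    ted = lognormal(0.003, 0.30)
    libor = treasury + ted  # LIBOR recovered from the two key factors
    if ted <= 0.0:
        negative_spreads_b += 1

print(f"Approach A: {negative_spreads_a} of {N} scenarios have a negative TED spread")
print(f"Approach B: {negative_spreads_b} of {N} scenarios have a negative TED spread")
```

In the second approach, the structural relationship is enforced by construction rather than left to the estimated joint distribution, which is the point of choosing the spread as a key factor.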

###### 7.2.4 Consistency over time

We seek consistency over time in both:

- our set of key factors, and
- the definitions of those key factors.

We don’t want to have to change our key factors to accommodate changes in the portfolio’s composition. Accordingly, we seek a general set of key factors that can remain the same as a portfolio’s composition changes.

We also prefer that definitions for key factors be as consistent as possible over time. Every data series is operationally defined by the specific operations by which it is gathered and/or calculated. Such operational procedures may change over time. The definitions of indices such as the S&P 500 or FTSE 100 change as specific issues are added or dropped. Similarly, if a price series is constructed by averaging indicative quotes from five dealers, replacing one of the dealers with a new one represents a change in the operational definition of that risk factor.