6.5 Data Biases

Data biases are distortions in data that fall short of being data errors. As market theoreticians, we like to think of historical prices as forming an unbroken series of numbers, each representing the price at which an efficient market cleared at a particular point in time. Data may contain errors, but we filter and clean it to eliminate these.

As practitioners, we accept a different view. While certain data values are clearly erroneous, others are not so easily categorized. From an operational standpoint, data is erroneous if our filtering procedure identifies it as such. Otherwise it is correct. Subjectively, things are not so stark. We perceive a continuum of gradations between correct and erroneous data. A filtering procedure discards certain values as erroneous, but the remainder may reflect a variety of modest data biases or distortions. Isolated distortions, so long as they are minor, have little effect on value-at-risk measurements. More problematic are data biases of a systematic nature.

In any market, there are bid-ask spreads. These bias transaction prices upward when a market maker is selling and downward when a market maker is buying. Such data biases are inconsequential in liquid markets, so long as daily price fluctuations dwarf bid-ask spreads. They are more problematic in illiquid markets, where bid-ask spreads can be large.
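To make the comparison between spreads and daily fluctuations concrete, here is a minimal sketch in Python. The function names, the assumption that trades print exactly half a spread away from mid, and the numbers used are all illustrative and are not taken from any particular market.

```python
def transaction_price(mid: float, spread: float, dealer_sells: bool) -> float:
    """Price at which a trade prints, assuming it occurs at the bid or the offer.

    Assumption: the quote is symmetric about `mid`, so the trade prints half a
    spread above mid when the market maker sells and half a spread below mid
    when the market maker buys.
    """
    half_spread = spread / 2.0
    return mid + half_spread if dealer_sells else mid - half_spread


def spread_to_move_ratio(spread: float, mid: float, daily_vol: float) -> float:
    """Half-spread divided by a typical daily price move.

    `daily_vol` is a fractional daily standard deviation (0.01 means 1%).
    A ratio near zero suggests the bid-ask bias is negligible; a ratio that
    is a sizeable fraction of one suggests it is not.
    """
    return (spread / 2.0) / (mid * daily_vol)


# Hypothetical liquid and illiquid assets, both priced at 100 with 1% daily volatility.
liquid_ratio = spread_to_move_ratio(spread=0.01, mid=100.0, daily_vol=0.01)
illiquid_ratio = spread_to_move_ratio(spread=1.00, mid=100.0, daily_vol=0.01)
print(liquid_ratio, illiquid_ratio)
```

Under these made-up inputs, the bias is a fraction of a percent of a typical daily move for the liquid asset, but about half of one for the illiquid asset.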

Transaction costs reduce arbitrage opportunities, which increases the potential for modest data biases or discrepancies to persist in markets.

If an asset is credit sensitive, different transaction prices may reflect differing counterparty credit qualities or differing collateralization arrangements. Even if credit quality is not an issue, liquidity effects can introduce data biases. Two bonds may be issued by similar credits—or both by the same credit—but one trades at a premium to the other because it has a more active secondary market. This is evident in the US Treasury bond market, where the most recently issued “on the run” bonds trade at a premium.

Settlement options introduce data biases in physical commodity and energy markets, as prices may vary depending upon which settlement method is selected. Standard methods include:

  • free on board (FOB)—the product is cleared through customs and delivered on board the recipient’s ship at the port of departure.
  • free alongside (FAS)—FOB delivery, except the product is delivered alongside the recipient’s ship.
  • cost, insurance, freight (CIF)—FOB delivery as well as insurance and shipping. Essentially, the commodity is delivered to the destination port, but actual settlement is through delivery of the ship’s bill of lading.
  • ex-dock—the commodity is cleared through customs and delivered on the dock of the destination port.
  • in-warehouse—the commodity is cleared through customs and delivered in a warehouse at the destination port.
  • ex-warehouse—the delivering party provides in-warehouse delivery and pays the cost of moving the commodity to the warehouse exit.

In addition, transaction prices may reflect additional services such as storage, inventory management, or balancing fees. Payment terms may also influence prices.

The data biases described above are all associated with transaction prices. As an alternative, firm or indicative prices may be available reflecting some standard settlement and payment terms, and mid-market prices can be calculated from these. Another solution is to apply standard adjustments to transaction prices to make them comparable. Prices reflecting different settlement methods might be adjusted to make them all consistent with, say, ex-dock settlement. If a market is active, it may be reasonable to use some average of prices, for example by recording each day’s closing price as the average price of all transactions completed during the last minute of trading. Individual prices may reflect specific data biases, but these will partly cancel in the averaged price.
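The three remedies just described can be sketched in a few lines of Python. This is only an illustration: the adjustment amounts in `EX_DOCK_ADJUSTMENT`, the function names, and the sample trades are hypothetical, and in practice the adjustments would depend on freight, insurance, customs, and handling costs for the commodity and route in question.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical per-unit amounts added to a price to restate it on an ex-dock basis.
EX_DOCK_ADJUSTMENT = {"FOB": 3.00, "CIF": 0.50, "ex-dock": 0.00}


def mid_market(bid: float, ask: float) -> float:
    """Mid-market price from a firm or indicative two-sided quote."""
    if ask < bid:
        raise ValueError("ask must not be below bid")
    return (bid + ask) / 2.0


def to_ex_dock(price: float, settlement: str) -> float:
    """Restate a transaction price as if it had settled ex-dock."""
    return price + EX_DOCK_ADJUSTMENT[settlement]


def closing_price(trades, close: datetime,
                  window: timedelta = timedelta(minutes=1)) -> float:
    """Average price of the trades time-stamped within `window` before `close`.

    `trades` is an iterable of (timestamp, price) pairs.
    """
    recent = [price for stamp, price in trades if close - window <= stamp <= close]
    if not recent:
        raise ValueError("no trades in the closing window")
    return mean(recent)


close = datetime(2024, 1, 2, 16, 0, 0)
trades = [
    (close - timedelta(minutes=5), 98.0),    # outside the window, ignored
    (close - timedelta(seconds=40), 100.5),  # market maker sold near the offer
    (close - timedelta(seconds=10), 99.5),   # market maker bought near the bid
]
print(mid_market(99.5, 100.5))       # 100.0
print(to_ex_dock(97.0, "FOB"))       # 100.0
print(closing_price(trades, close))  # 100.0
```

In the sample data, one trade prints near the offer and one near the bid, so the two bid-ask biases offset each other in the averaged closing price.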

Substituting the price of one asset for that of a closely related asset can introduce data biases. Forward and futures prices for the same underlier often move in tandem, especially if they have identical or similar settlement terms. However, they can diverge. One reason is market segmentation. If large transactions that would move the futures market are always transacted on the forward market, arbitraging between the two markets may be infeasible. Another reason is a phenomenon that has come to be known as convexity bias.

Cox, Ingersoll and Ross (1981) and Jarrow and Oldfield (1981) suggest that daily margin payments on futures may cause forward and futures prices to diverge. If there is a correlation between daily futures prices and interest rates, one party to a futures contract will tend to receive margin payments on days when interest rates rise and make margin payments on days when interest rates decline. On average, she can expect to invest the margin payments she receives at interest rates that are higher than those at which she finances the margin payments she makes. The other party can expect the opposite experience. This should cause a divergence in forward and futures prices, with the effect depending upon the maturity of contracts, the magnitude of correlations, and the volatility of futures prices.
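A stylized two-day example illustrates the mechanism. It is a sketch under strong assumptions, not the model of the authors cited above: the futures price moves by one unit per day, there are only two equally likely paths, the overnight rate is 6% after an up move and 4% after a down move (perfect positive correlation), and margin balances are invested or financed overnight at that rate. The function name and all numbers are hypothetical.

```python
def terminal_margin_balance(daily_pnl, overnight_rates, day_count=360):
    """Accumulate daily margin flows, investing or financing the running
    balance overnight at the rate prevailing on each day."""
    balance = 0.0
    for pnl, rate in zip(daily_pnl, overnight_rates):
        balance += pnl                     # margin received (+) or paid (-) today
        balance *= 1.0 + rate / day_count  # one night of interest at today's rate
    return balance


# Two equally likely price paths for a long position; the net price change is zero.
up_then_down = [+1.0, -1.0]
down_then_up = [-1.0, +1.0]


def expected_balance(sign, rates_up_first, rates_down_first):
    """Average terminal balance over the two paths; sign=+1 long, sign=-1 short."""
    up = terminal_margin_balance([sign * x for x in up_then_down], rates_up_first)
    down = terminal_margin_balance([sign * x for x in down_then_up], rates_down_first)
    return 0.5 * (up + down)


# Rates correlated with prices: 6% on the day of an up move, 4% on the day of a down move.
long_bias = expected_balance(+1, [0.06, 0.04], [0.04, 0.06])
short_bias = expected_balance(-1, [0.06, 0.04], [0.04, 0.06])

# Rates uncorrelated with prices: a constant 5%.
flat_bias = expected_balance(+1, [0.05, 0.05], [0.05, 0.05])

print(long_bias, short_bias, flat_bias)
```

With these inputs, the long position ends with a small positive expected balance and the short position with an equal negative one, even though the price ends where it started. With a constant rate, the expected balance is zero. A forward contract, which settles only at maturity, generates no such interim cash flows.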

Empirical studies by Cornell and Reinganum (1981), French (1983), and Park and Chen (1985) confirm a modest convexity bias in gold, silver, silver coins, platinum, copper, and plywood prices, but fail to find one for various currencies.

As we might expect, the effect is most pronounced when a futures contract’s underlier is an interest rate or fixed-income instrument that exhibits a high correlation with applicable interest rates. In this context, Burghardt and Hoskins (1995) coined the name “convexity bias.” They and Gupta and Subrahmanyam (2000) discuss the convexity bias in the pricing of interest-rate swaps. The swap market traditionally priced swaps directly off Eurocurrency futures without recognizing any convexity bias, but this started to change in the early 1990s. Today, swap prices reflect a significant convexity bias relative to Eurocurrency futures.

Taxes may also introduce data biases. Discrepancies can exist across jurisdictions or within a jurisdiction.

Between 1953 and 1963, the US Treasury issued a number of Treasury bonds with a special feature. If tendered as payment of federal estate taxes, the bonds would be valued at par, irrespective of their current market value. Because of their association with estate taxes and funerals, the bonds came to be called “flower bonds.” When interest rates rose during the 1970s and 1980s and bond prices fell below par, flower bonds traded at a premium to comparable Treasury bonds. The last flower bond had a coupon of 3.5% and matured in 1998.

Many tax-related data biases are unintentional. If they offer opportunities for tax arbitrage, they are likely to soon be legislated out of existence. Cornell (1981) and Viswanath (1989) describe one such effect that existed until 1981 in the US Treasury bill futures market.