Parameters describe random vectors much as we might use height or age to describe a person. Formally, a parameter is a function applied to a random vector's probability distribution. It may take on real, vector, or matrix values. Standard deviations, mean vectors, and covariance matrices are all examples of parameters. In this section, we describe parameters for random variables. In Section 3.4, we extend the discussion to parameters for random vectors.
Let X be a random variable. We denote the expected value, expectation, or mean of X as either μ or E(X). If X is discrete, we define its expectation as

E(X) = ∑_x x ϕ(x),  [3.3]

where ϕ is the PF of X. If X is continuous, we replace the summation with an integral and define

E(X) = ∫ x ϕ(x) dx,  [3.4]

where ϕ is the PDF of X and the integral is taken over the real line.
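As a sketch of the discrete case, the sum ∑_x x ϕ(x) can be computed directly from a PF. The fair-die PF below is an illustrative assumption, not an example from the text.

```python
def expectation(pf):
    """E(X) = sum over x of x * phi(x), for a discrete PF given as {x: phi(x)}."""
    return sum(x * p for x, p in pf.items())

# Assumed example: a fair six-sided die, phi(x) = 1/6 for x = 1, ..., 6.
die = {x: 1 / 6 for x in range(1, 7)}
mu = expectation(die)  # (1 + 2 + ... + 6) / 6 = 3.5
```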
Expectation is used to define a number of other parameters, but first we must discuss expectations of functions of random variables.
3.3.2 Expectation of a function of a random variable
Suppose X is a random variable and f is a function from ℝ to ℝ. Then f(X) is a new random variable whose probability distribution we can, at least in theory, infer from that of X. We do not need the probability distribution of f(X) in order to determine the expectation E[f(X)]. This can be obtained directly from the probability distribution of X using the formula

E[f(X)] = ∑_x f(x) ϕ(x)  [3.5]

or

E[f(X)] = ∫ f(x) ϕ(x) dx,  [3.6]

depending upon whether X is discrete or continuous. In a sense, [3.5] and [3.6] are generalizations of [3.3] and [3.4].
3.3.3 Variance and standard deviation
Variance is a parameter that measures how dispersed a random variable’s probability distribution is. In Exhibit 3.1, two PDFs have a mean of 0. The one on the left is more dispersed than the one on the right. It has a higher variance.
The variance, denoted σ² or var(X), of a random variable X is defined as an expectation of a function of X:

σ² = E[(X − μ)²],

where μ = E(X).
Standard deviation, denoted σ or std(X), is the positive square-root of variance.
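These two definitions translate directly into code; the fair-die PF below is an assumed example, not one from the text.

```python
from math import sqrt

def variance(pf):
    """var(X) = E[(X - mu)^2] for a discrete PF given as {x: phi(x)}."""
    mu = sum(x * p for x, p in pf.items())
    return sum((x - mu) ** 2 * p for x, p in pf.items())

# Assumed example: a fair six-sided die has variance 35/12.
die = {x: 1 / 6 for x in range(1, 7)}
var = variance(die)
std = sqrt(var)  # standard deviation is the positive square root
```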
Skew or skewness is a measure of asymmetry in a random variable’s probability distribution. Both PDFs in Exhibit 3.2 have the same mean and standard deviation. The one on the left is positively skewed. The one on the right is negatively skewed.
The skewness of a random variable X is denoted η₁ or skew(X). It is defined as

η₁ = E[(X − μ)³] / σ³.
Kurtosis is another parameter that describes the shape of a random variable's probability distribution. Consider the two PDFs in Exhibit 3.3. Both have a mean and skewness of 0. Which would you say has the greater standard deviation? It is impossible to say. The distribution on the right is more peaked at the center, which might lead us to believe that it has a lower standard deviation. It also has fatter tails, which might lead us to believe that it has a greater standard deviation. If the effect of the peakedness exactly offsets that of the fat tails, the two distributions may have the same standard deviation. The different shapes of the two distributions illustrate kurtosis. The distribution on the right has greater kurtosis than the distribution on the left.
The kurtosis of a random variable X is denoted η₂ or kurt(X). It is defined as

η₂ = E[(X − μ)⁴] / σ⁴.
If a distribution’s kurtosis is greater than 3, it is said to be leptokurtic. If its kurtosis is less than 3, it is said to be platykurtic. Leptokurtosis is associated with distributions that are simultaneously “peaked” and have “fat tails.” Platykurtosis is associated with distributions that are simultaneously less peaked and have thinner tails. In Exhibit 3.3, the distribution on the left is platykurtic. The one on the right is leptokurtic.
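A small numerical check of both definitions, using an assumed example (a single fair coin flip, X ∈ {0, 1}): its symmetric PF gives skewness 0, and its kurtosis works out to 1, so it is platykurtic.

```python
from math import sqrt

# Assumed example: fair coin flip, phi(0) = phi(1) = 1/2.
pf = {0: 0.5, 1: 0.5}
mu = sum(x * p for x, p in pf.items())                        # 0.5
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in pf.items()))   # 0.5
skew = sum((x - mu) ** 3 * p for x, p in pf.items()) / sigma ** 3  # 0: symmetric
kurt = sum((x - mu) ** 4 * p for x, p in pf.items()) / sigma ** 4  # 1 < 3: platykurtic
```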
Consider a random variable X with CDF Φ. A q-quantile of X is any value x such that Pr(X ≤ x) = q. A q-quantile need not exist. If it does exist, it need not be unique. In most value-at-risk applications, all q-quantiles exist and are unique for q ∈ (0,1). In such cases, a q-quantile is a parameter and equals the inverse CDF evaluated at q. For this reason, we denote a q-quantile as Φ⁻¹(q).
The term “percentile” may be used in place of “quantile”. A percentile is a quantile expressed as a percentage. For example, a 95th percentile is a .95 quantile.
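When the CDF is continuous and strictly increasing, the inverse CDF gives the quantile directly. A sketch using the standard normal distribution (an assumed example) via the Python standard library:

```python
from statistics import NormalDist

# Standard normal: every q-quantile exists and is unique for q in (0, 1).
z = NormalDist(mu=0.0, sigma=1.0)
q95 = z.inv_cdf(0.95)  # the .95 quantile, i.e. the 95th percentile, about 1.645
```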
For any positive integer k, the kth moment of a random variable X is defined as

E(Xᵏ).

Its kth central moment is defined as

E[(X − μ)ᵏ],
where μ = E(X). Based upon our earlier definitions, the expectation and variance of a random variable are its first moment and second central moment. Its skewness and kurtosis are scaled third and fourth central moments.
For any n > 0, a random variable’s first n moments convey the same information as its first n central moments—each can be derived from the other. See Exercise 3.15.
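One direction of that equivalence can be sketched by expanding (X − μ)ᵏ with the binomial theorem, so each central moment is a combination of the raw moments. The fair-die moments below are an assumed example.

```python
from math import comb

def central_from_raw(raw):
    """Given raw[k] = E(X^k) for k = 0..n (with raw[0] = 1), return the
    central moments E[(X - mu)^k] via the binomial expansion
    E[(X - mu)^k] = sum over j of C(k, j) * E(X^j) * (-mu)^(k - j)."""
    mu = raw[1]
    return [sum(comb(k, j) * raw[j] * (-mu) ** (k - j) for j in range(k + 1))
            for k in range(len(raw))]

# Assumed example: fair die, raw moments 1, E(X) = 3.5, E(X^2) = 91/6.
raw = [1.0, 3.5, 91 / 6]
central = central_from_raw(raw)  # first central moment 0, second 35/12
```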
We say a random variable X is bounded if there exists a number a such that Pr(|X| > a) = 0. If a random variable is bounded, all its moments exist. If it is unbounded, specific moments may or may not exist. However, if the kth moment of X exists, then all moments of order less than k must also exist.
PDFs for two continuous random variables are illustrated in Exhibit 3.4. Assume probability density is 0 for both distributions outside the graphed regions. Where possible, indicate which random variable has the greater:
- standard deviation, and
- kurtosis.
Consider a discrete random variable Y, which represents the number of "heads" that will be obtained in three flips of a fair coin. It has PF

ϕ(0) = 1/8,  ϕ(1) = 3/8,  ϕ(2) = 3/8,  ϕ(3) = 1/8.
- Calculate the mean of Y.
- Calculate the variance of Y.
- Calculate the standard deviation of Y.
- Calculate the skewness of Y.
- Calculate the kurtosis of Y.
- Calculate a .10 quantile of Y.
- Calculate a .875 quantile of Y.
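A hedged check of the moment calculations (not the book's worked solution): Y is binomial with three trials and success probability 1/2, so its PF is ϕ(k) = C(3, k)/8.

```python
from math import comb, sqrt

# PF of the number of heads in three fair coin flips.
pf = {k: comb(3, k) / 8 for k in range(4)}  # {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

mean = sum(k * p for k, p in pf.items())                 # 1.5
var = sum((k - mean) ** 2 * p for k, p in pf.items())    # 0.75
std = sqrt(var)
skew = sum((k - mean) ** 3 * p for k, p in pf.items()) / std ** 3  # 0: symmetric PF
kurt = sum((k - mean) ** 4 * p for k, p in pf.items()) / std ** 4  # 7/3: platykurtic
```

Note that Pr(Y ≤ 2) = 7/8 = .875, so 2 is a .875 quantile, while the CDF never takes the value .10, illustrating that a q-quantile need not exist for a discrete distribution.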
Consider a continuous random variable Z with PDF
- Calculate the mean of Z.
- Calculate the variance of Z.
- Calculate the standard deviation of Z.
- Calculate the skewness of Z.
- Calculate the kurtosis of Z.
- Calculate a .10-quantile of Z.
- Calculate a .875-quantile of Z.
Consider the random variable W = Z², where Z is defined as in the previous exercise.
- Calculate the mean of W.
- Calculate the variance of W.
- Calculate the standard deviation of W.
True or false: If a continuous random variable X has a symmetric distribution, ϕ(μ + x) = ϕ(μ − x), it must have 0 skewness.
In general, for any random variable X and any constant b, E(bX) = bE(X). Prove this result for the case X is discrete.
In general, for any random variable X and any constant a, E(X + a) = E(X) + a. Prove this result for the case X is continuous.
In general, for any random variable X and any constant b, std(bX) = |b|std(X), where |b| indicates the absolute value of b. Prove this result for the case X is discrete. Use your result from Exercise 3.6.
In general, for any random variable X and any constant a, std(X + a) = std(X). Prove this result for the case X is continuous. Use your result from Exercise 3.7.