4.3.2 Estimators
Represent some observable phenomenon with a random variable X. The distribution of X is known except for the value of some parameter θ. We observe the phenomenon m times, compiling numerical data {x[1], x[2], … , x[m]}, which we treat as a realization of a sample {X[1], X[2], … , X[m]}. We wish to use the data to estimate the parameter θ of the distribution of X.
If we, in some manner, estimate a parameter θ, we obtain a quantity
$h = \theta + e$    [4.1]
where the error e is a realization of some random variable E.
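As a toy illustration of [4.1] (a sketch, not from the text): in a simulation the true θ is known, so the realized error e = h − θ can be computed directly. The distribution, seed, and sample size below are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
theta = 2.0                                   # true parameter (known only in simulation)

x = rng.normal(loc=theta, scale=0.3, size=5)  # five hypothetical observations
h = x.mean()                                  # an estimate of theta
e = h - theta                                 # the realized error, per [4.1]
print(h, e)
```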
Suppose we wish to estimate the mean μ of X. Consider two very different notions:
$h = \frac{1}{m}\sum_{k=1}^{m} x^{[k]}$    [4.2]

$H = \frac{1}{m}\sum_{k=1}^{m} X^{[k]}$    [4.3]
The first is a number. It is an estimate h for μ. The second is a random variable. It is an estimator H for μ. Formally, an estimator is a function of a sample. Estimates are obtained from estimators by substituting a realization {x[1], x[2], … , x[m]} for the sample {X[1], X[2], … , X[m]}. Estimators are random variables. Estimates are realizations of estimators.
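The distinction can be restated in code. In this minimal Python sketch (NumPy assumed), the estimate h is a fixed number computed from observed data, while the estimator H is a function of a sample, returning a different realization for each fresh sample; the data values and the distribution of X are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# An estimate h: a single number computed from observed data, per [4.2].
data = np.array([2.1, 1.8, 2.4, 2.0, 1.9])  # hypothetical data values
h = data.mean()

# An estimator H: a function of the sample, per [4.3]. Each fresh sample
# yields a new realization of the random variable H.
def H(sample):
    return sample.mean()

new_sample = rng.normal(loc=2.0, scale=0.3, size=5)  # assumed distribution of X
print(h)              # a fixed number: 2.04
print(H(new_sample))  # varies from sample to sample
```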
Because estimators are random variables, they have probability distributions. We prefer that an estimator have a mean equal to the parameter being estimated and a standard deviation as small as possible. This leads to the notions of bias and standard error, which we describe shortly. First, let’s consider a category of estimators.
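A short simulation sketch, assuming X is normal with μ = 2.0 and σ = 0.3 (parameters chosen only for illustration), shows what the estimator's distribution looks like: the mean of many realizations of the sample mean lands near μ, and their standard deviation, the standard error, is near σ/√m.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
mu, sigma, m = 2.0, 0.3, 5   # assumed parameters for illustration

# Simulate many realizations of the estimator H = mean of a sample of size m.
realizations = rng.normal(mu, sigma, size=(100_000, m)).mean(axis=1)

# The sample mean is unbiased: the estimator's mean is close to mu, and its
# standard deviation (the standard error) is close to sigma / sqrt(m).
print(realizations.mean())       # approx 2.0
print(realizations.std(ddof=1))  # approx 0.3 / sqrt(5) = 0.134
```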
4.3.3 Sample Estimators
Data is often summarized with summary statistics, such as the sample mean. Summary statistics can be used as estimators, in which case they are called sample estimators. Sample estimators for a mean or variance are
$H = \frac{1}{m}\sum_{k=1}^{m} X^{[k]}$    [4.4]

$H = \frac{1}{m-1}\sum_{k=1}^{m}\left(X^{[k]} - \bar{X}\right)^2, \qquad \bar{X} = \frac{1}{m}\sum_{k=1}^{m} X^{[k]}$    [4.5]
We have already used the sample mean as an estimator in [4.3]. Sample estimators for skewness, kurtosis, quantiles, and other parameters are defined similarly.
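As a sketch, the sample estimators [4.4] and [4.5] translate directly into Python (NumPy assumed; the data values are hypothetical).

```python
import numpy as np

def sample_mean(x):
    # Sample estimator for the mean, per [4.4].
    return x.sum() / len(x)

def sample_variance(x):
    # Sample estimator for the variance, per [4.5]; the m - 1 divisor
    # makes the estimator unbiased.
    m = len(x)
    xbar = sample_mean(x)
    return ((x - xbar) ** 2).sum() / (m - 1)

x = np.array([2.1, 1.8, 2.4, 2.0, 1.9])  # hypothetical data
print(sample_mean(x))      # 2.04
print(sample_variance(x))  # 0.053; agrees with np.var(x, ddof=1)
```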