Mathematics

Estimator Bias

Estimator bias is the systematic tendency of an estimator to overestimate or underestimate the true value of a parameter; formally, it is the difference between the estimator's expected value and the parameter being estimated. Bias can arise from non-representative sampling, measurement error, or the use of an inappropriate model. Because it can lead to misleading conclusions, bias should be quantified and, where practical, minimized or corrected for in statistical analyses.
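As a rough, self-contained illustration of the idea (not taken from any of the excerpts below), the short Python sketch that follows simulates many small samples and compares two estimators of a population variance: dividing by n, which is biased, and dividing by n − 1, which is not. The distribution, sample size, and number of repetitions are arbitrary choices made only for the demonstration.

    # Monte Carlo sketch of estimator bias (illustrative assumptions throughout).
    import numpy as np

    rng = np.random.default_rng(0)
    true_var = 4.0            # variance of the simulated population, N(0, 2^2)
    n, reps = 10, 100_000     # small samples make the bias easy to see

    samples = rng.normal(loc=0.0, scale=2.0, size=(reps, n))
    var_biased   = samples.var(axis=1, ddof=0)   # divides by n
    var_unbiased = samples.var(axis=1, ddof=1)   # divides by n - 1

    # Bias = E[estimator] - true parameter, approximated by averaging over repetitions.
    print("divide by n    :", var_biased.mean(),   "bias ~", var_biased.mean()   - true_var)
    print("divide by n - 1:", var_unbiased.mean(), "bias ~", var_unbiased.mean() - true_var)

With these settings the divide-by-n version comes out low by roughly σ²/n = 0.4 on average, while the divide-by-(n − 1) version is centred on the true value, which matches the "divide by n − 1" remarks in the excerpts below.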

Written by Perlego with AI-assistance

5 Key excerpts on "Estimator Bias"

  • Understanding Advanced Statistical Methods
    Now, back to the story. You want your estimate of the parameter to be close to the actual value of the parameter. In other words, using the new vocabulary, you want your estimate to be close to the estimand. Because your estimate is the result of a random process (model produces data), it can sometimes be close to the estimand, and sometimes far away. Thus, you need to understand how the random estimator behaves in relation to the fixed estimand in order to understand whether you can trust the estimate. The behavior of the random estimator is the topic of this chapter; the chapter topics unbiasedness, consistency, and efficiency all refer to the randomness of the estimator. They all sound good, right? Unbiasedness sounds great—who wants to be biased? Ditto with consistency and efficiency—who wants to be inconsistent or inefficient? Many statistical formulas embedded into statistical software are mysterious-looking because they contain bias corrections; thus, the concept of bias is fundamental to understanding statistical theory and practice. Ironically, for all the effort at producing bias-corrected estimators, it turns out that unbiasedness is relatively unimportant compared to consistency and efficiency.
    11.2 Biased and Unbiased Estimators
    Terms with a hat (ˆ) on them will sometimes be random in this chapter; if so they are called estimators. As random variables, it makes sense to talk about their distributions—the distribution of possible values that the estimator might take on—as well as the expected values, or points of balance, of such distributions. Once an estimator becomes an estimate, there is no distribution, since there is only one number (like 76.1) and hence no expected value of interest either. (The expected value of 76.1 is, well, 76.1. It's not interesting.) This notion of randomness of an estimator is essential to understanding the concept of unbiasedness.
  • Inference Principles for Biostatisticians
    When E(Θ̂) = θ we say that Θ̂ is an unbiased estimator of θ. If an estimator is unbiased then it has no systematic tendency to under- or over-estimate the unknown parameter value. If it is not unbiased then we refer to it as a biased estimator. The quantity Bias(Θ̂) = E(Θ̂) − θ is referred to as the bias of the estimator. Thus, unbiased estimators have a bias of zero. While unbiasedness is clearly a desirable property of estimators, in practice we often tolerate estimators that have a small bias, particularly if they have other desirable properties. The property of unbiasedness is concerned with the average value of the estimator. A second property of interest is the variability of the estimator, which can be quantified by its variance. In fact, for reasons that will become apparent when we talk about confidence intervals later in the chapter, we often focus on the square root of an estimator's variance, called the standard error SE(Θ̂) = √Var(Θ̂). A desirable property of an estimator is that it has a "small" standard error, as this indicates that the estimator does not vary very much. In particular, if a small standard error is combined with unbiasedness then the estimator will tend to take values that are close to θ. If we had to make a choice between two competing unbiased estimators in order to calculate an estimate of θ, then we would choose the one with the smallest variance, or equivalently the smallest standard error. This suggests a numerical way of comparing different unbiased estimators. Given two different unbiased estimators of θ, Θ̂₁ and Θ̂₂, the efficiency of Θ̂₁ relative to Θ̂₂ is Eff(Θ̂₁, Θ̂₂) = Var(Θ̂₂) / Var(Θ̂₁). Thus, if Eff(Θ̂₁, Θ̂₂) > 1 then Θ̂₁ would be preferred to Θ̂₂. An important result in theoretical statistics tells us that there is a limit to how well an unbiased estimator can perform in terms of variability.
  • Probability and Statistics with R
    • Maria Dolores Ugarte, Ana F. Militino, Alan T. Arnholt (Authors)
    • 2015 (Publication Date)
    7.2.1 Mean Square Error
    The desirability of an estimator is related to how close its estimates are to the true parameter. The difference between an estimator T for an unknown parameter θ and the parameter θ itself is called the error. Since this quantity can be either positive or negative, it is common to square the error so that various estimators T₁, T₂, …, can be compared using a non-negative measure of error. To that end, the mean square error of an estimator, denoted MSE[T], is defined as MSE[T] = E[(T − θ)²]. Estimators with small MSEs will have a distribution such that the values in the distribution will be close to the true parameter. In fact, the MSE consists of two non-negative components, the variance of the estimator T, defined as Var[T] = E[(T − E[T])²], and the squared bias of the estimator T, where bias is defined as E[T] − θ, since
    MSE[T] = E[(T − E[T] + E[T] − θ)²]
           = E[(T − E[T])²] + E[(E[T] − θ)²] + 2E[(T − E[T])(E[T] − θ)]
           = Var[T] + (E[T] − θ)² + 2(E[T] − E[T])(E[T] − θ)
           = Var[T] + (E[T] − θ)²
           = Var[T] + (Bias[T])².   (7.1)
    The concepts of variance and bias are illustrated in Figure 7.1, which depicts the shot patterns for four marksmen on their respective targets. When the marksman's weapon is properly sighted, the center of the target represents θ.
    [Figure 7.1: Visual representations of variance and bias; four panels: Low Variance/Low Bias, Low Variance/High Bias, High Variance/Low Bias, High Variance/High Bias]
    It seems logical to think that the most desirable estimators are those that minimize the MSE. However, estimators that minimize the MSE for all possible values of θ do not always exist. In other words, an estimator may have the minimum MSE for some values of θ and not others.
  • Subset Selection in Regression
    For the relatively small number (usually only two) of alternatives considered, explicit expressions can be derived for the biases in the regression coefficients, and hence the effect of these biases on prediction errors can be derived. There is almost no consideration of alternative estimates of regression coefficients other than ridge regression and the James-Stein/Sclove estimators. Useful references on this topic are the surveys by Bancroft and Han (1977) and Wallace (1977), and the book by Judge and Bock (1978). Though much of this chapter is concerned with understanding and trying to reduce bias, it should not be construed that bias elimination is a necessary or even a desirable objective. We have already discussed some biased estimators such as ridge estimators. Biased estimators are a standard part of the statistician's toolkit. How many readers use unbiased estimates of standard deviations? (N.B. The usual estimator s = {∑(x − x̄)² / (n − 1)}^(1/2) is biased, though most of the bias can be removed by using (n − 1.5) instead of the (n − 1)). That (n − 1) is used to give an unbiased estimate of the variance. It is important that we are aware of biases and have some idea of their magnitude. This is particularly true in the case of the selection of subsets of regression variables when the biases can be substantial when there are many subsets that are close competitors for selection.
    6.2 Choice between two variables
    To illustrate the bias resulting from selection, let us consider a simple example in which only two predictor variables, X₁ and X₂, are available and it has been decided a priori to select only one of them, the one that gives the smaller residual sum of squares when fitted to a set of data. For this case, it is feasible to derive mathematically the properties of the least-squares (or other) estimate of the regression coefficient for the selected variable, the residual sum
  • Applied Statistics for Business and Economics
    • Robert M. Leekley (Author)
    • 2010 (Publication Date)
    • CRC Press (Publisher)
    7.1.1 Qualities of a Good Point Estimator
    There are a number of criteria for evaluating estimators. We will consider just two: unbiasedness and efficiency.
    7.1.1.1 Unbiasedness
    A statistic is an unbiased estimator of a parameter if its expected value equals that parameter. There is, then, no systematic tendency for the statistic to be too high or low. Recall from Chapter 6 that the sampling distribution of p_X centered on π_X, and the sampling distribution of X̄ centered on μ_X. The average, or expected value, of p_X is π_X; the average, or expected value, of X̄ is μ_X. Both p_X and X̄ are unbiased estimators of their respective parameters. In some cases, though, we need to calculate a sample statistic in a non-intuitive way in order for it to be unbiased. Recall that when we calculate the sample standard deviation, s_X, we divide by n − 1, instead of the more intuitively-appealing n, to avoid it being too small on average. Dividing by n would have made s_X a biased estimator of σ_X; dividing by n − 1 makes it unbiased.
    7.1.1.2 Efficiency
    There is often more than one unbiased estimator of a parameter. When there is, the best is the one that is most efficient. Consider perhaps a silly example. We could estimate μ_X by taking just the first case in a sample, X₁, and ignoring all the rest. Since there is no tendency for the first case to be too high or low, X₁ would be, like X̄, an unbiased estimator of μ_X. But, clearly, we would not be using the sample information available to us efficiently. The X₁ would have a larger standard error than X̄. Its sampling distribution would be wider. The chance would be greater of it giving us a misleading estimate of μ_X. We want the statistic with the smallest standard error, hence the narrowest sampling distribution. This reduces, as much as possible, the chance of getting a sample value that is a misleading estimate of the parameter.
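The decomposition MSE[T] = Var[T] + (Bias[T])² quoted above from Probability and Statistics with R (Equation 7.1) can be checked numerically. A minimal Python sketch, assuming NumPy and using the divide-by-n variance estimator as a deliberately biased T; the estimator and simulation settings are illustrative choices, not taken from the book:

    # Numerical check of MSE[T] = Var[T] + (Bias[T])^2 for a biased estimator T.
    import numpy as np

    rng = np.random.default_rng(1)
    theta = 4.0                        # the estimand: true variance of N(0, 2^2)
    n, reps = 10, 200_000

    samples = rng.normal(0.0, 2.0, size=(reps, n))
    T = samples.var(axis=1, ddof=0)    # biased variance estimator (divides by n)

    mse  = np.mean((T - theta) ** 2)
    var  = T.var()                     # spread of the estimator around its own mean
    bias = T.mean() - theta            # systematic offset from the estimand

    print("MSE          :", mse)
    # The two numbers agree up to floating-point error, by the same algebra as (7.1).
    print("Var + Bias^2 :", var + bias ** 2)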
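The efficiency comparisons in the Inference Principles for Biostatisticians and Applied Statistics for Business and Economics excerpts can be simulated in the same way. The sketch below, again with arbitrary illustrative settings, compares the sample mean X̄ with the single observation X₁ as estimators of μ: both are unbiased, but X̄ has the smaller standard error, and the relative efficiency Var(X₁)/Var(X̄) comes out close to n.

    # Two unbiased estimators of the mean: the sample mean vs. the first observation.
    import numpy as np

    rng = np.random.default_rng(2)
    mu, sigma, n, reps = 5.0, 3.0, 25, 100_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    xbar = samples.mean(axis=1)    # uses all n observations
    x1   = samples[:, 0]           # ignores all but the first observation

    print("mean of Xbar:", xbar.mean(), "  mean of X1:", x1.mean())  # both approx. mu
    print("SE(Xbar):", xbar.std(), "  SE(X1):", x1.std())
    print("Eff(Xbar, X1) = Var(X1)/Var(Xbar) approx.", x1.var() / xbar.var())  # about n = 25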
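Finally, the selection bias discussed in the Subset Selection in Regression excerpt can be seen in a toy simulation. The sketch below is not the book's derivation: it generates two independent standard-normal predictors with equal true coefficients, fits y on each one alone, keeps whichever gives the smaller residual sum of squares, and records the fitted coefficient of the selected variable. Averaged over many repetitions, that coefficient sits noticeably above the true value, even though each individual single-variable fit is unbiased.

    # Toy illustration of selection bias: pick the single predictor with the smaller RSS
    # and look at the average fitted coefficient of the *selected* variable.
    import numpy as np

    rng = np.random.default_rng(3)
    n, reps, beta = 30, 20_000, 0.5    # equal true coefficients for both predictors
    selected = np.empty(reps)

    for i in range(reps):
        x1 = rng.normal(size=n)
        x2 = rng.normal(size=n)
        y = beta * x1 + beta * x2 + rng.normal(size=n)

        b1 = (x1 @ y) / (x1 @ x1)      # least-squares slope of y on x1 alone (no intercept)
        b2 = (x2 @ y) / (x2 @ x2)
        rss1 = ((y - b1 * x1) ** 2).sum()
        rss2 = ((y - b2 * x2) ** 2).sum()
        selected[i] = b1 if rss1 < rss2 else b2

    print("true coefficient                :", beta)
    print("mean coefficient after selection:", selected.mean())   # biased upward

The gap between the two printed numbers is the selection bias: no individual fit is biased, only the act of keeping the better-looking one.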
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.