Mathematics

Variance for Binomial Distribution

The variance for a binomial distribution measures the spread or dispersion of the distribution. It quantifies how much the values in the distribution deviate from the mean. In the context of a binomial distribution, the variance is calculated using the formula np(1-p), where n is the number of trials and p is the probability of success on each trial.
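As a quick illustration (a minimal sketch, not drawn from the excerpts below; the parameters n = 20 and p = 0.3 are arbitrary), the following Python snippet compares the closed-form value np(1-p) with a Monte Carlo estimate obtained by summing Bernoulli trials:

```python
# A minimal sketch: the closed-form binomial variance n*p*(1-p),
# checked against a Monte Carlo estimate (stdlib only).
import random
import statistics

n, p = 20, 0.3          # illustrative parameters

closed_form = n * p * (1 - p)

# Simulate many binomial draws by summing n Bernoulli(p) trials.
draws = [sum(random.random() < p for _ in range(n)) for _ in range(100_000)]
simulated = statistics.pvariance(draws)

print(f"n*p*(1-p) = {closed_form:.3f}, simulated ≈ {simulated:.3f}")
```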

Written by Perlego with AI-assistance

9 Key excerpts on "Variance for Binomial Distribution"

  • Book cover image for: Introduction to Mean and its Applications in Mathematics & Statistics
    Variance In probability theory and statistics, the variance is used as one of several descriptors of a distribution. It describes how far values lie from the mean. In particular, the variance is one of the moments of a distribution. In that context, it forms part of a systematic approach to distinguishing between probability distributions. While other such approaches have been developed, those based on moments are advantageous in terms of mathematical and computational simplicity. The variance is a parameter describing a theoretical probability distribution, while a sample of data from such a distribution can be used to construct an estimate of this variance: in the simplest cases this estimate can be the sample variance.
    Background The variance of a random variable or distribution is the expectation, or mean, of the squared deviation of that variable from its expected value or mean. Thus the variance is a measure of the amount of variation within the values of that variable, taking account of all possible values and their probabilities or weightings (not just the extremes which give the range). For example, a perfect die, when thrown, has expected value of (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5, expected absolute deviation [the mean of the equally likely absolute deviations 3.5 − 1, 3.5 − 2, 3.5 − 3, 4 − 3.5, 5 − 3.5, 6 − 3.5] of (2.5 + 1.5 + 0.5 + 0.5 + 1.5 + 2.5) / 6 = 1.5, but expected squared deviation or variance [the mean of the equally likely squared deviations] of (2.5² + 1.5² + 0.5² + 0.5² + 1.5² + 2.5²) / 6 = 17.5/6 ≈ 2.9. As another example, if a coin is tossed twice, the number of heads is: 0 with probability 0.25, 1 with probability 0.5 and 2 with probability 0.25. Thus the variance is 0.25 × (0 − 1)² + 0.5 × (1 − 1)² + 0.25 × (2 − 1)² = 0.25 + 0 + 0.25 = 0.5.
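    The two worked examples above are easy to verify in code. The sketch below (the helper name is my own) computes a variance as the probability-weighted mean of squared deviations, reproducing 17.5/6 ≈ 2.9167 for the die and 0.5 for the two-coin toss:

    ```python
    # Variance as the probability-weighted mean of squared deviations,
    # reproducing the die and two-coin examples quoted above.
    def variance(dist):
        """dist: list of (value, probability) pairs summing to 1."""
        mean = sum(x * p for x, p in dist)
        return sum(p * (x - mean) ** 2 for x, p in dist)

    die = [(x, 1 / 6) for x in range(1, 7)]
    heads_in_two_tosses = [(0, 0.25), (1, 0.5), (2, 0.25)]

    print(variance(die))                  # 17.5/6 ≈ 2.9167
    print(variance(heads_in_two_tosses))  # 0.5
    ```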
  • Book cover image for: Statistics Unplugged
    Let’s consider a fairly simple distribution (see Table 2-11) and have a look at the calculation of the variance both mathematically and conceptually. Here’s the step-by-step approach that we’ll use: 1. Calculate each deviation and square it. Remember that you’re squaring the deviations because the sum of the deviations would equal 0 if you didn’t. 2. Sum all the squared deviations. 3. Divide the sum of the squared deviations by the number of cases. Applying this approach to the scores shown in Table 2-11, you can move through the process step by step: the sum of the squared deviations equals 10; N = 5 (treating the 5 scores as a population); 10/5 = 2; variance = 2. I assure you the same approach will work whether your distribution has small values (for example, from 1 to 10) or much larger values (for example, a distribution of incomes in the thousands of dollars). The example in Table 2-12 illustrates that the same approach works just the same when you’re dealing with larger values. To develop a solid understanding of what the variance tells us, consider the four distributions shown in Table 2-13. In the top two distributions, the variances are the same, but the means are very different. In the bottom two distributions, the means are equal, but the variances are very different. By now you should be developing some appreciation for the concept of variance, particularly in terms of how it can be used to compare one distribution to another.
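    The three-step recipe translates directly into code. The excerpt quotes only the totals from Table 2-11 (sum of squared deviations 10, N = 5, variance 2), so the sketch below uses stand-in scores chosen to give those same totals:

    ```python
    # A stand-in for Table 2-11 (the raw scores are not reproduced in the
    # excerpt): five hypothetical scores whose squared deviations sum to 10.
    scores = [1, 2, 3, 4, 5]             # N = 5

    mean = sum(scores) / len(scores)

    # Step 1: compute each deviation from the mean and square it.
    squared_devs = [(x - mean) ** 2 for x in scores]

    # Step 2: sum all the squared deviations.
    total = sum(squared_devs)            # 10.0

    # Step 3: divide by the number of cases (population variance).
    variance = total / len(scores)       # 2.0
    print(total, variance)
    ```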
  • Book cover image for: Introduction to Statistical Data Analysis for the Life Sciences
    The mean is a weighted average of the possible values (zero and one) with weights equal to the corresponding probabilities. We look at all possible outcomes and the corresponding probabilities of observing them. We can write a binomial variable, Y ∼ bin(n, p), as a sum of n independent Bernoulli variables, since this corresponds to the outcome of n independent trials that all have the same probability of success, p. Thus Y = Z₁ + Z₂ + ⋯ + Zₙ, where Zᵢ = 1 if trial i was a “success” and Zᵢ = 0 if it was a “failure”. The Zᵢ’s are independent and all have the same mean p, so the expected value of Y will be the sum of the means of each Z. The expected value of Y is therefore p + p + ⋯ + p = np. If we follow the same approach, we can calculate the variance of a binomial distribution by starting with the variance of a Bernoulli variable, Z. The variance of a variable is defined as the average squared deviation from its mean, so we can calculate it as a weighted average of the squared deviations with weights given as the probability of each observation: Var(Z) = (1 − p)² · P(Z = 1) + (0 − p)² · P(Z = 0) = (1 − p)²p + p²(1 − p) = p(1 − p). The variance of a sum of independent variables equals the sum of the individual variances, so Var(Y) = Var(Z₁ + ⋯ + Zₙ) = Var(Z₁) + ⋯ + Var(Zₙ) = np(1 − p), (11.2) and since the standard deviation is the square root of the variance, we get that sd(Y) = √(np(1 − p)). It is worth noting that the success parameter for the binomial distribution, p, together with n, defines both the mean and the variance, and we can see that once we have the mean we can directly calculate the variance and standard deviation. Notice that the calculations for Z are completely identical to the calculations we saw in Example 4.8 on p. 88. Figure 11.3 shows the variance of a Bernoulli variable as a function of the parameter p.
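    Both steps of this derivation can be checked numerically. A small sketch (the values of n and p are illustrative, not from the text) verifies that (1 − p)²p + p²(1 − p) reduces to p(1 − p), and that summing n such Bernoulli variances gives np(1 − p):

    ```python
    # Checks the two steps of the derivation above for a grid of p values.
    n = 10                               # illustrative trial count
    for p in [0.1, 0.25, 0.5, 0.9]:
        # Variance of one Bernoulli(p) trial from the definition.
        bernoulli_var = (1 - p) ** 2 * p + (0 - p) ** 2 * (1 - p)
        assert abs(bernoulli_var - p * (1 - p)) < 1e-12

        # Summing n identical, independent Bernoulli variances.
        assert abs(n * bernoulli_var - n * p * (1 - p)) < 1e-12

    print("Var(Z) = p(1-p) and Var(Y) = np(1-p) check out.")
    ```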
  • Book cover image for: Finite Mathematics
    Similarly, there is a simple formula for the variance and standard deviation. Variance and Standard Deviation of a Binomial Random Variable: If X is a binomial random variable associated with n independent Bernoulli trials, each with probability p of success, then the variance and standard deviation of X are given by σ² = npq and σ = √(npq), where q = 1 − p is the probability of failure.
    EXAMPLE 5 (Internet Commerce). You have calculated that there is a 40% chance that a hit on your web page results in a fee paid to your company, CyberPromo, Inc. Your web page receives 25 hits per day. Let X be the number of hits that result in payment of the fee (“successful hits”). a. What are the expected value and standard deviation of X? b. Complete the following: On approximately 95 out of 100 days, I will get between ___ and ___ successful hits. Solution: a. The random variable X is binomial with n = 25 and p = .4. To compute μ and σ, we use the formulas μ = np = (25)(.4) = 10 successful hits and σ = √(npq) = √((25)(.4)(.6)) ≈ 2.45 hits. b. Because np = 10 ≥ 10 and nq = (25)(.6) = 15 ≥ 10, we can use the empirical rule, which tells us that there is an approximately 95% probability that the number of successful hits is within two standard deviations of the mean, that is, in the interval [μ − 2σ, μ + 2σ] = [10 − 2(2.45), 10 + 2(2.45)] = [5.1, 14.9]. Thus, on approximately 95 out of 100 days, I will get between 5.1 and 14.9 successful hits. For values of p near 1/2 and large values of n, a binomial distribution is bell shaped and (nearly) symmetric; hence, the empirical rule applies. One rule of thumb is that we can use the empirical rule when both np ≥ 10 and nq ≥ 10. Remember that the empirical rule gives only an estimate of probabilities.
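    The arithmetic in Example 5 is easy to reproduce, and one can also compute the exact probability that X falls in the empirical-rule interval (this cross-check is an addition of mine, not part of the book's example):

    ```python
    # Example 5's numbers computed directly, plus the exact probability
    # that X lands in the empirical-rule interval.
    from math import comb, sqrt

    n, p = 25, 0.4
    q = 1 - p

    mu = n * p                                # 10.0
    sigma = sqrt(n * p * q)                   # ≈ 2.449

    lo, hi = mu - 2 * sigma, mu + 2 * sigma   # ≈ [5.1, 14.9]

    def pmf(k):
        # Binomial probability of exactly k successes.
        return comb(n, k) * p**k * q**(n - k)

    exact = sum(pmf(k) for k in range(n + 1) if lo <= k <= hi)
    print(f"mu={mu}, sigma={sigma:.2f}, interval=[{lo:.1f}, {hi:.1f}], "
          f"P(X in interval)={exact:.3f}")   # near the rule's 95%
    ```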
  • Book cover image for: Statistics for the Behavioural Sciences
    eBook - ePub

    Statistics for the Behavioural Sciences

    An Introduction to Frequentist and Bayesian Approaches

    • Riccardo Russo (Author)
    • 2020 (Publication Date)
    • Routledge (Publisher)
    The binomial distribution is positively skewed when p < 0.50 and negatively skewed when p > 0.50. Figure 5.6 shows three binomial distributions where n = 12 and p is either 0.50 (symmetrical), 0.25 (positively skewed), or 0.75 (negatively skewed).
    Figure 5.6 Examples of binomial distributions with n = 12 and either p = 0.50 (symmetrical), or p = 0.25 (positively skewed), or p = 0.75 (negatively skewed). The scale on the vertical axis reports probabilities.

    5.8 Mean and variance of the binomial distribution

    It is possible to obtain, for any binomial distribution, the mean number of successes, their variance and standard deviation, using the following formulae:
    μ = E(X) = n × p
    σ² = VAR(X) = n × p × (1 − p)
    σ = √(n × p × (1 − p))
    where X is a random variable with a binomial distribution, n is the total number of Bernoulli trials, and p is the probability of occurrence of a success on each trial.
    If you are curious to know how the above formulae are obtained, then read the following argument. First, let Sn be the number of successes out of n Bernoulli trials with probability p of success on each trial. Remember that the plotting of the probability of occurrence of r successes out of n Bernoulli trials (i.e., 0, 1, 2,…, n) is a plot of a binomial distribution. Thus, if we call each of the n Bernoulli trials the random variables X1, X2,…, Xn, respectively, and, for every jth trial Xj = 1 if the trial is a success, and 0 if it is a failure, then Sn = X1 + X2 + ⋯ + Xn is the number of successes out of n Bernoulli trials. Since we know, from the section on the Bernoulli distribution, that the expected value of each Bernoulli random variable Xj is E(Xj) = 1 × p + 0 × (1 − p) = p, and that
    E(X1 + X2 + ⋯ + Xn) = E(X1) + E(X2) + ⋯ + E(Xn),
    it follows that the expected number of successes in a binomial distribution is given by:
    E(Sn) = E(X1) + E(X2) + ⋯ + E(Xn) = n × p.
    Similarly, the variance of the sum of n independent Bernoulli random variables X1, X2,…, Xn is given by:
    VAR(X1 + X2 + ⋯ + Xn) = VAR(X1) + VAR(X2) + ⋯ + VAR(Xn) = n × p × (1 − p), since VAR(Xj) = p × (1 − p) for each Bernoulli trial.
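    These formulae can also be confirmed by brute force: evaluate the binomial probability of every possible number of successes and take the probability-weighted mean and squared deviation. The sketch below uses n = 12 and p = 0.25, matching one of the distributions in Figure 5.6:

    ```python
    # E(X) and VAR(X) by direct summation over the binomial probability
    # mass function, compared with n*p and n*p*(1-p).
    from math import comb

    n, p = 12, 0.25                      # as in Figure 5.6

    probs = {r: comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)}

    mean = sum(r * pr for r, pr in probs.items())
    var = sum((r - mean) ** 2 * pr for r, pr in probs.items())

    print(mean, n * p)                   # both ≈ 3.0
    print(var, n * p * (1 - p))          # both ≈ 2.25
    ```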
  • Book cover image for: RIGHTS REVERTED - Reasoning About Luck

    RIGHTS REVERTED - Reasoning About Luck

    Probability and Its Uses in Physics

    This has been a heavy dose of algebra, but something has been achieved. We have associated two descriptive numbers with a probability distribution: the mean, μ, which tells where the weight of the distribution is centered; and the standard deviation, σ, which is a quantitative measure of deviations from the mean. We found that for N independent trials of a random experiment the mean is N × (the single trial mean), and the standard deviation is √N × (the standard deviation for a single trial). (I am stating generally a result that we proved in all detail in a special case, because the proofs contain the seeds of a general argument.)
    Our analysis has explained some, but by no means all, of the observations we made by looking at drawings of the binomial distribution, and it has raised at least one new question. Perhaps the most intriguing observation, that a plain and featureless distribution could give birth to a series of elegant bell-shaped curves, has yet to be understood. And although we have shown that a natural measure of the width of the distribution (the standard deviation) scales with the square root of the number of trials, we are left with the question of how the standard deviation is related to the width defined as the range of outcomes containing 99% of the probability. Evidently, the two are not identical. For N tosses of a fair coin we found the 99% range to be given approximately by 1.3√N on either side of the mean, whereas the standard deviation for this case, √(N × ½ × ½), is √N/2. If our observation has some generality to it, it would seem to be saying that 99% of the probability is contained in about 2.6 standard deviations on either side of the mean. To test if this is roughly true, we can try it out on Table 3.2 in which we listed the 99% probable range for the number of sixes on N rolls of a fair die. The hypothesis would predict the range to be 2.6 × √(N × (1/6) × (5/6)) on either side of the mean. [Do you agree?] This formula gives the numbers 12.3, 17.3, 24.5, and 34.6, to compare with those in the last column of the table, i.e. 12, 17, 24, 34. Not bad! In fact, there is
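    The 2.6-standard-deviation rule of thumb is easy to probe by simulation. The sketch below (N = 640 and the number of experiments are illustrative choices, not taken from Table 3.2) estimates the probability that the number of sixes in N rolls of a fair die falls within 2.6σ of the mean:

    ```python
    # Estimate P(|sixes - mu| <= 2.6*sigma) for N rolls of a fair die.
    import random
    from math import sqrt

    N, p, experiments = 640, 1 / 6, 5_000
    mu, sigma = N * p, sqrt(N * p * (1 - p))

    def count_sixes():
        # Roll a fair die N times and count the sixes.
        return sum(random.randrange(6) == 0 for _ in range(N))

    inside = sum(abs(count_sixes() - mu) <= 2.6 * sigma
                 for _ in range(experiments))
    print(inside / experiments)          # should come out near 0.99
    ```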
  • Book cover image for: Applied Mathematics and Modeling for Chemical Engineers
    • Richard G. Rice, Duong D. Do, James E. Maneval (Authors)
    • 2023 (Publication Date)
    • Wiley (Publisher)
    As with the zeroth absolute moment, m₀ = 1 for all distributions. However, the first central moment is also zero, because m₁ = ∑ᵢ (xᵢ − μ)pᵢ = ∑ᵢ xᵢpᵢ − μ ∑ᵢ pᵢ = μ − μ = 0 for discrete distributions, and m₁ = ∫ (x − μ)f(x)dx = ∫ x f(x)dx − μ ∫ f(x)dx = μ − μ = 0 for continuous distributions (sums run over i = 0, …, n and integrals over −∞ < x < ∞). Special importance is attached to the second central moment, which is given the symbol σ² and is called the variance of the distribution: σ² ≡ m₂ = ∑ᵢ (xᵢ − μ)²pᵢ = ∫ (x − μ)²f(x)dx. (7.17) The variance measures the magnitude of the deviations from the mean value of the distribution and so provides a measure of the spread or width of the distribution. The variance is also called the mean square error, while the square root of the variance, σ, is called the standard deviation. The units of the standard deviation are the same as the values of x. The variance of the binomial distribution is σ² = ∑ᵢ (i − np)² C(n, i) pⁱ(1 − p)ⁿ⁻ⁱ = np(1 − p), where C(n, i) is the binomial coefficient, (7.18) while the variance for the normal distribution is σ² = (1/(b√(2π))) ∫ (x − a)² exp[−(x − a)²/(2b²)] dx = b². (7.19) The results cited in Eqs. 7.14 and 7.15 as well as Eqs. 7.18 and 7.19 demonstrate two important points about the parameters and statistics of the distributions of random variables. First, the statistics of a distribution are constant numerical values computed by summing or integrating the distribution function. They are not “random variables.” Second, once the parameters of a distribution are set, the distribution statistics will be expressed in terms of those parameters. In the examples here, both parameters of the binomial distribution (n and p) determine the mean and variance, while for the normal distribution the parameters connect to the statistics in a one-to-one fashion (μ = a and σ = b).
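    The moment definitions above can be evaluated numerically for a binomial distribution. The sketch below (parameters n = 30, p = 0.2 chosen purely for illustration) confirms m₀ = 1, m₁ = 0, and m₂ = np(1 − p) as in Eq. 7.18:

    ```python
    # Zeroth, first, and second central moments of a binomial distribution
    # by direct summation, as in Eqs. 7.17-7.18.
    from math import comb

    n, p = 30, 0.2                        # illustrative parameters
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

    mu = sum(i * pmf[i] for i in range(n + 1))

    m0 = sum(pmf)                                           # 1 for any distribution
    m1 = sum((i - mu) * pmf[i] for i in range(n + 1))       # 0 for any distribution
    m2 = sum((i - mu) ** 2 * pmf[i] for i in range(n + 1))  # the variance

    print(m0, m1, m2, n * p * (1 - p))    # ≈ 1.0, ≈ 0.0, ≈ 4.8, 4.8
    ```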
  • Book cover image for: Statistical Methods in Biology
    eBook - PDF

    Statistical Methods in Biology

    Design and Analysis of Experiments and Regression

    • S.J. Welham, S.A. Gezan, S.J. Clark, A. Mead (Authors)
    • 2014 (Publication Date)
    For example, suppose we wish to find the median, i.e. quantile q = 0.5. In Figure 2.1b, we draw a horizontal line at height 0.5, and find that the smallest valid value (i.e. in the set 0, 1, 2, 3) with cumulative probability greater than this value is 1; hence, 1 is the median value for this distribution. The mean, or expected value, of a discrete random variable Y is calculated as E(Y) = ∑_{y∈S} y P(Y = y). (2.5) This equation is interpreted as ‘the sum, over all possible values of Y (i.e. for y ∈ S), of the values multiplied by their point probabilities’. This is a measure of the location (average or mean value) of the distribution. Similarly, the spread of the distribution is measured by its variance, which can be expressed as Var(Y) = ∑_{y∈S} [y − E(Y)]² P(Y = y). (2.6) This expression (Equation 2.6) writes the variance as the sum, over all the possible values of Y, of the squared deviation of each value from the mean, multiplied by its point probability. We can interpret these quantities as the mean and variance of a population that follows the given probability distribution. Unsurprisingly, the expression for the variance of the random variable in Equation 2.6 has a similar structure to that for the variance of a sample (Equation 2.2) and we explore this connection further below. The expected value (mean) of the Binomial distribution takes the form E(Y) = ∑_{y=0}^{m} y P(y; m, p) = ∑_{y=0}^{m} y [m!/(y!(m − y)!)] p^y (1 − p)^(m−y) = mp, with variance Var(Y) = ∑_{y=0}^{m} (y − mp)² [m!/(y!(m − y)!)] p^y (1 − p)^(m−y) = mp(1 − p). Obtaining the simplified forms requires mathematical manipulations outside the scope of this book (see for example Wackerly et al., 2007). Note that both the mean and the variance are functions of the population parameters m and p.
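    The quantile rule in the first sentence is straightforward to implement. The sketch below applies it to a small binomial distribution (Binomial(3, 0.4) is a stand-in of mine; the excerpt does not say which distribution Figure 2.1b shows), returning the smallest support value whose cumulative probability reaches q:

    ```python
    # The q-quantile of a discrete distribution: the smallest value whose
    # cumulative probability reaches q, here for a Binomial(3, 0.4).
    from math import comb

    m, p = 3, 0.4                        # illustrative parameters
    pmf = [comb(m, y) * p**y * (1 - p)**(m - y) for y in range(m + 1)]

    def quantile(q):
        cum = 0.0
        for y, pr in enumerate(pmf):
            cum += pr
            if cum >= q:
                return y

    print(quantile(0.5))                 # the median; 1 for these parameters
    ```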
  • Book cover image for: Physical Mathematics
    Exponentiating both sides, we get the inequality of arithmetic and geometric means, (1/n) ∑_{i=1}^{n} xᵢ ≥ (∏_{i=1}^{n} xᵢ)^{1/n} (15.42) (Johan Jensen, 1859–1925). The correlation coefficient or covariance of two variables x and y that occur with a joint distribution P(x, y) is C[x, y] ≡ ∫ P(x, y)(x − ⟨x⟩)(y − ⟨y⟩) dx dy = ⟨(x − ⟨x⟩)(y − ⟨y⟩)⟩ = ⟨xy⟩ − ⟨x⟩⟨y⟩. (15.43) The variables x and y are said to be independent if P(x, y) = P(x)P(y). (15.44) Independence implies that the covariance vanishes, but C[x, y] = 0 does not guarantee that x and y are independent (Roe, 2001, p. 9). The variance of x + y, ⟨(x + y)²⟩ − ⟨x + y⟩² = ⟨x²⟩ − ⟨x⟩² + ⟨y²⟩ − ⟨y⟩² + 2(⟨xy⟩ − ⟨x⟩⟨y⟩), (15.45) is the sum V[x + y] = V[x] + V[y] + 2C[x, y]. (15.46) It follows (Exercise 15.6) that for any constants a and b the variance of ax + by is V[ax + by] = a²V[x] + b²V[y] + 2ab C[x, y]. (15.47) More generally (Exercise 15.7), the variance of the sum a₁x₁ + a₂x₂ + ⋯ + a_N x_N is V[a₁x₁ + ⋯ + a_N x_N] = ∑_{j=1}^{N} a_j² V[x_j] + ∑_{j,k=1, j≠k}^{N} a_j a_k C[x_j, x_k].
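    Identity (15.47) can be checked on simulated data. In the sketch below (all names and parameters are illustrative), x and y are built to be correlated, and the variance of ax + by is compared with the right-hand side of the identity; since the identity is purely algebraic, it holds exactly for sample moments too:

    ```python
    # Numerical check of V[ax + by] = a^2 V[x] + b^2 V[y] + 2ab C[x, y]
    # on a deliberately correlated sample.
    import random

    random.seed(1)
    xs = [random.gauss(0, 1) for _ in range(50_000)]
    ys = [x + random.gauss(0, 0.5) for x in xs]      # correlated with xs
    a, b = 2.0, -3.0

    def mean(v):
        return sum(v) / len(v)

    def var(v):
        m = mean(v)
        return mean([(u - m) ** 2 for u in v])

    def cov(u, v):
        mu, mv = mean(u), mean(v)
        return mean([(p - mu) * (q - mv) for p, q in zip(u, v)])

    lhs = var([a * x + b * y for x, y in zip(xs, ys)])
    rhs = a * a * var(xs) + b * b * var(ys) + 2 * a * b * cov(xs, ys)
    print(lhs, rhs)    # agree up to floating-point rounding
    ```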
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.