Hypergeometric Distribution
The hypergeometric distribution is a probability distribution that describes the number of successes in a sequence of draws without replacement from a finite population. It is commonly used in quality control, reliability engineering, and sampling inspection to calculate the probability of obtaining a specific number of defective items in a sample drawn from a larger population.
Written by Perlego with AI-assistance
9 Key excerpts on "Hypergeometric Distribution"
- Rajan Chattamvelli, Ramalingam Shanmugam(Authors)
- 2022(Publication Date)
- Springer(Publisher)
CHAPTER 7 Hypergeometric Distribution
After finishing the chapter, readers will be able to:
• Understand Hypergeometric Distribution.
• Describe Hypergeometric Distribution and its properties.
• Explain Capture-Mark-Recapture model.
• Apply Hypergeometric Distribution in practical situations.
7.1 INTRODUCTION
The classical Hypergeometric Distribution was introduced by the French mathematician Abraham De Moivre¹ in 1711. This was applied to some practical problems by Cournot (1843) [23]. Suppose a researcher is interested in studying a group of N entities (people, places, planets, machines, living organisms, web sites, software programs, data packets, or human-built structures like dams, buildings, bridges, etc.) that either possess an attribute or do not. The attribute may be a condition, property, distinctive nature, characteristic, or feature. It is known from past data that p percent of the entities possess the attribute, so that there are Np entities with the attribute, and the remaining Nq entities that do not have it.² In other words, among the N items, k = Np are of one kind, and the rest (N − k) = Nq are of another kind. We assume that the two kinds are indistinguishable. Practical experiments that result in a Hypergeometric Distribution can be characterized by the following properties:
1. The experiment consists of sampling without replacement from a dichotomous population.
2. The trials can be repeated independently n times under identical conditions.
3. The probability of occurrence of the outcomes p_k varies (increases) from trial to trial until the experiment is over.
4. The random variable X denotes the number of times one of the dichotomous groups is selected.
¹ De Moivre obtained it as a solution to the urn problem proposed by Huygens.
² It is assumed here that p + q = 1, so that Np + Nq = N.
Suppose we sample n items without replacement from the set (called a hypergeometric trial).
- Ken Black, Ignacio Castillo, Amy Goldlist, Timothy Edmunds(Authors)
- 2018(Publication Date)
- Wiley(Publisher)
Relationship between the Hypergeometric and Binomial Distributions
On the surface, the “experimental” situations for the Hypergeometric Distribution and the binomial distribution are very similar: there is a fixed number of trials, each with a fixed probability of success, and we are interested in the probability of achieving a specific number of successes. Judging by the formulas, though, the distributions seem to behave very differently: the binomial distribution involves probabilities being raised to exponential powers, while the Hypergeometric Distribution involves only a collection of counting rules. It turns out, however, that the Hypergeometric Distribution and the binomial distribution are more closely related than appears from their equations. In fact, as the size of the population (compared to the number of trials) in the Hypergeometric Distribution increases, the distribution becomes more and more similar to the binomial distribution. We won’t prove this mathematically (though the interested reader is encouraged to do so), but we will demonstrate by revisiting the series of distributions from Figure 7.5. Recall that in each of these distributions the ratio of the number of available successes (A) to the overall population size (N) is the same: A/N = 4/5 = 0.8. Bearing this in mind, in Figure 7.6 we compare the final Hypergeometric Distribution from our series with the binomial distribution for the same number of trials and π = 0.8 (which we already saw in Figure 7.3). You can see that the distributions are virtually identical; recall that this was not the case when the population was smaller: the first few distributions in Figure 7.5 were significantly different from the final one, even though they all had a 0.8 ratio of successes to population size.
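The convergence described in this excerpt can be checked numerically. The following is a minimal sketch, not the book's own demonstration: the number of trials `n = 5` and the population sizes are assumptions chosen for illustration (Figures 7.5 and 7.6 are not reproduced here), and the function names are hypothetical. Only the ratio A/N = 0.8 comes from the text.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(x successes) when n items are drawn without replacement
    from a population of N items that contains A successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

def binom_pmf(x, n, pi):
    """P(x successes) in n independent trials with success probability pi."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

n = 5      # number of trials (assumed; the excerpt does not state it)
pi = 0.8   # ratio A/N taken from the excerpt

max_gaps = []
for N in (10, 100, 1000, 10000):   # assumed population sizes
    A = N * 4 // 5                 # keeps A/N = 0.8 exactly
    gap = max(abs(hypergeom_pmf(x, N, A, n) - binom_pmf(x, n, pi))
              for x in range(n + 1))
    max_gaps.append(gap)
    print(f"N = {N:>5}: largest PMF difference = {gap:.6f}")
```

Under these assumptions the largest pointwise difference between the two probability mass functions shrinks each time the population grows, which is the behaviour the excerpt describes.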
Reliability Engineering and Risk Analysis
A Practical Guide, Third Edition
- Mohammad Modarres, Mark P. Kaminskiy, Vasiliy Krivtsov(Authors)
- 2016(Publication Date)
- CRC Press(Publisher)
(2.47)
The Hypergeometric Distribution is commonly used in statistical quality control and acceptance–rejection test practices. This distribution approaches the binomial one with parameters p = D/N and n, when the ratio n/N becomes small.
EXAMPLE 2.12
A manufacturer has a stockpile of 286 computer units. It is known that 121 of the units are more reliable than the other units. If a random sample of four computer units is selected without replacement, what is the probability that no units, two units, and all four units are from high-reliability units?
SOLUTION
Use Equation 2.46 with
x = number of high-reliability units in the sample
n = number of units in the selected sample
n − x = number of nonhigh-reliability units in the sample
N = number of units in the stockpile
D = number of high-reliability units in the stockpile
N − D = number of nonhigh-reliability units in the stockpile
Possible values of x are 0 ≤ x ≤ 4. The results of the calculations are given in the following table:
x    Pr(x)
0    $\binom{121}{0}\binom{286-121}{4-0} \big/ \binom{286}{4} = 0.109$
2    $\binom{121}{2}\binom{286-121}{4-2} \big/ \binom{286}{4} = 0.360$
4    0.031
2.3.2.4 Poisson’s Distribution
This model assumes that objects or events of interest are evenly dispersed at random in a time or space domain, with some constant intensity λ. For example, r.v. X can represent the number of failures observed at a process plant during a year (time domain) or the number of buses arriving at a given station during 2 h (time domain), if they arrive randomly and independently in time. It can also represent the number of cracks or flaws in a given area of a metal sheet (space domain). It is clear that an r.v. X following the Poisson distribution is, in a sense, a number of random events, so that it takes on only integer values.
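The three probabilities tabulated in Example 2.12 can be reproduced with a few lines of code. This is a sketch that assumes Equation 2.46 (not shown in the excerpt) is the standard hypergeometric probability mass function; the function name is hypothetical, while N, D, and n are the values given in the example.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(X = x): x high-reliability units in a sample of n drawn without
    replacement from N units, of which D are high-reliability."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

N, D, n = 286, 121, 4   # stockpile size, high-reliability units, sample size

probs = {x: hypergeom_pmf(x, N, D, n) for x in range(n + 1)}
for x in (0, 2, 4):
    print(f"Pr({x}) = {probs[x]:.3f}")
# Pr(0) = 0.109
# Pr(2) = 0.360
# Pr(4) = 0.031
```

The printed values agree with the table in the example, and the five probabilities for x = 0, ..., 4 add up to 1.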
- Anthony Hayter(Author)
- 2012(Publication Date)
- Cengage Learning EMEA(Publisher)
However, if n items are chosen at random without replacement, then the binomial distribution cannot be applied because the success probability, that is, the probability of selecting an item of the special kind, is no longer constant. Instead, the appropriate distribution for the number of defective items chosen is the Hypergeometric Distribution. The probability mass function of the Hypergeometric Distribution is
$$P(X = x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$
for $\max\{0, n + r - N\} \le x \le \min\{n, r\}$. Notice that the number of defective items x must take a value at least as large as n + r − N if this is positive. This is because when n + r − N is positive, the sample size n is larger than the number of nondefective items N − r, so that the sample must contain at least n − (N − r) defective items. Also, the number of defective items x cannot be larger than either the sample size n or the total number of defective items r, whichever is smaller.
The Hypergeometric Distribution
The Hypergeometric Distribution has a probability mass function given by
$$P(X = x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$
for $\max\{0, n + r - N\} \le x \le \min\{n, r\}$, with an expected value of
$$E(X) = \frac{nr}{N}$$
and a variance of
$$\mathrm{Var}(X) = \frac{N-n}{N-1} \times n \times \frac{r}{N} \times \left(1 - \frac{r}{N}\right)$$
It represents the distribution of the number of items of a certain kind in a random sample of size n drawn without replacement from a population of size N that contains r items of this kind.
The Hypergeometric Distribution was encountered in Section 1.7 during the discussion of Example 2 on defective computer chips. In this example, a box of N = 500 computer chips contains r = 9 defective chips, and n = 3 chips are selected at random without replacement. The total number of possible samples that can be taken is $\binom{500}{3}$, which in general is $\binom{N}{n}$. The number of samples containing exactly one defective chip is $\binom{9}{1} \times \binom{491}{2}$
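As a rough check on the expected-value and variance formulas quoted above, the sketch below evaluates the computer-chip example (N = 500, r = 9, n = 3) two ways: from the closed-form expressions and by summing directly over the probability mass function. The function and variable names are illustrative assumptions, not taken from the book.

```python
from math import comb, isclose

def hypergeom_pmf(x, N, r, n):
    """P(X = x) for a sample of n drawn without replacement from N items,
    r of which are of the special (here: defective) kind."""
    lo, hi = max(0, n + r - N), min(n, r)   # support stated in the excerpt
    if x < lo or x > hi:
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, r, n = 500, 9, 3   # values from the defective-chip example

pmf = [hypergeom_pmf(x, N, r, n) for x in range(n + 1)]

# Exactly one defective chip: C(9,1) * C(491,2) / C(500,3)
p_one = pmf[1]

# Moments by direct summation over the PMF
mean_direct = sum(x * p for x, p in enumerate(pmf))
var_direct = sum((x - mean_direct) ** 2 * p for x, p in enumerate(pmf))

# Moments from the closed-form expressions
mean_formula = n * r / N
var_formula = (N - n) / (N - 1) * n * (r / N) * (1 - r / N)

print(f"P(X = 1) = {p_one:.4f}")
print(f"mean: direct = {mean_direct:.6f}, formula = {mean_formula:.6f}")
print(f"var:  direct = {var_direct:.6f}, formula = {var_formula:.6f}")
```

The direct sums and the formulas should agree to floating-point precision, which gives some confidence that the reconstructed expressions are mutually consistent.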
- John J. Kinney(Author)
- 2009(Publication Date)
- Wiley(Publisher)
It is also true that the graphs of the Hypergeometric Distribution show the same “bell-shaped” characteristic that we have encountered several times now, and it will be encountered again. We end this chapter with another probability distribution that we have actually seen before, the geometric distribution. There are hundreds of other discrete probability distributions. Those considered here are a sample of these, although the sampling has been purposeful; we have discussed some of the most common distributions.
GEOMETRIC PROBABILITY DISTRIBUTION
In the binomial distribution, we have a fixed number of trials, and the random variable is the number of successes. In many situations, however, we wait for the first success, and the number of trials to achieve that success is the random variable. In Examples 1.4 and 1.12, we discussed a sample space in which we sampled items emerging from a production line that can be characterized as good (G) or defective (D). We discussed a waiting time problem, namely, waiting for a defective item to occur. We presumed that the selections are independent and showed the following sample space:
$$S = \{D,\ GD,\ GGD,\ GGGD,\ \ldots\}$$
Later, we showed that no matter the size of the probability an item was good or defective, the probability assigned to the entire sample space is 1. Notice that in the binomial random variable, we have a fixed number of trials, say n, and a variable number of successes. In waiting time problems, we have a given number of successes (here 1); the number of trials to achieve those successes is the random variable.
Let us begin with the following waiting time problem. In taking a driver’s test, suppose that the probability the test is passed is 0.8, the trials are independent, and the probability of passing the test remains constant.
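The statement that the whole waiting-time sample space carries probability 1 follows from a geometric series. The short derivation below is a sketch in notation assumed here rather than taken from the excerpt: p is the probability that an item is defective, q = 1 − p, and K is the trial on which the first defective item appears.

```latex
\begin{align*}
P(S) &= \sum_{k=1}^{\infty} P(K = k)
      = \sum_{k=1}^{\infty} q^{\,k-1} p \\
     &= \frac{p}{1 - q} = \frac{p}{p} = 1, \qquad 0 < p \le 1 .
\end{align*}
```

The same pattern applies to the driver's test problem if passing is treated as the awaited event: with a pass probability of 0.8, the first pass occurs on attempt k with probability $(0.2)^{k-1}(0.8)$, for example $0.2 \times 0.8 = 0.16$ for k = 2.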
Environmental Risk Analysis
Probability Distribution Calculations
- Louis Theodore(Author)
- 2015(Publication Date)
- CRC Press(Publisher)
and
$x_1 = 1,\ x_2 = 0$; $x_1 = 1,\ x_2 = 1$; $x_1 = 1,\ x_2 = 2$; $x_1 = 1,\ x_2 = 3$
After substitution,
$$P(x_1 \ge 1,\ x_2 \ge 4) \approx 0.10 \approx 10\%$$
Is this unreasonable? The reader can make the call.
Illustrative Example 6.9
Refer to the previous example. Calculate the probability of the sample containing exactly 2 scrap pieces.
Solution
For this case,
$$p_1 = 0.01, \qquad p_2 = 0.99 \ (\text{not scrap})$$
One can now employ the binomial distribution/theorem, that is,
$$P(x_1 = 2) = \frac{50!}{2!\,48!}\,(0.01)^{2}(0.99)^{48} = (1225)(10^{-4})(0.6171) = 0.075 = 7.5\%$$
7 Hypergeometric Distribution
INTRODUCTION
The Hypergeometric Distribution is applicable to situations in which a random sample of r items is drawn without replacement from a set of n items. Without replacement means that an item is not returned to the set after it is drawn. Recall that the binomial distribution is frequently applicable in cases where the item is drawn with replacement. Suppose that it is possible to classify each of the n items as a success or failure. Again, the words success and failure do not have the usual connotation. They are merely labels for two mutually exclusive categories into which n items have been classified. Thus, each element of the population may be dichotomized as belonging to one of two disjoint classes. Let a be the number of items in the category labeled success.
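The arithmetic in Illustrative Example 6.9 is easy to verify. The sketch below assumes, as the factorials in the example's formula suggest, a sample of 50 pieces with a scrap probability of 0.01 per piece; the function name is hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(exactly x successes) in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n_pieces = 50    # sample size read off the 50!/(2! 48!) term
p_scrap = 0.01   # probability that a single piece is scrap

coefficient = comb(n_pieces, 2)               # 50! / (2! 48!)
p_two_scrap = binom_pmf(2, n_pieces, p_scrap)

print(f"binomial coefficient = {coefficient}")
print(f"P(exactly 2 scrap)   = {p_two_scrap:.4f}")
```

The coefficient evaluates to 1225 and the probability to roughly 0.0756, in line with the 7.5% quoted in the example.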
- Kocherlakota(Author)
- 2017(Publication Date)
- CRC Press(Publisher)
6 SAMPLING FROM FINITE POPULATIONS
6.1 Introduction
In the univariate case, the Hypergeometric Distribution arises in sampling from a finite population with a dichotomy of items, provided that the sampling is without replacement. Thus, we may consider a population of N items consisting of $N_1$ of Type I and $N_2$ of Type II. A random sample of n items is drawn, without replacement, from this population. Let the random variable X be defined as X = number of Type I items appearing in the sample. Then it can be seen from the basic counting principle that
$$P\{X = x\} = \frac{\binom{N_1}{x}\binom{N_2}{n-x}}{\binom{N}{n}}, \qquad \max[0,\ n - N_2] \le x \le \min[n,\ N_1]. \tag{6.1.1}$$
The crucial desiderata in this development are the finite characteristic of the population and that the sampling is without replacement. These two criteria lead to the dependence of the trials or draws. If we relax the first of these two requirements, then the distribution in (6.1.1) approaches that of the binomial with the parameters n and $p = N_1/N$. On the other hand, if we sample with replacement from the finite population, (6.1.1) is replaced by the binomial distribution with the same parameters.
6.2 Bivariate generalizations
There are two distinct types of bivariate Hypergeometric Distributions that arise in practice. The first type, like the Type I bivariate binomial, is based on the double dichotomy. Wicksell (1923) showed that in a sample taken without replacement from a finite population the joint distribution of the marginal totals is a bivariate hypergeometric. The second type of bivariate hypergeometric is an analog of the trinomial distribution and seems to have been first developed by Guldberg (1934).
6.2.1 Double dichotomy
Consider the following representation of a population of N items. A random sample of size n is taken from this population without replacement. The corresponding representation of one observed sample
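The support limits in (6.1.1) matter most when the sample is larger than one of the two groups. The sketch below uses exact rational arithmetic to list the support and confirm that the probabilities sum to 1; the population values N1 = 6, N2 = 4, n = 7 are hypothetical and chosen only so that the lower limit n − N2 is positive.

```python
from fractions import Fraction
from math import comb

def support(n, N1, N2):
    """Values of x allowed by (6.1.1): max[0, n - N2] <= x <= min[n, N1]."""
    return range(max(0, n - N2), min(n, N1) + 1)

def pmf(x, n, N1, N2):
    """Exact P{X = x} for x Type I items in a sample of n drawn without
    replacement from N1 Type I and N2 Type II items."""
    return Fraction(comb(N1, x) * comb(N2, n - x), comb(N1 + N2, n))

N1, N2, n = 6, 4, 7   # hypothetical population and sample size

xs = list(support(n, N1, N2))
total = sum(pmf(x, n, N1, N2) for x in xs)

print("support:", xs)                 # support: [3, 4, 5, 6]
print("sum of probabilities:", total) # sum of probabilities: 1
```

With only four Type II items available, a sample of seven must contain at least three Type I items, which is why the support starts at 3 rather than 0.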
- Joseph Arthur Greenwood, H. O. Hartley(Authors)
- 2017(Publication Date)
- Princeton University Press(Publisher)
SECTION 5 VARIOUS DISCRETE AND GEOMETRICAL PROBABILITIES
5.1 The Hypergeometric Distribution
The Hypergeometric Distribution may be written in the form
[equation not recoverable from the source] (5.1.1)
When the distribution arises in sampling from a finite population, Np and Nq are integers. Most extant tables of the Hypergeometric Distribution have been compiled for the purpose of testing independence in 2 × 2 contingency tables: see section 2.6, particularly Finney 1948, Latscha 1953, Armsen 1955. Hypergeometric functions, and confluent hypergeometric functions, are ubiquitous in statistical computations: see Erdelyi et al. 1953, p. 87, for the representation of the incomplete beta-function as a hypergeometric function, and Erdelyi et al. 1953, pp. 266-267, for the representations of the incomplete gamma-function and the Hermite polynomials as confluent hypergeometric functions. For the representation of the non-central F-distribution in terms of confluent hypergeometric functions, see Tang 1938, Lehmer 1944, N.B.S. 1949, 1951. For a list of tables of confluent hypergeometric functions, see section 4.142. The epithet 'hypergeometric' is applied to the distribution (5.1.1) because its probability generating function (Kendall & Stuart 1958, p. 133) and its factorial moment generating function (Aitken 1939, p. 57) are hypergeometric functions. This fact, of some importance in statistical combinatorics, is not helpful to the table user, because (Fletcher et al. 1946, p. 335) the "tabulation of the general hypergeometric function F(α, β, γ, x) would call for tables with 4 arguments, and [they] have not encountered any direct tabulation." Fletcher et al. go on to point out that K. Pearson Elderton 1923 and K. Pearson 1931b tabulate certain hypergeometric functions. For the definition of the bivariate Hypergeometric Distribution see Aitken 1939, p. 84; and for an example, see K. Pearson 1924. For the Hypergeometric Distribution with negative n see K.
- Lyle Albright(Author)
- 2008(Publication Date)
- CRC Press(Publisher)
2003. Applied Statistics and Probability for Engineers, 3rd ed., New York: John Wiley & Sons.
Metcalfe, Andrew V., Statistics in Engineering, 1994. London: Chapman & Hall.
Fraser, D.A.S., Probability and Statistics: Theory and Applications, 1976, North Scituate, MA: Duxbury Press.
$$\mu = \sum_i x_i P(x_i) \qquad\qquad \mu = \int_{-\infty}^{\infty} x f(x)\,dx$$
$$\sigma^2 = \sum_i (x_i - \mu)^2 P(x_i) \qquad\qquad \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
$$P(x, n) = \binom{n}{x} p^x (1 - p)^{n - x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$
3.2.2 GEOMETRIC DISTRIBUTION: DISCRETE VARIABLE
The geometric distribution indicates the probability of conducting x trials to obtain a success in an experiment in which there are only two possible outcomes. Like the binomial distribution, this is another Bernoulli process. Each trial is assumed to be independent, and the probability of observing a success is constant over all trials, denoted p. The probability distribution for the geometric distribution [2] is
$$P(x) = p\,(1 - p)^{x - 1} \tag{3.8}$$
Similar to the binomial distribution, Equation (3.8) can also be derived intuitively. Since there is only one set of experimental trials that results in x − 1 failures followed by one success, the distribution has no binomial coefficient or redundancy. Consequently, the probability of the first success occurring in the xth trial is the probability of x − 1 failures multiplied by the probability of one success. The expected value of the geometric distribution is μ = 1/p, and the variance is σ² = (1 − p)/p² [2].
Example 3.5
For a six-sided die, what is the probability of a chosen face turning up for the first time on the fourth throw?
Solution
For each throw of the die, the probability of a given face showing is p = 1/6. Since we desire the first success to be on the fourth throw, x = 4.
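The excerpt stops before Example 3.5 is finished. The sketch below carries the calculation through using the verbal description of Equation (3.8), that is, x − 1 failures followed by one success; the function name is hypothetical and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def geometric_pmf(x, p):
    """Probability that the first success occurs on trial x:
    (x - 1) failures, each with probability 1 - p, then one success."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return (1 - p) ** (x - 1) * p

p = Fraction(1, 6)           # probability of the chosen face on one throw
prob = geometric_pmf(4, p)   # first appearance on the fourth throw
mean_throws = 1 / p          # expected value 1/p quoted in the excerpt

print(prob, "=", round(float(prob), 4))   # 125/1296 = 0.0965
print("expected number of throws:", mean_throws)
```

Under these assumptions the chosen face first appears on the fourth throw with probability 125/1296, a little under 10%, and the expected waiting time is six throws.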
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.








