Hypothesis Test of Two Population Proportions

The hypothesis test of two population proportions is a statistical method for deciding whether the proportion of a given characteristic differs between two populations. It involves formulating null and alternative hypotheses, calculating a test statistic from the difference in the two sample proportions, and determining the p-value to judge whether the observed difference is statistically significant. The test is commonly used in research and decision-making, for example to compare market shares in two markets or defect rates in two periods.

Written by Perlego with AI-assistance

12 Key excerpts on "Hypothesis Test of Two Population Proportions"

  • Business Statistics for Contemporary Decision Making
    • Ken Black, Tiffany Bayley, Ignacio Castillo (Authors)
    • 2023 (Publication Date)
    • Wiley (Publisher)
    10.4 Statistical Inferences About Two Population Proportions. LEARNING OBJECTIVE 10.4: Test hypotheses and develop confidence intervals about the difference in two population proportions. Sometimes an analyst wishes to make inferences about the difference in two population proportions. This type of analysis has many applications in business, such as comparing the market share of a product for two different markets, studying the difference in the proportion of female customers in two different geographic regions, or comparing the proportion of defective products from one period to another. In making inferences about the difference in two population proportions, the statistic normally used is the difference in the sample proportions: $\hat{p}_1 - \hat{p}_2$. This statistic is computed by taking random samples, determining $\hat{p}$ for each sample for a given characteristic, and then calculating the difference in these sample proportions. The central limit theorem states that, for large samples (each of $n_1 \cdot \hat{p}_1$, $n_1 \cdot \hat{q}_1$, $n_2 \cdot \hat{p}_2$, and $n_2 \cdot \hat{q}_2$, where $\hat{q} = 1 - \hat{p}$), the difference in sample proportions is normally distributed with a mean difference of $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$ and a standard deviation of the difference of sample proportions of $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 \cdot q_1}{n_1} + \frac{p_2 \cdot q_2}{n_2}}$. From this information, a z formula for the difference in sample proportions can be developed.
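The mean and standard deviation given in the excerpt above can be sketched in Python as follows. This is only an illustration: the function name, the population values (p1 = 0.40, p2 = 0.30, n1 = n2 = 200) and the observed difference of 0.15 are hypothetical, and the z score shown is the usual standardization implied by that mean and standard deviation, since the excerpt stops before stating its own formula.

```python
from math import sqrt

def diff_sampling_distribution(p1, n1, p2, n2):
    """Mean and standard deviation of p_hat1 - p_hat2 for large samples."""
    q1, q2 = 1 - p1, 1 - p2
    mean = p1 - p2
    sd = sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    return mean, sd

# Hypothetical population proportions and sample sizes (not from the excerpt).
mean, sd = diff_sampling_distribution(p1=0.40, n1=200, p2=0.30, n2=200)

# Hypothetical observed difference in sample proportions.
observed_diff = 0.15

# Standardize the observed difference against its sampling distribution.
z = (observed_diff - mean) / sd

print(f"mean = {mean:.2f}, sd = {sd:.4f}, z = {z:.2f}")
```

In practice the population proportions are unknown, so the standard deviation has to be estimated from the samples before a test can be run.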
  • Essential Statistics for Economics, Business and Management
    • Teresa Bradley (Author)
    • 2014 (Publication Date)
    • Wiley (Publisher)
    8 TESTS OF HYPOTHESIS FOR MEANS AND PROPORTIONS
    8.1 Hypothesis tests for means
    8.2 Hypothesis tests for proportions
    8.3 Hypothesis tests for the difference between means and proportions
    8.4 Minitab and Excel for confidence intervals and tests of hypothesis
    Chapter Objectives: Having carefully studied this chapter and completed the exercises you should be able to do the following:
      • Define a null and alternative hypothesis
      • Calculate the p-value: the probability of obtaining the given sample or a more extreme sample assuming the null hypothesis is true
      • Explain the decision rule at a given level of significance for one-sided and two-sided tests
      • Test hypothesis for population means, proportions and difference between two means and two proportions
      • Explain the terms Type I error, Type II error, the power of a test
      • Explain the effect of $Z_{\alpha/2}$, σ and n on the power of a test
      • Calculate the sample size required for an interval estimate of a population mean or proportion for a given level of confidence and precision (large n)
      • Use Excel and Minitab to calculate confidence intervals and test hypothesis
      • Interpret and explain verbally Minitab printouts for confidence intervals and tests of hypothesis
    8.1 Hypothesis tests for means
    8.1.1 Null and alternative hypotheses
    Up to this point we have used sample means (and proportions) to estimate population means (and proportions). Some inference about the population mean can be made from a confidence interval; for example in Worked Example 7.1 it was concluded that the true mean weight of packets of saffron was not 20 since 20 was not within the 95% confidence interval. Similarly inference about differences between two population means or proportions can be made as outlined in Worked Examples 7.4 and 7.5. Testing a statistical hypothesis is a different approach to making inference about a population parameter on the basis of a random sample.
  • Understanding Business Statistics
    • Ned Freed, Stacey Jones, Timothy Bergquist (Authors)
    • 2013 (Publication Date)
    • Wiley (Publisher)
    In fact, most people who repeat the test on two different days have scores that differ by far more than a mere three points. WHAT’S AHEAD: Users of statistical information need to understand the difference between statistical significance and practical importance. In this chapter, we’ll introduce additional significance tests and learn to interpret what they do and don’t tell us. EVERYDAY STATISTICS: Smell Test. "The great tragedy of science is the slaying of a beautiful hypothesis by an ugly fact." (Thomas H. Huxley)
    CHAPTER 10 Hypothesis Tests for Proportions, Mean Differences and Proportion Differences
    Following the pattern of the Chapter 7 and Chapter 8 sequence, we can now extend our Chapter 9 hypothesis testing discussion to three additional cases:
      • Hypothesis tests for a population proportion.
      • Hypothesis tests for the difference between two population means.
      • Hypothesis tests for the difference between two population proportions.
    If you feel comfortable with the elements of hypothesis testing that were introduced in Chapter 9, the discussion here should be easy to follow. In fact, we’ll proceed at a slightly quicker pace, relying more on examples and less on comprehensive explanations to develop the ideas that we need.
    10.1 Tests for a Population Proportion
    We’ll start by constructing a test for a population proportion. Consider the following situation: Situation: PowerPro uses batch processing to produce the lithium-polymer batteries that Apple uses in its iPad and iPad Mini. Once a batch is completed, a quality inspector tests a sample of 100 batteries. If sample results lead the inspector to conclude that the proportion of defective batteries in the batch exceeds 6% (PowerPro’s standard for what constitutes an acceptable batch), the entire batch will be scrapped and replaced.
  • Introductory Statistics
    • Prem S. Mann (Author)
    • 2020 (Publication Date)
    • Wiley (Publisher)
    CHAPTER 10 Estimation and Hypothesis Testing: Two Populations. Using the value of the pooled sample proportion, we compute an estimate of the standard deviation of $\hat{p}_1 - \hat{p}_2$ as follows: $s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where $\bar{q} = 1 - \bar{p}$. Test Statistic z for $\hat{p}_1 - \hat{p}_2$: The value of the test statistic z for $\hat{p}_1 - \hat{p}_2$ is calculated as $z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}}$. The value of $p_1 - p_2$ is substituted from $H_0$, which usually is zero. Examples 10.14 and 10.15 illustrate the procedure to test hypotheses about the difference between two population proportions for large samples. Making a Right-Tailed Test of Hypothesis About $p_1 - p_2$: Large and Independent Samples. Reconsider Example 10.13 about the percentages of users of two toothpastes who will never switch to another toothpaste. At a 1% significance level, can you conclude that the proportion of users of Toothpaste A who will never switch to another toothpaste is higher than the proportion of users of Toothpaste B who will never switch to another toothpaste? Solution: Let $p_1$ and $p_2$ be the proportions of all users of Toothpastes A and B, respectively, who will never switch to another toothpaste, and let $\hat{p}_1$ and $\hat{p}_2$ be the corresponding sample proportions. Let $x_1$ and $x_2$ be the number of users of Toothpastes A and B, respectively, in the two samples who said that they will never switch to another toothpaste. From the given information, Toothpaste A: $n_1 = 500$ and $x_1 = 100$; Toothpaste B: $n_2 = 400$ and $x_2 = 68$. The significance level is α = .01. The two sample proportions are calculated as follows: $\hat{p}_1 = x_1 / n_1 = 100 / 500 = .20$ and $\hat{p}_2 = x_2 / n_2 = 68 / 400 = .17$. STEP 1 State the null and alternative hypotheses.
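The excerpt breaks off at Step 1, so the remaining arithmetic can be sketched in Python using the toothpaste figures it gives. This is a minimal sketch, not the textbook's own solution: the function name is hypothetical, and the pooled proportion is assumed to be the usual (x1 + x2) / (n1 + n2), which the excerpt names but does not define.

```python
from math import sqrt
from statistics import NormalDist

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled sample proportion."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    # Assumed definition of the pooled proportion (not stated in the excerpt).
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    se = sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    # Under H0 the hypothesized difference p1 - p2 is zero.
    return (p_hat1 - p_hat2) / se

alpha = 0.01
z = pooled_two_proportion_z(x1=100, n1=500, x2=68, n2=400)

# Right-tailed test: H1 is p1 > p2, so the p-value is the area above z.
p_value = 1 - NormalDist().cdf(z)
critical_value = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > critical_value

print(f"z = {z:.2f}, critical value = {critical_value:.2f}, reject H0: {reject_h0}")
```

With these inputs z comes out near 1.15, well below the right-tail critical value of about 2.33 at the 1% level, so this sketch would not reject the null hypothesis.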
  • Mind on Statistics (with JMP Printed Access Card)
    The power of the test will be low.
      • With a large sample, even a small and unimportant difference between the null value and the true population value may lead to a conclusion of statistical significance.
    EXAMPLE 12.18 How the Same Sample Proportion Can Produce Different Conclusions. In Example 10.11 of Chapter 10, we used a confidence interval to analyze a taste test in which 55% of the 60 participants liked the taste of drink A better than the taste of drink B. On the basis of the 95% confidence interval for the “true” proportion that prefers drink A, we were unable to conclude whether a majority of the population prefers the taste of drink A. We can also examine these data with a hypothesis test in which the null hypothesis is that there is no general preference for either drink. The null and alternative hypotheses are as follows: $H_0: p = .5$ (no preference), $H_a: p \neq .5$ (preference for one or the other), where p represents the proportion in the population that would prefer drink A. Now, suppose that 16 times as many people participated, for a sample size of 960, and that the result was still that 55% of the sample prefers drink A. What effect will the larger sample size have on the statistical significance of the data?
    Using the z (standard normal) distribution, the p-value is the area in the tail(s) beyond the test statistic z as follows (refer to Table 12.1 on page 475 for illustrations): For $H_a: p_1 - p_2 \neq 0$, the p-value is 2 × area above |z| (a two-tailed test). For $H_a: p_1 - p_2 > 0$, the p-value is the area above z, even if z is negative. For $H_a: p_1 - p_2 < 0$, the p-value is the area below z, even if z is positive. For Steps 4 and 5, proceed as instructed on page 488. Watch a video explanation of this example at the course website, http://www.cengage.com/statistics/Utts5e. Copyright 2014 Cengage Learning. All Rights Reserved.
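The tail-area rules and the taste-test question in the excerpt above can be explored with a short Python sketch. It rests on assumptions: the function names are hypothetical, and the one-proportion statistic used is the standard large-sample z = (p̂ − p0) / sqrt(p0(1 − p0)/n), which the excerpt does not spell out.

```python
from math import sqrt
from statistics import NormalDist

def z_p_value(z, alternative):
    """Tail area beyond z for a two-sided, greater-than or less-than alternative."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    if alternative == "greater":
        return 1 - cdf(z)   # area above z, even if z is negative
    if alternative == "less":
        return cdf(z)       # area below z, even if z is positive
    raise ValueError(f"unknown alternative: {alternative}")

def one_proportion_z(p_hat, n, p0):
    """Assumed large-sample z statistic for a single proportion."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Same sample proportion (55%), two sample sizes from the example.
results = {}
for n in (60, 960):
    z = one_proportion_z(p_hat=0.55, n=n, p0=0.50)
    results[n] = (z, z_p_value(z, "two-sided"))
    print(f"n = {n}: z = {z:.2f}, two-sided p-value = {results[n][1]:.4f}")
```

Under these assumptions the same 55% gives a p-value far above 0.05 with 60 participants and far below it with 960, which is the contrast the example is built around.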
  • Business Statistics: For Contemporary Decision Making
    • Ken Black (Author)
    • 2023 (Publication Date)
    • Wiley (Publisher)
    Use of the t test for two independent populations is not unethical when the populations are related, but it is likely to result in a loss of power. As with any hypothesis testing procedure, in determining the null and alternative hypotheses, make certain you are not assuming true what you are trying to prove. Summary: Business analytics often requires the analysis of two populations. Three types of parameters can be compared: means, proportions, and variances. Except for the F test for population variances, all techniques presented contain both confidence intervals and hypothesis tests. In each case, the two populations are studied through the use of sample data randomly drawn from each population. The population means are analyzed by comparing two sample means. When sample sizes are large (n ≥ 30) and population variances are known, a z test is used. When sample sizes are small, the population variances are known, and the populations are normally distributed, the z test is used to analyze the population means. If the population variances are unknown and the populations are normally distributed, the t test of means for independent samples is used. For populations that are related on some measure, such as twins or before-and-after, a t test for dependent measures (matched pairs) is used. An assumption for this test is that the differences of the two populations are normally distributed. The difference in two population proportions can be tested or estimated using a z test. The population variances are analyzed by an F test when the assumption that the populations are normally distributed is met. The F value is a ratio of the two variances. The F distribution is a distribution of possible ratios of two sample variances taken from one population or from two populations containing the same variance.
  • Essentials of Business Statistics
    • Ken Black, Ignacio Castillo, Amy Goldlist, Timothy Edmunds (Authors)
    • 2018 (Publication Date)
    • Wiley (Publisher)
    Setting up the formula to calculate the value of z test is relatively straightforward, but there is no convenient one-step function. If you will frequently need a simple tool to perform hypothesis tests of the difference between two proportions, we recommend using a third party add-in, or using a different program.
    12.3 Concept Check
    1. Why is formula 12.6 used instead of formula 12.5 when dealing with hypothesis tests about two population proportions?
    2. How large should each of $n_a \hat{p}_a$, $n_a (1 - \hat{p}_a)$, $n_b \hat{p}_b$, and $n_b (1 - \hat{p}_b)$ be to be considered large samples and thus for the difference in sample proportions to be normally distributed?
    12.3 Problems
    12.14 Using the given sample information, test the following hypotheses.
    a. Difference = Group A − Group B. $H_0: \delta = 0.05$, $H_a: \delta \neq 0.05$. Sample A: $n_a = 368$, $x_a = 175$. Sample B: $n_b = 405$, $x_b = 162$. Let α = 0.05. Note that x is the number in the sample with the characteristic of interest.
    b. Difference = Group A − Group B. $H_0: \delta = 0$, $H_a: \delta \neq 0$. Sample A: $n_a = 649$, $\hat{p}_a = 0.38$. Sample B: $n_b = 558$, $\hat{p}_b = 0.25$. Let α = 0.10.
    12.15 According to a study conducted for a computer manufacturer, 59% of men and 70% of women say that weight is an extremely/very important factor in purchasing a laptop computer. Suppose this survey was conducted using 374 men and 481 women. Does this data set ...
    [Figure: rejection regions for the test on $\hat{p}_1 - \hat{p}_2$ with $\pi_1 - \pi_2 = 0$: critical values $z_{CRIT} = \pm 2.55$ with $\alpha/2 = 0.005$ in each tail, and test statistic $z_{TEST} = -2.54$.]
    Step 8. At the 1% level of significance, we do not have enough evidence to conclude that a greater proportion of female entrepreneurs in the higher gross sales category define success as sales/profit (p = 0.0110). It is worth noting that the low p value might mean that it is worthwhile to conduct a second sample and repeat the tests.
  • Essentials of Modern Business Statistics with Microsoft® Excel®
    • David Anderson, Dennis Sweeney, Thomas Williams, Jeffrey Camm (Authors)
    • 2020 (Publication Date)
    This conclusion indicates a quality differential between the two centers and suggests that a follow-up study investigating the reason for the differential may be warranted. The null and alternative hypotheses for this two-tailed test are written as follows: $H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \neq 0$. The standardized examination given previously in a variety of settings always resulted in an examination score standard deviation near 10 points. Thus, we will use this information to assume that the population standard deviations are known with $\sigma_1 = 10$ and $\sigma_2 = 10$. An $\alpha = .05$ level of significance is specified for the study. Independent random samples of $n_1 = 30$ individuals from training center A and $n_2 = 40$ individuals from training center B are taken. The respective sample means are $\bar{x}_1 = 82$ and $\bar{x}_2 = 78$. Do these data suggest a significant difference between the population means at the two training centers? To help answer this question, we compute the test statistic using equation (10.5): $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{(82 - 78) - 0}{\sqrt{\frac{10^2}{30} + \frac{10^2}{40}}} = 1.66$. Next let us compute the p-value for this two-tailed test. Because the test statistic z is in the upper tail, we first compute the upper tail area corresponding to z = 1.66. Using the standard normal distribution table, the area to the left of z = 1.66 is .9515. Thus, the area in the upper tail of the distribution is 1.0000 − .9515 = .0485. Copyright 2020 Cengage Learning. All Rights Reserved.
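The training-center calculation in the excerpt above can be reproduced with a few lines of Python. This is a sketch only; the function name is hypothetical, and the inputs are the figures given in the excerpt (sample means 82 and 78, both standard deviations 10, sample sizes 30 and 40).

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z(xbar1, xbar2, sigma1, sigma2, n1, n2, d0=0.0):
    """z statistic for the difference of two means with known sigmas."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - d0) / se

z = two_mean_z(xbar1=82, xbar2=78, sigma1=10, sigma2=10, n1=30, n2=40)

# Two-tailed test: double the area in the upper tail beyond z.
upper_tail = 1 - NormalDist().cdf(z)
p_value = 2 * upper_tail

print(f"z = {z:.2f}, upper tail = {upper_tail:.4f}, p-value = {p_value:.4f}")
```

Because the sketch uses the unrounded z rather than the table value for z = 1.66, its upper-tail area differs slightly from the .0485 in the excerpt. Either way the two-tailed p-value is a little under 0.10, which is above the stated .05 level of significance.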
  • Introductory Statistics
    • Prem S. Mann (Author)
    • 2016 (Publication Date)
    • Wiley (Publisher)
    Chapter 10 Estimation and Hypothesis Testing: Two Populations. a. Let $p_1$ and $p_2$ be the proportion of all women and men American drivers, respectively, who will say that they exceeded the speed limit at least once in the past week. Construct a 98% confidence interval for $p_1 - p_2$. b. Using a 1% significance level, can you conclude that $p_1$ is lower than $p_2$? Use both the critical-value and the p-value approaches. 10.51 A state that requires periodic emission tests of cars operates two emission test stations, A and B, in one of its towns. Car owners have complained of lack of uniformity of procedures at the two stations, resulting in different failure rates. A sample of 400 cars at Station A showed that 53 of those failed the test; a sample of 470 cars at Station B found that 51 of those failed the test. a. What is the point estimate of the difference between the two population proportions? b. Construct a 95% confidence interval for the difference between the two population proportions. c. Testing at a 5% significance level, can you conclude that the two population proportions are different? Use both the critical-value and the p-value approaches. 10.52 The management of a supermarket chain wanted to investigate if the percentages of men and women who prefer to buy national brand products over the store brand products are different. A sample of 600 men shoppers at the company’s supermarkets showed that 246 of them prefer to buy national brand products over the store brand products. Another sample of 700 women shoppers at the company’s supermarkets showed that 266 of them prefer to buy national brand products over the store brand products. a. What is the point estimate of the difference between the two population proportions? b. Construct a 98% confidence interval for the difference between the proportions of all men and all women shoppers at these supermarkets who prefer to buy national brand products over the store brand products.
  • Biostatistics: A Foundation for Analysis in the Health Sciences
    • Wayne W. Daniel, Chad L. Cross (Authors)
    • 2018 (Publication Date)
    • Wiley (Publisher)
    Summary of Formulas for Chapter 7
    Formula Number | Name | Formula
    7.6.1, 7.6.2 | Test statistic for the difference between two population proportions | $z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\hat{\sigma}_{\hat{p}_1 - \hat{p}_2}}$, where $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$ and $\hat{\sigma}_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\bar{p}(1 - \bar{p})}{n_1} + \frac{\bar{p}(1 - \bar{p})}{n_2}}$
    7.7.1 | Test statistic for a single population variance | $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
    7.8.1 | Variance ratio | $\text{V.R.} = \frac{s_1^2}{s_2^2}$
    7.9.1, 7.9.2 | Upper and lower critical values for $\bar{x}$ | $\bar{x}_U = \mu_0 + z\frac{\sigma}{\sqrt{n}}$, $\bar{x}_L = \mu_0 - z\frac{\sigma}{\sqrt{n}}$
    7.10.1, 7.10.2 | Critical value for determining sample size to control type II errors | $C = \mu_0 - z_0\frac{\sigma}{\sqrt{n}} = \mu_1 + z_1\frac{\sigma}{\sqrt{n}}$
    7.10.3 | Sample size to control type II errors | $n = \left[\frac{(z_0 + z_1)\sigma}{\mu_0 - \mu_1}\right]^2$
    Symbol Key
      • $\alpha$ = type 1 error rate
      • $C$ = critical value
      • $\chi^2$ = chi-square distribution
      • $\bar{d}$ = average difference
      • $\mu$ = mean of population
      • $\mu_0$ = hypothesized mean
      • $n$ = sample size
      • $p$ = proportion for population
      • $\bar{p}$ = average proportion
      • $q$ = (1 − p)
      • $\hat{p}$ = estimated proportion for sample
      • $\sigma^2$ = population variance
      • $\sigma$ = population standard deviation
      • $\sigma_{\bar{d}}$ = standard error of difference
      • $\sigma_{\bar{x}}$ = standard error
      • $s$ = standard deviation of sample
      • $s_d$ = standard deviation of the difference
      • $s_p$ = pooled standard deviation
      • $t$ = Student’s t-transformation
      • $t'$ = Cochran’s correction to t
      • $\bar{x}$ = mean of sample
      • $\bar{x}_L$ = lower limit of critical value for $\bar{x}$
      • $\bar{x}_U$ = upper limit of critical value for $\bar{x}$
      • $z$ = standard normal transformation
    REVIEW QUESTIONS AND EXERCISES
    1. What is the purpose of hypothesis testing?
    2. What is a hypothesis?
    3. List and explain each step in the ten-step hypothesis testing procedure.
    4. Define: (a) Type I error (b) Type II error (c) The power of a test (d) Power function (e) Power curve (f) Operating characteristic curve
    5. Explain the difference between the power curves for one-sided tests and two-sided tests.
  • Introduction to Statistics and Data Analysis
    • Roxy Peck, Chris Olsen, Tom Short (Authors)
    • 2019 (Publication Date)
    For this example, if a large-sample z test had been used, then the resulting P-value of 0.124 would be quite different than the P-value from the randomization test. This illustrates why a hypothesis test should not be used when its assumptions are not met. An Exact Binomial Test for One Proportion: Another way to obtain a P-value when testing hypotheses about a population proportion is to use an exact probability approach that uses the binomial distribution. The binomial probability distribution was introduced in Section 7.5, Binomial and Geometric Distributions. Example 10.23 An Exact Binomial Test: Consider a test of the null hypothesis $H_0: p = 0.50$. The value of p that is specified in the null hypothesis, when combined with an observed sample size, identifies a specific binomial distribution that can be used to calculate an “exact” P-value. Suppose that the alternative hypothesis of interest is $H_a: p > 0.50$ and that the test will be carried out using data from a random sample of n = 10 independent success (S) or failure (F) observations. Because the sample size is small, the large-sample z test is not an appropriate way to test these hypotheses. The exact binomial test does not require a large sample, so it can be used when the sample size is small and the sampling distribution of $\hat{p}$ may not be approximately normal. Suppose that we observe x = 8 successes in the sample of size n = 10. This means that $\hat{p} = 8/10 = 0.8$. The binomial distribution with n = 10 and p = 0.50 (the hypothesized proportion) can be used to calculate the probability of observing a sample proportion as or more extreme than what was observed in the sample. This is the P-value for the hypothesis test. To calculate the P-value for an exact binomial test, use the Shiny app “Exact Binomial Test for One Proportion.” This is one of the Shiny web apps that accompany this text. These web apps are located at statistics.cengage.com/PSO6e/Apps.html.
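For readers without access to the Shiny app, the same upper-tail probability can be computed directly from the binomial distribution. The Python sketch below is an illustration rather than the app's implementation; the function name is hypothetical, and the inputs (x = 8, n = 10, hypothesized p = 0.50) come from the example above.

```python
from math import comb

def exact_binomial_upper_p(x, n, p0):
    """P(X >= x) for X ~ Binomial(n, p0): the P-value when Ha is p > p0."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))

# Values from the example: 8 successes in 10 trials, H0: p = 0.50.
p_value = exact_binomial_upper_p(x=8, n=10, p0=0.50)

print(f"exact P-value = {p_value:.4f}")
```

With these inputs the sum is (45 + 10 + 1) / 1024, or about 0.0547, so the result would fall just short of significance at the 0.05 level.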
  • Biostatistics: Basic Concepts and Methodology for the Health Sciences, 10th Edition International Student Version
    • Wayne W. Daniel, Chad L. Cross (Authors)
    • 2014 (Publication Date)
    • Wiley (Publisher)
    Explain the difference between the power curves for one-sided tests and two-sided tests. 6. Explain how one decides what statement goes into the null hypothesis and what statement goes into the alternative hypothesis. 7. What are the assumptions underlying the use of the t statistic in testing hypotheses about a single mean? The difference between two means? 8. When may the z statistic be used in testing hypotheses about (a) a single population mean? (b) the difference between two population means? (c) a single population proportion? (d) the difference between two population proportions? 9. In testing a hypothesis about the difference between two population means, what is the rationale behind pooling the sample variances? 10. Explain the rationale behind the use of the paired comparisons test. 11. Give an example from your field of interest where a paired comparisons test would be appropriate. Use real or realistic data and perform an appropriate hypothesis test. 12. Give an example from your field of interest where it would be appropriate to test a hypothesis about the difference between two population means. Use real or realistic data and carry out the ten-step hypothesis testing procedure. 13. Do Exercise 12 for a single population mean. 14. Do Exercise 12 for a single population proportion. 15. Do Exercise 12 for the difference between two population proportions. 16. Do Exercise 12 for a population variance. 17. Do Exercise 12 for the ratio of two population variances. 18. Ochsenkühn et al. (A-33) studied birth as a result of in vitro fertilization (IVF) and birth from spontaneous conception. In the sample, there were 163 singleton births resulting from IVF with a mean birth weight of 3071 g and sample standard deviation of 761 g. Among the 321 singleton births resulting from spontaneous conception, the mean birth weight was 3172 g with a standard deviation of 702 g.
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.