Confidence Interval for the Difference of Two Means
A confidence interval for the difference of two means is a statistical range that provides an estimate of the true difference between the means of two populations. It is calculated using sample data and a specified level of confidence, allowing researchers to make inferences about the population means. This interval helps to quantify the uncertainty associated with the difference in means.
Written by Perlego with AI-assistance
12 Key excerpts on "Confidence Interval for the Difference of Two Means"
- eBook - PDF
- Timothy DelSole, Michael Tippett(Authors)
- 2022(Publication Date)
- Cambridge University Press(Publisher)
These confidence intervals assume the population has a normal distribution. If the population is not Gaussian, or the quantity being inferred is complicated, then bootstrap methods offer an important alternative approach, as discussed at the end of this chapter.
3.1 The Problem
Suppose an experiment is performed under two different conditions. Let $\hat{\mu}_X$ and $\hat{\mu}_Y$ be the averages of the data collected under the two conditions. Then $\hat{\mu}_X - \hat{\mu}_Y$ is an estimate of the difference in population means $\mu_X - \mu_Y$. However, the estimate $\hat{\mu}_X - \hat{\mu}_Y$ is computed from a finite amount of data and therefore differs from $\mu_X - \mu_Y$ by a random error. The error may be so large that even the sign of the difference is uncertain. How can this uncertainty be quantified? One approach is based on hypothesis testing. In Chapter 2, we learned a test for the hypothesis that the difference in means equals zero. What if we want to test other hypothetical values, such as $\mu_X - \mu_Y = 2$? More generally, can we identify the range of values of $\mu_X - \mu_Y$ that would not be rejected based on the data? The confidence interval addresses this kind of question.
3.2 Confidence Interval for a Difference in Means
We begin by deriving a confidence interval for a difference in means. Let $X_1, \ldots, X_{N_X}$ be independently and identically distributed (iid) as a normal distribution with mean $\mu_X$ and variance $\sigma^2$, and let $Y_1, \ldots, Y_{N_Y}$ be iid as a normal distribution, but with mean $\mu_Y$ and variance $\sigma^2$. The two populations have the same variance but possibly different means. An estimate of the difference in population means is the difference in sample means $\hat{\mu}_X - \hat{\mu}_Y$.
- Abbas F.M. Alkarkhi, Abbas F. M. Alkarkhi(Authors)
- 2020(Publication Date)
- Elsevier(Publisher)
Consider that $Y_1$ and $Y_2$ are two variables of interest, $n_1$ and $n_2$ represent the sample sizes selected from population 1 and population 2, respectively, with $\bar{Y}_1$ and $\bar{Y}_2$ representing the average values and $\sigma_1^2$ and $\sigma_2^2$ representing the variances of populations 1 and 2, respectively. A confidence interval for the difference between two population means at a specific confidence level $(1 - \alpha)$ can be computed from the sample data. The formula for the confidence interval for the difference between two means is presented in Eq. (10.1).
$$(\bar{Y}_1 - \bar{Y}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \tag{10.1}$$
The confidence interval can be written in another form:
$$(\bar{Y}_1 - \bar{Y}_2) - Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{Y}_1 - \bar{Y}_2) + Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where $\bar{Y}_1 - \bar{Y}_2$ represents the observed difference between the two sample means; $\mu_1 - \mu_2$ represents the expected difference; $\mu$ represents the population mean; $Z_{\alpha/2}$ represents the Z critical value; and $Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ represents the margin of error (E).
Note
• When the sample size is large and the population standard deviation (σ) is not provided, then we can use the sample standard deviation (S).
• The margin of error (E) can be used to write the confidence interval:
$$(\bar{Y}_1 - \bar{Y}_2) - E \le \mu_1 - \mu_2 \le (\bar{Y}_1 - \bar{Y}_2) + E$$
• The lower one-tailed confidence interval is given in Eq. (10.2):
$$(\bar{Y}_1 - \bar{Y}_2) - Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \tag{10.2}$$
or it can be written as the interval $\left[(\bar{Y}_1 - \bar{Y}_2) - Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},\ \infty\right)$
• The upper one-tailed confidence interval is given in Eq. (10.3):
$$\mu_1 - \mu_2 \le (\bar{Y}_1 - \bar{Y}_2) + Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \tag{10.3}$$
or it can be written as the interval $\left(-\infty,\ (\bar{Y}_1 - \bar{Y}_2) + Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right]$
Example 10.1 The amount of solid waste in two palm oil mills
An environmentalist wants to build a range for the difference in the amount of solid waste generated in two palm oil mills
- No longer available
- Jessica Utts, Robert Heckard(Authors)
- 2015(Publication Date)
- Cengage Learning EMEA(Publisher)
Because the multiplier is t* and two samples are used, the result is sometimes called a two-sample t-interval. The general format of a confidence interval for the difference in two means is therefore the following:
Difference in sample means ± t* × Standard error
The standard error of the difference in sample means is
$$\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Unfortunately, there is a muddy mathematical story underneath the calculation of a confidence interval for the difference between two population means. On the surface, however, the story appears to be easy if we are satisfied with an approximate interval, and can be summarized as follows.
TI-84 TIP: Confidence Interval for a Mean or the Mean of Paired Differences
• Press STAT and scroll horizontally to TESTS. Scroll vertically to 8:Tinterval and press ENTER.
• If the individual data values are stored in a list (e.g., L1) select Data as the input method. Enter the data list name (say, L1) and a confidence level. The entry for Freq: should be 1, the default value. Scroll to Calculate and press ENTER.
• For paired data, enter the first value for each pair in list L1 and the second value for each pair in list L2. With a clean home screen use the keystrokes 2nd 1 – 2nd 2 STO 2nd 3 ENTER to create and store the differences in list L3. Then proceed as above, using L3 as the data list.
• If summary statistics are already known, select Stats as the input method. Enter values for the sample mean, standard deviation, sample size, and a confidence level. Scroll to Calculate and press ENTER.
Copyright 2014 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience.
- eBook - PDF
- Sally Caldwell(Author)
- 2012(Publication Date)
- Cengage Learning EMEA(Publisher)
Learning to navigate your way through various approaches to the same type of question, different systems of symbolic notation, or encounters with personal preferences can provide an added boost to your overall level of statistical understanding.
Key Terms
confidence interval for the mean
confidence interval for a proportion
estimate of the standard error of the mean
family of t distributions
level of confidence
margin of error
Chapter Problems
Fill in the blanks, calculate the requested values, or otherwise supply the correct answer.
General Thought Questions
1. A confidence interval for the mean is calculated by adding and subtracting a value to and from the sample ____.
2. The purpose of constructing a confidence interval for the mean is to ____ the true value of the population mean, based upon the mean of a ____.
3. A confidence interval for the mean is an interval within which you believe the ____ of the population is located.
4. As the level of confidence increases, the precision of your estimate ____.
5. There is a(n) ____ relationship between level of confidence and precision of the estimate.
- eBook - ePub
- Daniel B Wright, Kamala London(Authors)
- 2009(Publication Date)
- SAGE Publications Ltd(Publisher)
Usually there would be enough information (i.e., the mean, standard deviation and sample size) so that the reader could calculate the confidence interval. During the past decade many journal editors have stressed that confidence intervals should always be reported. This has affected practice and will continue to do so (Cumming et al., 2007). Both the American Psychological Association (Wilkinson et al., 1999) and the British Psychological Society (Wright, 2003) have endorsed the use of confidence intervals. After reading this chapter you should have a conceptual grasp of why confidence intervals are used and be able to explain to other people what they mean when reported in newspapers. Further, you should be able to calculate the confidence interval for a mean, for the difference in means between two variables, for the difference in means between two groups, for the median, and for the difference in medians between two groups.
EXERCISES
5.1 Look in a newspaper to find (a) where a confidence interval is used and describe what it means, and (b) where an estimate is produced without a confidence interval, and describe what additional information the confidence interval would have provided.
5.2 For the coffee preference example from earlier in this chapter, calculate the 99% confidence interval for the difference between the means. What conclusions do you make? Are the conclusions different from those made with the 95% confidence interval, and if so why?
5.3 Thirty-six female undergraduates were given a 15-minute math test in groups of three (Inzlicht & Ben-Zeev, 2000). Half were in a group with two other females. Their mean was 70% correct. Half took the test in a room with two male confederates. Their mean was 55% correct. For both the standard deviation was about 20%. Calculate the 95% confidence intervals for these two groups, and for the difference between the two
- eBook - ePub
Statistical Inference
A Short Course
- Michael J. Panik(Author)
- 2012(Publication Date)
- Wiley(Publisher)
lies between −1.6530 and 6.2530.
Example 11.4 [based on Equation (11.7)]
Let us take random samples of size $n_X = 15$ and $n_Y = 20$ from independent normal X and Y populations, respectively. From these samples, we calculate […]. Then […] and, from Equation (11.5), […]. For α = 0.05, […] and thus, from Equation (11.7), our 95% “small-sample” confidence limits for $\mu_X - \mu_Y$ appear as […], or −3 ± 8.9388. Hence a 95% confidence interval for $\mu_X - \mu_Y$ is (−11.9388, 5.9388); that is, we may be 95% confident that the difference $\mu_X - \mu_Y$ lies between −11.94 and 5.94.
Example 11.5 [based on Equation (11.10)]
Suppose we extract random samples from independent normal X and Y populations, respectively:
X Sample Values: 10, 6, 13, 5, 11, 10, 14
Y Sample Values: 13, 14, 9, 3, 1, 10, 4, 8, 8
Our objective is to determine a 95% confidence interval for $\mu_X - \mu_Y$. Here […]. Then […]. With α = 0.05 and two obviously “small samples,” […] where, from Equation (11.9), […]. Then from Equation (11.10), a 95% confidence interval for $\mu_X - \mu_Y$ is […], or 2.08 ± 4.243. Hence we may be 95% confident that $\mu_X - \mu_Y$ lies within the interval (−2.163, 6.323). How precisely have we estimated $\mu_X - \mu_Y$? We are within ±4.243 units of $\mu_X - \mu_Y$ with 95% reliability. Alternatively stated, we may be 95% confident that $\bar{X} - \bar{Y}$ will not differ from $\mu_X - \mu_Y$ by more than ±4.243 units.
11.2 Confidence Intervals for the Difference of Means When Sampling from Two Dependent Populations: Paired Comparisons
The inferential techniques developed in the preceding section were based upon random samples taken from two “independent” normal populations. What approach should be taken when the samples are either intrinsically or purposefully designed to be “dependent”? A common source of dependence is when the samples from the two populations are paired
- eBook - PDF
- Prem S. Mann(Author)
- 2020(Publication Date)
- Wiley(Publisher)
Construct a 99% confidence interval for the mean $\mu_d$ of the population of paired differences, where a paired difference is equal to Zeke’s estimate minus Elmer’s estimate.
b. Test at a 5% significance level whether the mean $\mu_d$ of the population of paired differences is different from zero. Assume that the population of paired differences is (approximately) normally distributed.
7. A sample of 500 male registered voters showed that 57% of them voted in the last presidential election. Another sample of 400 female registered voters showed that 55% of them voted in the same election.
a. Construct a 97% confidence interval for the difference between the proportions of all male and all female registered voters who voted in the last presidential election.
b. Test at a 1% significance level whether the proportion of all male voters who voted in the last presidential election is different from that of all female voters.
Mini-Projects
Mini-Project 10.1
Suppose that a new cold-prevention drug was tested in a randomized, placebo-controlled, double-blind experiment during the month of January. One thousand healthy adults were randomly divided into two groups of 500 each: a treatment group and a control group. The treatment group was given the new drug, and the control group received a placebo. During the month, 40 people in the treatment group and 120 people in the control group caught a cold. Explain how to construct a 95% confidence interval for the difference between the relevant population proportions. Also describe an appropriate hypothesis test, using the given data, to evaluate the effectiveness of this new drug for cold prevention. Find a similar article in a journal of medicine, psychology, or other field that lends itself to confidence intervals and hypothesis tests for differences in two means or proportions. First explain how to make the confidence intervals and hypothesis tests; then do so using the data given in the article.
- eBook - PDF
- Ned Freed, Stacey Jones, Timothy Bergquist(Authors)
- 2013(Publication Date)
- Wiley(Publisher)
■ In approximately 95.5% of the cases, the actual population mean difference will be within two standard deviations of the sample mean difference.
■ And so forth.
Building the Interval
Translated into our standard interval format, an interval estimate of the difference between two population means has the form
$$(\bar{x}_1 - \bar{x}_2) \pm z\,\sigma_{\bar{x}_1 - \bar{x}_2}$$
where
$\bar{x}_1 - \bar{x}_2$ = the difference in sample means
$z$ = z-score from the standard normal distribution for any given level of confidence
$\sigma_{\bar{x}_1 - \bar{x}_2}$ = standard deviation (standard error) of the sampling distribution of the sample mean difference
FIGURE 8.3 Sampling Distribution of the Sample Mean Difference. The sampling distribution will be approximately normal and centered on the difference between the population means.
Since the standard deviation (standard error) of the sampling distribution can be computed as
$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
where
$\sigma_1$ = standard deviation of population 1, $n_1$ = size of the sample from population 1
$\sigma_2$ = standard deviation of population 2, $n_2$ = size of the sample from population 2
we can show the interval as
$$(\bar{x}_1 - \bar{x}_2) \pm z \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
If the population standard deviations are unknown (which, in practice, is almost always the case), sample standard deviations $s_1$ and $s_2$ can be directly substituted for $\sigma_1$ and $\sigma_2$, so long as sample sizes are “large.” In this case, large means that both sample sizes are greater than 30. Although, technically, the substitution of $s$ for $\sigma$ would call for using the t distribution to build the interval, we can use the normal z-score in large sample cases since, as we saw in Chapter 7, t and z start to match pretty closely as sample size (and so, degrees of freedom) increases.
- eBook - PDF
Business Statistics
For Contemporary Decision Making
- Ken Black(Author)
- 2020(Publication Date)
- Wiley(Publisher)
The following diagram shows the critical values, the rejection regions, and the observed value for this problem.
Confidence Intervals
Sometimes in business analytics the investigator wants to estimate the difference in two population proportions. For example, what is the difference, if any, in the population proportions of workers in the Midwest who favor union membership and workers in the South who favor union membership? In studying two different suppliers of the same part, a large manufacturing company might want to estimate the difference between suppliers in the proportion of parts that meet specifications. These and other situations requiring the estimation of the difference in two population proportions can be solved by using confidence intervals. The formula for constructing confidence intervals to estimate the difference in two population proportions is a modified version of Formula 10.9. Formula 10.9 for two proportions requires knowledge of each of the population proportions. Because we are attempting to estimate the difference in these two proportions, we obviously do not know their values. To overcome this lack of knowledge in constructing a confidence interval formula, we substitute the sample proportions in place of the population proportions and use these sample proportions in the estimate, as follows.
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \frac{\hat{p}_2 \cdot \hat{q}_2}{n_2}}}$$
Solving this equation for $p_1 - p_2$ produces the formula for constructing confidence intervals for $p_1 - p_2$.
- eBook - PDF
- D.G. Rees(Author)
- 2018(Publication Date)
- Chapman and Hall/CRC(Publisher)
1. The sample size, n: the larger the sample size, the smaller the error term and the smaller the width of the interval (other things being equal).
2. The variability of height (as measured by the standard deviation): the more variable the height, the larger the standard deviation, the greater the error term, and the greater the width of the interval.
3. The level of confidence we wish to have that the population mean height does in fact lie within the specified interval: the greater the confidence, the greater the error term and the greater the width of the interval.
This third factor, confidence, is such an important concept that we will devote the next section to it.
Note: The three statements above are quoted without proof. I hope that you can at least accept them as being ‘intuitively reasonable’.
9.2 95% Confidence Intervals
For the student height example, the population mean height, μ, has a fixed numerical value at any given time. This value is unknown to us, but by taking a random sample of 27 heights, we want to specify an interval within which we are reasonably confident that this fixed unknown value lies. Suppose we decide that nothing less than 100% confidence will suffice, implying absolute certainty. Unfortunately, theory indicates that the 100% confidence interval is so wide that it is useless for all practical purposes. Instead, statisticians conventionally choose a 95% confidence level and calculate a 95% confidence interval for the population mean, μ, using formulae we shall introduce in the next sections. For the moment, it is important for you to understand what a confidence level of 95% means. It means that on 95% of occasions when such intervals are calculated the population mean will actually fall inside the interval we have calculated from the sample data. On the other 5% of occasions, it will fall outside the interval.
- eBook - PDF
Business Statistics
For Contemporary Decision Making
- Ken Black(Author)
- 2023(Publication Date)
- Wiley(Publisher)
t formula to test the difference in means
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$$df = \frac{\left[\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right]^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}}$$
Confidence interval for estimating the difference in two independent means, with population variances unknown but assumed to be equal (assume also that the two populations are normally distributed)
$$(\bar{x}_1 - \bar{x}_2) \pm t \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad df = n_1 + n_2 - 2$$
t test for the difference in two related samples (the differences are normally distributed in the population)
$$t = \frac{\bar{d} - D}{\frac{s_d}{\sqrt{n}}}, \qquad df = n - 1$$
Formulas for $\bar{d}$ and $s_d$
$$\bar{d} = \frac{\Sigma d}{n}, \qquad s_d = \sqrt{\frac{\Sigma (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\Sigma d^2 - \frac{(\Sigma d)^2}{n}}{n - 1}}$$
Confidence interval formula for estimating the difference in related samples (the differences are normally distributed in the population)
$$\bar{d} - t \frac{s_d}{\sqrt{n}} \le D \le \bar{d} + t \frac{s_d}{\sqrt{n}}, \qquad df = n - 1$$
z formula for testing the difference in population proportions
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{(\bar{p} \cdot \bar{q}) \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$ and $\bar{q} = 1 - \bar{p}$
Confidence interval to estimate $p_1 - p_2$
$$(\hat{p}_1 - \hat{p}_2) - z \sqrt{\frac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \frac{\hat{p}_2 \cdot \hat{q}_2}{n_2}} \le p_1 - p_2 \le (\hat{p}_1 - \hat{p}_2) + z \sqrt{\frac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \frac{\hat{p}_2 \cdot \hat{q}_2}{n_2}}$$
F test for two population variances (assume the two populations are normally distributed)
$$F = \frac{s_1^2}{s_2^2}, \qquad df_{\text{numerator}} = v_1 = n_1 - 1, \qquad df_{\text{denominator}} = v_2 = n_2 - 1$$
Formula for determining the critical value for the lower-tail F
$$F_{1-\alpha,\, v_2,\, v_1} = \frac{1}{F_{\alpha,\, v_1,\, v_2}}$$
Supplementary Problems
Calculating the Statistics
10.45.
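As an illustration of the formula summary in the excerpt above, the following is a minimal Python sketch of three of its pieces: the degrees of freedom for the unequal-variance t formula, the pooled-variance confidence interval, and the confidence interval for $p_1 - p_2$. The function names and the summary statistics in the usage lines are hypothetical and do not come from the textbook. The critical t value is passed in as an argument (taken from a t table), since the Python standard library provides normal quantiles but not t quantiles. The proportion figures reuse the 40-of-500 and 120-of-500 counts quoted in the Mini-Project 10.1 excerpt on this page.

```python
import math
from statistics import NormalDist


def welch_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom for the t formula with unequal variances."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def pooled_ci(xbar1, xbar2, s1_sq, s2_sq, n1, n2, t_crit):
    """Interval for mu1 - mu2 assuming equal population variances.

    t_crit is the two-sided critical t value for df = n1 + n2 - 2,
    looked up by the caller (assumption: taken from a t table).
    """
    pooled_sd = math.sqrt(
        (s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2)
    )
    margin = t_crit * pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


def proportion_diff_ci(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Large-sample z interval for p1 - p2 using sample proportions."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(
        p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2
    )
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se


# Hypothetical summary statistics, chosen only for illustration.
print(welch_df(16.0, 10, 9.0, 12))
# 2.086 is the two-sided 95% critical t value for 20 degrees of freedom.
print(pooled_ci(24.0, 21.0, 16.0, 9.0, 10, 12, t_crit=2.086))
# Sample proportions 40/500 and 120/500 from the mini-project excerpt.
print(proportion_diff_ci(40 / 500, 500, 120 / 500, 500))
```

Passing the critical value in explicitly keeps the sketch dependency-free; with a statistics package available, the t quantile could be computed instead of looked up.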
- Ken Black, Tiffany Bayley, Ignacio Castillo(Authors)
- 2023(Publication Date)
- Wiley(Publisher)
4. As sample sizes increase (keeping confidence level the same), the width of the confidence interval for the difference in two population means decreases.
10.1. Odd-Numbered Problems
10.1. a. z = −1.01, fail to reject b. −2.41 c. .1562
10.3. a. z = 5.48, reject b. 4.04 ≤ μ ≤ 10.02
10.5. −1.86 ≤ μ ≤ −0.54
10.7. z = −2.32, fail to reject
10.9. z = 2.27, reject
Concept Check 10.2
1. Use a z formula when population variances are known, and a t formula when population variances are unknown.
2. No, because an assumption underlying this technique is that the measurement or characteristic being studied is normally distributed for both populations.
3. Formula 10.3 assumes the population variances are equal. Formula 10.4 assumes the population variances are not equal.
4. When the population variances are unknown and assumed to be equal, increasing the sample sizes decreases the margin of error; hence, the width of the confidence interval decreases.
10.2. Odd-Numbered Problems
10.11. t = −1.05, fail to reject
10.13. t = 4.64, reject
10.15. −10,021.81 ≤ μ 1 − μ 2 ≤ 4,021.81
10.17. t = 2.06, reject
10.19. t = 5.15, reject, 2,374.79 ≤ μ 1 − μ 2 ≤ 5,607.03
Concept Check 10.3
1. Assessing the benefits of a new drug by evaluating the same patients before and after taking the drug. Measuring crop growth with nonorganic fertilizer on half a plot of land, the other half using organic fertilizer.
2. Data collected from dependent samples are related: measurements are taken “before” and “after” an experiment on the same subjects or data collected are from a matched pair of samples. Data collected from independent samples are unrelated.
3. The key assumption regarding the distribution of the differences of the two populations is that the differences are normally distributed or n ≥ 30.
4. Before the promotion is aired, soft drink sales are recorded daily for a period of time. After the promotion, soft drink sales are recorded daily for the same period of time.
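The answer that interval width shrinks as sample sizes grow can be checked numerically. The sketch below assumes the known-variance case, where a z formula applies, and uses hypothetical means and variances that do not come from the textbook. The function name `z_interval` is likewise an illustrative choice.

```python
import math
from statistics import NormalDist


def z_interval(xbar1, xbar2, var1, var2, n1, n2, confidence=0.95):
    """z-based interval for mu1 - mu2 with known population variances."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


# Hypothetical inputs: sample means 50 and 47, known variances 36 and 49.
# Only the common sample size n changes between runs.
for n in (25, 100, 400):
    lower, upper = z_interval(50.0, 47.0, 36.0, 49.0, n, n)
    print(n, round(upper - lower, 3))
```

Because the margin of error is proportional to $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$, quadrupling both sample sizes halves the width of the interval at a fixed confidence level.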
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.











