Errors in Hypothesis Testing
Errors in hypothesis testing refer to the potential mistakes that can occur when making conclusions about a population based on sample data. Type I error occurs when a true null hypothesis is rejected, while Type II error occurs when a false null hypothesis is not rejected. These errors are important to consider when interpreting the results of hypothesis tests.
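The two error types can be made concrete with a small simulation. The sketch below (illustrative only; the test setup and all parameter values are invented, not taken from any excerpt on this page) runs many one-sample z-tests: when the null hypothesis is true, rejections are Type I errors; when it is false, failures to reject are Type II errors.

```python
# Illustrative sketch: estimating Type I and Type II error rates by
# simulation with a two-tailed one-sample z-test (all values invented).
import random
from statistics import NormalDist

random.seed(42)

def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0; returns True if H0 is rejected."""
    n = len(sample)
    sample_mean = sum(sample) / n
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=25, trials=10_000):
    """Fraction of simulated samples in which H0: mu = mu0 is rejected."""
    rejections = sum(
        z_test_rejects([random.gauss(true_mu, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(trials)
    )
    return rejections / trials

type_I = rejection_rate(true_mu=0.0)       # H0 true: rejections are Type I errors
type_II = 1 - rejection_rate(true_mu=0.5)  # H0 false: non-rejections are Type II errors
print(f"Type I rate  = {type_I:.3f} (should be near alpha = 0.05)")
print(f"Type II rate = {type_II:.3f}")
```

With the null true, the rejection rate settles near the chosen alpha; with a real effect present, the Type II rate depends on the effect size and sample size.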
Written by Perlego with AI-assistance
11 Key excerpts on "Errors in Hypothesis Testing"
Dosage Form Design Parameters
Volume II
- (Author)
- 2018 (Publication Date)
- Academic Press (Publisher)
Chlorophytum borivilianum (Musli) and supplies them to a pharmaceutical company.

7.7.1 Type I Error
Deciding that the fertilizer is effective when it is not. This will cause the farmer to spend more money on fertilizer and not be able to sell any more musli. This is also known as a “false positive”: the error of rejecting a null hypothesis when it is actually true. In other words, this is the error of accepting an alternative hypothesis (the real hypothesis of interest) when the results can be attributed to chance. It occurs when we observe a difference when in truth there is none (or, more specifically, no statistically significant difference).

7.7.2 Type II Error
Deciding not to use the fertilizer when it is really effective. This will cause the farmer to lose out on potential sales. This is also known as a “false negative”: the error of not rejecting a null hypothesis when the alternative hypothesis is the true state of nature. In other words, this is the error of failing to accept an alternative hypothesis when you do not have adequate power. A sensible statistical procedure should result in a small chance of making both Type I and Type II errors. Table 7.2 shows the various types of errors and their possible reasons.

Table 7.2 Various Types of Errors and Their Possible Reasons

| Type of error | Possible reason for emergence of error | References |
| --- | --- | --- |
| Population specification error | Selection of incorrect population for measurement of data | Heckman (1979) |
| Selection error | Occurs when nonprobability methods select samples | Shah and Samworth (2013) |
| Sampling error | Occurs when samples do not completely represent the population | Staples et al. (2004) |

- Gerry P. Quinn, Michael J. Keough (Authors)
- 2002 (Publication Date)
- Cambridge University Press (Publisher)
What about the two errors?

• A Type I error is when we mistakenly reject a correct H0 (e.g. when we conclude from our sample and a t test that the population parameter is not equal to zero when in fact the population parameter does equal zero) and is denoted α. A Type I error can only occur when H0 is true.
• A Type II error is when we mistakenly accept an incorrect H0 (e.g. when we conclude from our sample and a t test that the population parameter equals zero when in fact the population parameter is different from zero). Type II error rates are denoted by β and can only occur when the H0 is false.

[Figure 3.2. Statistical decisions and errors when testing null hypotheses: a 2 × 2 table crossing the statistical conclusion (reject H0 / retain H0) with the population situation (effect / no effect).]

Both errors are the result of chance. Our random sample(s) may provide misleading information about the population(s), especially if the sample sizes are small. For example, two populations may have the same mean value but our sample from one population may, by chance, contain all large values and our sample from the other population may, by chance, contain all small values, resulting in a statistically significant difference between means. Such a Type I error is possible even if H0 (µ1 = µ2) is true, it's just unlikely. Keep in mind the frequency interpretation of P values also applies to the interpretation of error rates. The Type I and Type II error probabilities do not necessarily apply to our specific statistical test but represent the long-run probability of errors if we repeatedly sampled from the same population(s) and did the test many times.
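The long-run interpretation described above can be demonstrated directly. The sketch below (my illustration, not from the excerpt; population parameters and sample sizes are invented) repeatedly draws two samples from the same population and tests for a difference in means, so every "significant" result is a Type I error, and their long-run frequency sits near alpha.

```python
# Illustrative sketch: long-run Type I error frequency when H0 is true.
# Both samples always come from the same population, so any significant
# difference is a false alarm. Uses a large-sample normal approximation
# to the two-sample test.
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-tailed critical value

def two_sample_z(x, y):
    """Approximate two-sample test statistic (normal approximation, large n)."""
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se

TRIALS = 5_000
false_alarms = 0
for _ in range(TRIALS):
    # Both samples drawn from N(10, 2): H0 (mu1 = mu2) is true by construction.
    x = [random.gauss(10, 2) for _ in range(50)]
    y = [random.gauss(10, 2) for _ in range(50)]
    if abs(two_sample_z(x, y)) > Z_CRIT:
        false_alarms += 1  # a "significant" difference that is pure chance

print(f"Long-run Type I error rate: {false_alarms / TRIALS:.3f}")
```

The rate is a property of the repeated procedure, not of any single test, which is exactly the frequency interpretation the excerpt describes.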
- (Author)
- 2014 (Publication Date)
- Orange Apple (Publisher)
Chapter 8: Type I & Type II Errors and Uniformly Most Powerful Test

Type I & type II errors. In statistical hypothesis testing, there are two types of errors that can be made, or incorrect conclusions that can be drawn. If a null hypothesis is incorrectly rejected when it is in fact true, this is called a Type I error (also known as a false positive). A Type II error (also known as a false negative) occurs when a null hypothesis is not rejected despite being false. The Greek letter α is used to denote the probability of a Type I error, and the letter β is used to denote the probability of a Type II error.

Statistical error: Type I and Type II. Statisticians speak of two significant sorts of statistical error. The context is that there is a null hypothesis which corresponds to a presumed default state of nature, e.g., that an individual is free of disease, or that an accused is innocent. Corresponding to the null hypothesis is an alternative hypothesis which corresponds to the opposite situation, that is, that the individual has the disease, or that the accused is guilty. The goal is to determine accurately whether the null hypothesis can be discarded in favor of the alternative. A test of some sort is conducted and data are obtained. The result of the test may be negative (that is, it does not indicate disease or guilt), or it may be positive (that is, it may indicate disease or guilt). If the result of the test does not correspond with the actual state of nature, an error has occurred; if it does correspond, a correct decision has been made. There are two kinds of error, classified as Type I error and Type II error, depending upon which hypothesis has incorrectly been identified as the true state of nature.

- Lise DeShea, Larry E. Toothaker (Authors)
- 2015 (Publication Date)
- Chapman and Hall/CRC (Publisher)
This decision could have been a Type I error, which occurs when we reject the null hypothesis when actually the null hypothesis is true in the population. This question says “significantly,” which means a null hypothesis was rejected. The only kind of error possible when we reject a null hypothesis is a Type I error.

Table 9.1 Two Possible Realities, Two Possible Decisions

| Decision on a Given Hypothesis Test | True State: H0 is True | True State: H0 is False |
| --- | --- | --- |
| Reject H0 | Type I error | Correct decision |
| Retain H0 | Correct decision | Type II error |

It may be alarming to you to think that we could be making a mistake and drawing erroneous conclusions whenever we perform a hypothesis test. It is possible because we depend on probability to test hypotheses. The next section will begin our discussion of the probabilities of errors and correct decisions, including the ways that researchers try to limit the chances of errors and at the same time try to increase the chances of correct decisions.

Probability of a Type I Error

It may surprise you to learn that we already have talked about the probability of committing a Type I error in a hypothesis test. Let's take a look at an earlier figure, reproduced here as Figure 9.1. Figure 9.1 shows a standard normal distribution reflecting a reality in which the null hypothesis is true. To refresh your memory: we used this figure in the rat shipment example, where the null hypothesis said the population mean maze completion time was less than or equal to 33 seconds. We believed that we were sampling from a population of rats that would take longer than 33 seconds on average to complete the maze, so we had written an alternative hypothesis that said H1: µ > 33.

- Bruce M. King, Patrick J. Rosopa, Edward W. Minium (Authors)
- 2018 (Publication Date)
- Wiley (Publisher)
In the meantime, we urge you to include such measures in your reports. After all, the most important product of a statistical test is not whether one rejects the null hypothesis, which we all know is probably false anyway, but the magnitude of the difference between µhyp and µtrue. Grissom and Kim (2012) provide in-depth coverage of effect size measures for various statistical procedures, including those involving quantitative and qualitative outcomes.

14.3 Errors in Hypothesis Testing

There are, so to speak, two “states of nature”: either the null hypothesis, H0, is true or it is false. Similarly, there are two possible decisions: we can reject the null hypothesis or we can retain it. Taken in combination, there are four possibilities. They are diagramed in Table 14.1. If the null hypothesis is true and we retain it, or if it is false and we reject it, we have made a correct decision. Otherwise, we have made an error. Notice that there are two kinds of errors. They are called Type I error and Type II error. If we reject H0 and in fact it is true, we have made a Type I error (rejection of a true null hypothesis). Consider the picture of the sampling distribution shown in Figure 14.1 (for simplicity's sake, we consider the sampling distribution of the sample mean rather than t). It illustrates a situation that might arise if we were testing the hypothesis H0: µ = 150 and using a two-tailed test at the 5% significance level. Suppose our obtained sample mean is 146, which leads us to reject H0. The logic in rejecting H0 is that if the null hypothesis is true, a sample mean this deviant would occur less than 5% of the time. Therefore, it seems more reasonable for us to believe that this sample mean came from a population with a mean different from that specified in H0.

- Robert M. Leekley (Author)
- 2010 (Publication Date)
- CRC Press (Publisher)
And the rules of evidence are intended to assure a low probability that you will reject the null hypothesis when it is true. This sort of error (rejecting the null hypothesis when it is true) is called a type I error, and its probability is α.

Figure 8.1 Decision making with incomplete information.

| Your decision | The Truth: Innocent | The Truth: Guilty |
| --- | --- | --- |
| H0: Innocent | Correct (1 − α) | Type II error (β) |
| Ha: Guilty | Type I error (α) | Correct (1 − β) |

We want α to be small. Still, we need to recognize that the harder we make it to convict an innocent person, the harder we make it to convict a guilty person too. This sort of error (failing to reject the null hypothesis when it is false) is called a type II error, and its probability is β. And, for a given amount of information, the smaller we make α, the larger we make β. Notice that we either reject H0 or we fail to reject H0. If we fail to reject H0, it could be because H0 is true. It could also be because there was simply not enough evidence. You may not believe that the defendant is actually innocent, but if there is not enough evidence to establish his guilt beyond a reasonable doubt, you fail to convict.

8.2 A Two-Tailed Test for the Population Proportion

8.2.1 The Null and Alternative Hypotheses

Suppose someone claims that 12% of all college students are left-handed. This is a claim about a population parameter, πLH, which may or may not be true. We can test it in much the same way we tested the claim of innocence in the trial. A claim that we are “testing” must be the null hypothesis. The null hypothesis is H0: πLH = 0.12. The alternative hypothesis is what we will believe if we succeed in rejecting the null hypothesis. A general, “two-tailed” alternative would be simply that the null hypothesis is wrong. That is, Ha: πLH ≠ 0.12. The first step in testing hypotheses is always to write down your null and alternative hypotheses.
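The two-tailed test for a proportion set up above can be sketched numerically. The sample counts below (90 left-handers out of 600 students) are invented for illustration; the excerpt specifies only the hypotheses, and this uses the standard normal approximation rather than any method the excerpt prescribes.

```python
# Minimal sketch of a two-tailed z-test for a population proportion,
# H0: pi = 0.12 vs Ha: pi != 0.12, via the normal approximation.
# The sample numbers (90 of 600) are hypothetical.
from statistics import NormalDist

def proportion_z_test(successes, n, pi0):
    """Return (z, two-tailed p-value) for H0: pi = pi0."""
    p_hat = successes / n
    se = (pi0 * (1 - pi0) / n) ** 0.5       # standard error under H0
    z = (p_hat - pi0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails, since Ha is two-sided
    return z, p_value

z, p = proportion_z_test(successes=90, n=600, pi0=0.12)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
# Reject H0: pi_LH = 0.12 at alpha = .05 only if p < .05.
```

Because the alternative is two-tailed, deviations in either direction (too many or too few left-handers) count against the null.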
Statistical Misconceptions
Classic Edition
- Schuyler Huck, Schuyler W. Huck (Authors)
- 2015 (Publication Date)
- Routledge (Publisher)
Hypothesis Testing

A common misconception among beginning researchers is the notion that if you reject a null hypothesis you have “proven” your research hypothesis.* Unfortunately, many researchers have the misconception that alpha and beta are inversely related, and believe that if they set alpha at a low level, beta would automatically become high.†

8.1 Alpha and Type I Error Risk
The Misconception
Alpha, the level of significance, defines the probability of a Type I error. For example, if α is set equal to .05, there will then necessarily be a 5% chance that a true null hypothesis will be rejected.

Evidence That This Misconception Exists*
The first and second of these statements come from online documents called Testing Hypotheses and Statistics Glossary, respectively. The third statement comes from a college textbook dealing with business statistics. (Note the word always in the first and third passages.)

- The probability of type I error is always equal to the level of significance that is used as the standard for rejecting the null hypothesis; it is designated by α and thus α also designates the level of significance.
- This probability of a type I error can be precisely computed as P(type I error) = significance level = α.
- The maximum probability of Type I error is designated by the Greek α (alpha). It is always equal to the levels of significance used in testing the null hypothesis.
Why This Misconception Is Dangerous
In most scientific fields, researchers are taught to conduct their empirical studies in such a way that Type I errors have a low probability of occurrence.†
- Schuyler W. Huck (Author)
- 2008 (Publication Date)
- Routledge (Publisher)
There are two main reasons why the selected level of significance can be misleading as to the chance of a Type I error. One of these concerns underlying assumptions. The other deals with the number of tests being conducted.

If one or more of the assumptions underlying a statistical test are violated, the actual probability of a Type I error can be substantially higher or lower than the nominal level of significance. For example, if a t-test comparing the means from two unequally sized samples is conducted with α = .05, and if the assumption of equal population variances does not hold true, the actual probability of a Type I error can be greater than .35. That's seven times higher than the nominal level of significance! The Type I error rate can also be far smaller than .05.†

A statistical test is said to be robust if it functions as it should, even if its underlying assumptions are violated. However, certain statistical tests are robust only in specific situations (e.g., when sample sizes are large and equal), while other statistical tests are never robust if their assumptions are violated.*

Even if a statistical test's underlying assumptions are valid, it still is possible for the level of significance to understate Type I error risk. This will happen if the test is applied more than once. Within the full set of tests being conducted, the probability of a Type I error occurring in one or more of the tests will exceed α, even if each test is conducted with a level of significance set equal to α.

The phrase “inflated Type I error rate” describes this situation. A coin-flipping analogy may help to illustrate why the chances of a Type I error get elevated over α in the situation where multiple tests are conducted. Let's consider flipping a fair coin, and let's further consider that it's bad to end up with the coin landing on its tails side. If we flip the coin just once, the probability of a bad result is .50. But what if we flip the coin twice?
Now, the probability of getting tails (on the first flip, on the second flip, or on both flips) is .75. If the coin is flipped 10 times, the probability of getting at least 1 tail increases to .999.
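The coin-flip arithmetic above generalizes to the familywise error rate for k independent tests: the chance of at least one bad outcome is 1 − (1 − α)^k. A quick sketch (a standard formula, not code from the excerpt):

```python
# Sketch of the "inflated Type I error rate" arithmetic: with k independent
# tests each run at level alpha, the probability of at least one Type I
# error is the complement of "no error in any test".
def familywise_error_rate(alpha, k):
    """Probability of at least one Type I error across k independent tests."""
    return 1 - (1 - alpha) ** k

# The coin analogy uses alpha = .50 (a tail on one flip of a fair coin):
print(familywise_error_rate(0.50, 2))             # 0.75, as in the excerpt
print(round(familywise_error_rate(0.50, 10), 3))  # 0.999, as in the excerpt
# With the conventional alpha = .05, ten tests already inflate the risk well past .05:
print(round(familywise_error_rate(0.05, 10), 3))
```

This is why each individual test can honestly run at α = .05 while the study as a whole carries a much larger chance of at least one false positive.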
- Gregory Francis(Author)
- 2019 (Publication Date)
- Cambridge University Press (Publisher)
Remember, the t value we use in a hypothesis test is just the d′ of the sampling distributions. This discussion is mostly for pedagogical purposes. We cannot, for example, adjust the criterion to reduce bias in hypothesis testing because we do not know the true effect size for the alternative hypothesis. We should note, however, that Signal Detection Theory emphasizes that the choice of the criterion always involves a trade-off between hits and false alarms. For example, recent calls (Benjamin et al., 2018) to reduce the desired Type I error from the typical 0.05 to 0.005 should have the benefit of decreasing Type I errors (false alarms), but at the cost of decreasing power (hits).

Take away message: In terms of Signal Detection Theory, hypothesis testing tends to be biased against the alternative hypothesis. This bias seems appropriate given that the goal of hypothesis testing is to fix the rate of Type I errors.

[Figure 14: The bias of hypothesis testing as a function of power for three Type I error rates (0.01, 0.05, 0.10). In most cases a hypothesis test is biased against the alternative hypothesis.]

15 Conclusions

Hypothesis testing is a common method of analyzing experimental data. When everything works well, it is easy to understand the appeal. Being able to control the Type I error rate is a good thing, and the methods are (generally) easy to apply. However, we have seen that deviating just a bit from the textbook examples can cause serious problems with hypothesis-testing approaches. Sampling, analysis strategies, reporting, and measurement issues can cause hypothesis testing to produce much higher Type I error rates than might be expected. At the same time, some standard hypothesis-testing strategies produce Type I error rates much lower than what is intended, thereby making it difficult for scientists to convince their peers about a new discovery.
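The alpha/power trade-off mentioned above (0.05 versus 0.005) can be sketched with a power calculation. This is my illustration, not the excerpt's: it assumes a two-sided one-sample z-test, and the effect size d = 0.5 and sample size n = 30 are invented values.

```python
# Hedged sketch of the alpha/power trade-off: lowering alpha from 0.05 to
# 0.005 reduces false alarms but also reduces power (hits). Assumes a
# two-sided one-sample z-test; d and n below are illustrative only.
from statistics import NormalDist

ND = NormalDist()

def power_two_sided_z(alpha, d, n):
    """Approximate power of a two-sided z-test when the true effect size is d."""
    z_crit = ND.inv_cdf(1 - alpha / 2)
    shift = d * n ** 0.5  # mean of the test statistic under the alternative
    # Probability the statistic lands in either rejection region:
    return (1 - ND.cdf(z_crit - shift)) + ND.cdf(-z_crit - shift)

for alpha in (0.05, 0.005):
    print(f"alpha = {alpha}: power = {power_two_sided_z(alpha, d=0.5, n=30):.3f}")
```

Moving the criterion outward (smaller alpha) pushes both false alarms and hits down, which is exactly the Signal Detection Theory trade-off the excerpt describes.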
- Phillip I. Good, James W. Hardin(Authors)
- 2006 (Publication Date)
- Wiley-Interscience (Publisher)
The chief errors in practice lie in failing to report all of the following:

• Whether we used a one-tailed or two-tailed test and why
• Whether the categories are ordered or unordered
• Which statistic was employed and why

Chapter 10 contains a discussion of a final, not inconsiderable, source of error: the neglect of confounding variables that may be responsible for creating an illusory association or concealing an association that actually exists.

INFERIOR TESTS

Violation of assumptions can affect not only the significance level of a test but the power of the test as well; see Tukey and MacLaughlin (1963) and Box and Tiao (1964). For example, although the significance level of the t-test is robust to departures from normality, the power of the t-test is not. Thus the two-sample permutation test may always be preferable. If blocking (including matched pairs) was used in the original design, then the same division into blocks should be employed in the analysis. Confounding factors such as sex, race, and diabetic condition can easily mask the effect we hoped to measure through the comparison of two samples. Similarly, an overall risk factor can be totally misleading (Gigerenzer, 2002). Blocking reduces the differences between subjects so that differences between treatment groups stand out, if, that is, the appropriate analysis is used. Thus paired data should always be analyzed with the paired t-test or its permutation equivalent, not with the group t-test.

(Footnote: Examples include StatXact® from http://www.cytel.com, RT from www.west-inc.com, NPC Test from www.methodologica.it, and R (freeware) from http://www.r-project.org/.)
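The point about paired versus group analysis can be illustrated with a toy computation. The data below are invented: six subjects with large between-subject spread but a small, consistent within-subject improvement. This only computes the two t statistics (no p-values), which is enough to show why ignoring the pairing buries the effect.

```python
# Illustrative sketch (invented data): paired vs group t statistics.
# Pairing removes between-subject variability, so the paired statistic
# is far larger for the same treatment effect.
from statistics import mean, stdev

# Each subject measured before and after treatment.
before = [10.1, 14.8, 20.3, 25.9, 31.2, 35.7]
after  = [11.0, 15.9, 21.1, 27.0, 32.3, 36.5]
n = len(before)

# Group (two-sample, pooled) t statistic, wrongly ignoring the pairing:
sp = ((stdev(before) ** 2 + stdev(after) ** 2) / 2) ** 0.5  # pooled SD
t_group = (mean(after) - mean(before)) / (sp * (2 / n) ** 0.5)

# Paired t statistic on the within-subject differences:
diffs = [a - b for a, b in zip(after, before)]
t_paired = mean(diffs) / (stdev(diffs) / n ** 0.5)

print(f"group t = {t_group:.2f}, paired t = {t_paired:.2f}")
```

The group statistic is swamped by the subject-to-subject spread; the paired statistic isolates the consistent within-subject change, which is the excerpt's point about blocking.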
- Barry H. Cohen(Author)
- 2013 (Publication Date)
- Wiley (Publisher)
6. The risk of rejecting the null hypothesis is that it may actually be true (even though your results look very good), in which case you are making a Type I error. The probability of making a Type I error when the null hypothesis is true is determined by the alpha level that you use (usually .05).

7. If you make alpha smaller to reduce the proportion of Type I errors, you will increase the proportion of Type II errors, which occur whenever the null hypothesis is not true, but you fail to reject the null hypothesis because you are being cautious. The probability of making a Type II error is not easily determined. Type II errors will be discussed thoroughly in Chapter 8.

8. One-tailed hypothesis tests make it easier to reach statistical significance in the predicted tail but rule out the possibility of testing results in the other tail. Because of the ever-present possibility of unexpected results, the two-tailed test is more generally accepted.

Exercises
*1.
2. a. If the calculated z for an experiment equals 1.35, what is the corresponding one-tailed p value? The two-tailed p value?
   b. Find the one- and two-tailed p values corresponding to z = −.7.
   c. Find one- and two-tailed p values for z = 2.2.
3. a. If alpha were set to the unusual value of .08, what would be the magnitude of the critical z for a one-tailed test? What would be the values for a two-tailed test?
   b. Find the one- and two-tailed critical z values for α = .03.
   c. Find one- and two-tailed z values for α = .007.
4. a. If the one-tailed p value for an experiment were .123, what would the value of z have to be?
   b. If the two-tailed p value for an experiment were .4532, what would the value of z have to be?
   a. As alpha is made smaller (e.g., .01 instead of .05), what happens to the size of the critical z?
   b. As the calculated z gets larger, what happens to the corresponding p value?
*5. An English professor suspects that her current class of 36 students is unusually good at verbal skills. She looks up the verbal SAT score for each student and is pleased to find that the mean for the class is 540. Assuming that the general population of students has a mean verbal SAT score of 500 with a standard deviation of 100, what is the two-tailed p
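Conversions like those in the exercises above can be checked numerically with the standard normal distribution. A small sketch (my own helper functions; the value z = 1.35 comes from exercise 2a, the rest is generic):

```python
# Sketch: checking z <-> p conversions with the standard normal
# distribution from Python's standard library.
from statistics import NormalDist

ND = NormalDist()

def one_tailed_p(z):
    """Area in the single tail beyond |z|."""
    return 1 - ND.cdf(abs(z))

def two_tailed_p(z):
    """Area in both tails beyond |z|."""
    return 2 * one_tailed_p(z)

def critical_z(alpha, tails=1):
    """Critical z for a given alpha: alpha in one tail, or split across two."""
    return ND.inv_cdf(1 - alpha / tails)

z = 1.35
print(f"one-tailed p = {one_tailed_p(z):.4f}")
print(f"two-tailed p = {two_tailed_p(z):.4f}")
print(f"critical z (alpha = .05, two-tailed) = {critical_z(0.05, tails=2):.2f}")
```

The same functions handle the reverse direction: `ND.inv_cdf(1 - p)` recovers the z that produced a given one-tailed p.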
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.