Mathematics

Normal Distribution Hypothesis Test

A normal distribution hypothesis test is a statistical method used to determine whether a set of data is consistent with a normal distribution, or, more broadly, a hypothesis test that relies on the normal distribution of a sample statistic (such as the z-test for a mean). It compares the sample data to a theoretical normal distribution to assess how likely the observed result would be if the underlying assumption were true. Such tests are commonly used across many fields to check the validity of statistical assumptions and to draw inferences about population parameters.

Written by Perlego with AI-assistance

10 Key excerpts on "Normal Distribution Hypothesis Test"

  • Statistical Methods

    An Introduction to Basic Statistical Concepts and Analysis

    • Cheryl Ann Willard (Author)
    • 2020 (Publication Date)
    • Routledge (Publisher)
    HYPOTHESIS TESTING. Now that we have examined the properties of the normal distribution and you have an understanding of how probability works, let us look at how it is used in hypothesis testing. Hypothesis testing is the procedure used in inferential statistics to estimate population parameters based on sample data. The procedure involves the use of statistical tests to determine the likelihood of certain population outcomes. In this chapter, we will use the z-test, which requires that the population standard deviation (σ) be known. The material in this chapter provides the foundation for all other statistical tests that will be covered in this book. Thus, it would be a good idea to read through this chapter, work the problems, and then go over it again. This will give you a better grasp of the chapters to come.

    Hypothesis testing usually begins with a research question such as the following. Sample Research Question: Suppose it is known that scores on a standardized test of reading comprehension for fourth graders are normally distributed with μ = 70 and σ = 10. A researcher wants to know if a new reading technique has an effect on comprehension. A random sample of n = 25 fourth graders are taught the technique and then tested for reading comprehension. A sample mean of M = 75 is obtained. Does the sample mean (M) differ enough from the population mean (μ) to conclude that the reading technique made a difference in level of comprehension? Our sample mean is, obviously, larger than the population mean. However, we know that some variation of sample statistics is to be expected just because of sampling error. What we want to know further is whether our obtained sample mean is different enough from the population mean to conclude that this difference was due to the new reading technique and not just to random sampling error.
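
    A minimal sketch of the calculation this research question leads to, assuming a two-tailed z-test at the conventional 5% significance level (the excerpt itself does not yet state the alternative hypothesis or α); the numbers are the ones given above.

    ```python
    # Hedged sketch: z-test for the reading-comprehension example above.
    # Assumes a two-tailed alternative and alpha = 0.05, which the excerpt
    # does not state explicitly.
    from scipy.stats import norm

    mu, sigma = 70, 10      # known population parameters
    n, m = 25, 75           # sample size and sample mean from the excerpt

    se = sigma / n ** 0.5               # standard error of the mean
    z = (m - mu) / se                   # z = (75 - 70) / 2 = 2.5
    p_two_sided = 2 * (1 - norm.cdf(abs(z)))

    print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")   # z = 2.50, p ~ 0.0124
    ```
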
  • Fundamental Statistics for the Behavioral Sciences
    I did this deliberately to emphasize the point that the logic and the calculations behind a test are two separate issues. You now know quite a bit about how hypothesis tests are conducted, even if you may not have the slightest idea how to do the arithmetic. However, we now can use what you already know about the normal distribution to test some simple hypotheses. In the process we can deal with several fundamental issues that are more easily seen by use of a concrete example. An important use of the normal distribution is to test hypotheses, either about individual observations or about sample statistics such as the mean. Here we will deal with individual observations, leaving the question of testing sample statistics until later chapters. Note, however, that in the general case we test hypotheses about sample statistics such as the mean rather than hypotheses about individual observations. I am starting with an example of an individual observation because the explanation is somewhat clearer. Because we are dealing with only single observations, the sampling distribution invoked here will be the distribution of individual scores (rather than the distribution of means). The basic logic is the same, and we are using an example of individual scores only because it simplifies the explanation and is something with which you have had experience.

    I should point out that in fields such as neurology and occasionally clinical psychology testing using individual scores is reasonably common. For example, neurologists often use simple measurements, such as two-point sensitivity, to diagnose disorders. For example, someone might be classified as having a particular disorder if his or her two-point sensitivity or visual response time is significantly greater than that of normal responders. For a simple example assume that we are concerned with the rate at which people can tap their fingers.
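
    As a rough illustration of testing an individual observation against a known normal distribution, the hedged sketch below locates one score in the distribution of individual scores and reports how unusual it is. The tapping-rate mean, standard deviation, and observed score are hypothetical placeholders, not the book's values.

    ```python
    # Hedged sketch of the individual-observation test the excerpt describes:
    # locate one score in a known normal distribution and ask how unusual it is.
    # The tapping-rate parameters below are hypothetical, not the book's values.
    from scipy.stats import norm

    mu, sigma = 100, 15          # assumed population mean and SD of tapping rate
    observation = 70             # a single individual's score (hypothetical)

    z = (observation - mu) / sigma
    p_lower_tail = norm.cdf(z)   # probability of a score this low or lower

    print(f"z = {z:.2f}, lower-tail probability = {p_lower_tail:.4f}")
    ```
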
  • Statistics

    Unlocking the Power of Data

    • Robin H. Lock, Patti Frazer Lock, Kari Lock Morgan, Eric F. Lock, Dennis F. Lock (Authors)
    • 2021 (Publication Date)
    • Wiley (Publisher)
    Knowing just the mean and standard deviation of a normal distribution tells us what the entire distribution looks like. Normal Distribution: A normal distribution follows a bell-shaped curve. We use the two parameters mean, μ, and standard deviation, σ, to distinguish one normal curve from another. For shorthand we often use the notation N(μ, σ) to specify that a distribution is normal (N) with some mean (μ) and standard deviation (σ).

    [Figure 5.2, Comparing normal curves: panel (a) shows different means, N(0,1) and N(2,1); panel (b) shows different standard deviations, N(0,0.5), N(0,1), and N(0,2).]

    Figure 5.2 shows how the normal distribution changes as the mean μ is shifted to move the curve horizontally or the standard deviation σ is changed to stretch or shrink the curve. In a bootstrap distribution or randomization distribution, we are often interested in finding the proportion of statistics to the right or left of a certain value, with the total proportion of all the statistics being 1. When we use a normal distribution, this corresponds to finding an area to the right or left of a certain value, given that the total area under the curve is 1. Many technology options exist for finding the proportion of a normal distribution that falls beyond a specified endpoint.

    Example 5.1: Find the area to the right of 95 in a normal distribution with mean 80 and standard deviation 10. Solution: There are many different methods that can be used to find this area, and this is a good time to get comfortable with the method that you will use. Figure 5.3 shows how we find the area using StatKey. We see that the area to the right of 95 in the N(80, 10) distribution is 0.067.
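
    The excerpt finds this area with StatKey; an equivalent calculation in Python (an illustration, not the book's method) might look like this:

    ```python
    # Equivalent of Example 5.1 computed with scipy rather than StatKey
    # (the book uses StatKey; this Python version is just an illustration).
    from scipy.stats import norm

    area_right_of_95 = 1 - norm.cdf(95, loc=80, scale=10)
    print(round(area_right_of_95, 3))   # 0.067
    ```
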
  • Applied Statistics and Probability for Engineers
    • Douglas C. Montgomery, George C. Runger (Authors)
    • 2018 (Publication Date)
    • Wiley (Publisher)
    9.2 Tests on the Mean of a Normal Distribution, Variance Known. In this section, we consider hypothesis testing about the mean μ of a single normal population where the variance of the population σ² is known. We assume that a random sample X1, X2, …, Xn has been taken from the population. Based on our previous discussion, the sample mean X̄ is an unbiased point estimator of μ with variance σ²/n.

    9.2.1 Hypothesis Tests on the Mean. Suppose that we wish to test the hypotheses

    H0: μ = μ0  versus  H1: μ ≠ μ0   (9.7)

    where μ0 is a specified constant. We have a random sample X1, X2, …, Xn from a normal population. Because X̄ has a normal distribution (i.e., the sampling distribution of X̄ is normal) with mean μ0 and standard deviation σ/√n if the null hypothesis is true, we could calculate a P-value or construct a critical region based on the computed value of the sample mean X̄, as in Section 9.1.2. It is usually more convenient to standardize the sample mean and use a test statistic based on the standard normal distribution. That is, the test procedure for H0: μ = μ0 uses the test statistic

    Z0 = (X̄ − μ0) / (σ/√n)   (9.8)

    If the null hypothesis H0: μ = μ0 is true, E(X̄) = μ0, and it follows that the distribution of Z0 is the standard normal distribution [denoted N(0, 1)]. The hypothesis testing procedure is as follows. Take a random sample of size n and compute the value of the sample mean x̄. To test the null hypothesis using the P-value approach, we would find the probability of observing a value of the sample mean that is at least as extreme as x̄, given that the null hypothesis is true. The standard normal z-value that corresponds to x̄ is found from the test statistic in Equation 9.8:

    z0 = (x̄ − μ0) / (σ/√n)

    In terms of the standard normal cumulative distribution function (CDF), the probability we are seeking for this two-sided alternative is 2[1 − Φ(|z0|)].
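
    A small sketch of the procedure in Equations 9.7 and 9.8, with made-up data and a made-up hypothesized mean; only the test statistic and the two-sided P-value calculation follow the excerpt.

    ```python
    # Minimal sketch of the two-sided z-test in Equations 9.7-9.8.
    # The data values below are made up for illustration; only the procedure
    # follows the excerpt.
    import numpy as np
    from scipy.stats import norm

    mu_0 = 50.0                      # hypothesized mean (H0: mu = 50)
    sigma = 2.0                      # known population standard deviation
    sample = np.array([51.3, 49.8, 52.1, 50.7, 48.9, 51.6, 50.2, 49.5])

    n = sample.size
    x_bar = sample.mean()
    z0 = (x_bar - mu_0) / (sigma / np.sqrt(n))          # Equation 9.8
    p_value = 2 * (1 - norm.cdf(abs(z0)))               # two-sided P-value

    print(f"z0 = {z0:.3f}, P-value = {p_value:.4f}")
    ```
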
  • Understanding Business Statistics
    • Ned Freed, Stacey Jones, Timothy Bergquist (Authors)
    • 2013 (Publication Date)
    • Wiley (Publisher)
    In such cases the hypothesis test we set up is often referred to as a t test. In these t tests, we'll simply replace the normal z_c boundaries with t distribution t_c equivalents and show the test statistic as t_stat rather than z_stat. For large enough sample sizes (n ≥ 30), the normal approximation to the t distribution is, as we've seen before, perfectly acceptable. As in Chapter 7, in the small sample case (n < 30), an additional assumption is required in order to use our hypothesis testing procedure: the population values must be normally distributed.

    An Illustration. To illustrate how the t test works, suppose that in our Montclair Motors example the sample size is 15 rather than 50, and the value of the population standard deviation is unknown. Suppose further that the sample mean, x̄, is 4922 pounds and the sample standard deviation, s, is 220. If we assume that the population is approximately normal, we can compute the test statistic, t_stat, for 4922 using t_stat = (x̄ − μ)/(s/√n). As you would expect, we can easily adapt to two-tailed tests. We'll simply use α/2 rather than α as the first row entry point in the t table to produce the upper and lower critical values for t.

    p-value Approach. We can also use the p-value approach to conduct a t test. To do so, however, we'll need the help of a statistical calculator or a statistical software package to produce the required p-value, since our back-of-the-book t table isn't extensive enough to do the job. To illustrate, we've used Excel's T.DIST function to find the p-value for x̄ = 4922, the sample mean cited in our example above. The result is a p-value of .0961. (To produce this result, we entered an "x" of −1.37, set degrees of freedom at 14, and entered "1" in the "cumulative" box. T.DIST then returned the .0961 "less than or equal to" probability.) Visually, .0961 is the area below −1.37 in the lower tail of a t-distribution with 14 degrees of freedom.
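
    A hedged sketch of the same t calculation in Python. The hypothesized mean of 5000 pounds is an assumption (the excerpt does not restate it); the sample mean, standard deviation, and sample size are the ones quoted above.

    ```python
    # Sketch of the t test from the Montclair Motors illustration.
    # The null-hypothesis mean of 5000 pounds is an assumption (it is not stated
    # in the excerpt); the other numbers come from the example.
    from math import sqrt
    from scipy.stats import t

    mu_0 = 5000.0        # assumed hypothesized mean weight
    x_bar = 4922.0       # sample mean from the excerpt
    s = 220.0            # sample standard deviation
    n = 15               # sample size

    t_stat = (x_bar - mu_0) / (s / sqrt(n))      # about -1.37
    p_lower_tail = t.cdf(t_stat, df=n - 1)       # one-tailed p-value, about 0.096

    print(f"t = {t_stat:.2f}, one-tailed p = {p_lower_tail:.4f}")
    ```
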
  • Probability and Statistics

    A Didactic Introduction

    • José I. Barragués, Adolfo Morais, Jenaro Guisasola (Authors)
    • 2016 (Publication Date)
    • CRC Press (Publisher)
    We can clearly see that since the alternative hypothesis proposes a decrease, we are only interested in the lower end of this distribution, leading to a one-tail test. Figure 4 highlights both the rejection and the acceptance regions for this test; the critical value separating these regions will be calculated shortly. There are a number of misconceptions associated with the significance level of a test (Batanero 2004), and these will be discussed in due course.

    [Figure 3. The distribution of the sample mean X̄ under the null hypothesis, including a visual interpretation of a one-tailed test at the 5% significance level.]

    Putting it All Together. At this point we will slowly work through the above example, showing how each of the points discussed above leads to a calculation that allows us to make some inferences concerning our initial question. Some familiarity with respect to the handling of calculations involving the normal distribution is assumed here. Furthermore, we need to be aware of the fact that we are essentially calculating a conditional probability (this links in with concepts introduced in earlier chapters). There are essentially three standard approaches to carrying out the calculations in the hypothesis test, although they are all equivalent. Each of these will be considered in the current section. Possibly the most transparent and intuitive is the notion of a p-value.

    Using the p-value. In our case this would be the probability that the sample mean is 52 or below, given that H0 is true; recall that this is a one-tailed test. To recapitulate, the distribution of X̄ under H0 is normal, with mean μ0 and standard deviation σ/√n.

    [Figure 4. A depiction of the acceptance and rejection regions at the 5% significance level.]

    Therefore, P(X̄ ≤ 52 | H0) = 0.0026, which is termed the p-value of the test outcome.
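
    A hedged sketch of the one-tailed p-value calculation described here. The excerpt reports only the observed sample mean of 52, so the hypothesized mean, standard deviation, and sample size below are placeholders rather than the example's actual values.

    ```python
    # Hedged sketch of the one-tailed p-value calculation described above.
    # The null mean, sigma, and n below are placeholders; the excerpt does not
    # reproduce the example's actual values, only the observed mean of 52.
    from math import sqrt
    from scipy.stats import norm

    mu_0 = 55.0          # hypothesized mean (placeholder)
    sigma = 6.0          # known population standard deviation (placeholder)
    n = 30               # sample size (placeholder)
    x_bar = 52.0         # observed sample mean, from the excerpt

    z = (x_bar - mu_0) / (sigma / sqrt(n))
    p_value = norm.cdf(z)            # lower-tail p-value for H1: mu < mu_0

    print(f"z = {z:.2f}, one-tailed p-value = {p_value:.4f}")
    ```
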
  • Quantitative Techniques in Business, Management and Finance
    • Umeshkumar Dubey, D P Kothari, G K Awari (Authors)
    • 2016 (Publication Date)
    3. It indicates the number of values that are free to vary. 4. The procedure to calculate the degrees of freedom varies from test to test.

    9.2.14 Test of Significance. Parametric tests: t-test, z-test, analysis of variance (ANOVA). Non-parametric tests: chi-square, median test, McNemar, Mann–Whitney, Wilcoxon, Fisher's exact.

    9.2.15 Parametric Test. 1. It is known as a normal distribution statistical test. 2. Statistical methods of inference make certain assumptions about the populations from which the samples are drawn; for example, that populations are normally distributed and have the same variance.

    9.2.16 Non-Parametric Tests. 1. A normality assumption is not required. 2. Ordinal or interval scale data is used. 3. It can be applied for small samples. 4. Samples use the sign test. 5. Independent samples use Mann–Whitney U-statistics. 6. Randomness uses run tests. 7. Several independent samples use the Kruskal–Wallis test.

    9.3 Probability Distributions. 9.3.1 Binomial Distribution. It was discovered by Jacob Bernoulli, a Swiss mathematician of the seventeenth century. 9.3.1.1 Assumption: the probability of the outcome remains constant over time. 9.3.1.2 Bernoulli Variable: this is a random variable x which assumes the values of 1 and 0 with respective probabilities p and q = 1 − p. The Bernoulli distribution is

    x      1    0
    p(x)   p    q

    9.3.1.3 Random Variable: this is a quantity obtained from an experiment that may by chance result in different values. A random variable is a function which has its domain confined to a sample space and its range confined to a real space; such restricted functions are called random variables. It is a variable which assumes different numerical values as a result of a random experiment or random occurrences. Note: 1. Every function has a domain and range. 2. The domain must be a real number. 3. A random variable must assume numerical values.
  • Stochastic Modeling and Mathematical Statistics

    A Text for Statisticians and Quantitative Scientists

    Among the guidelines that the above critique of testing suggests are (i) specify your choice of null and alternative hypotheses before your data becomes available, and defend the choice as the natural and/or traditional specification when this is possible, (ii) don't test simple null hypotheses, (iii) avoid taking the acceptance of a null hypothesis as evidence that H0 is true, but follow up such a test outcome with further tests aimed at investigating whether there are alternative hypotheses that are also supported by the data. The consequences of accepting a particular null hypothesis as true should be carefully studied. In particular, when several models remain competitive after multiple goodness-of-fit tests, the consequences of each of the potential modeling assumptions on the estimation of parameters of interest (such as the population mean and variance, for example) should be investigated and compared.

    Criticisms like (3) and (5) above are not easily addressed. I will not say more about the file drawer problem, as there is no simple solution to this problem, and it is mentioned here as food for thought rather than as a call to action. Regarding (3), it should be noted that the development of a confidence interval for a parameter of interest can often shed light on the questions of practical and statistical significance. Before elaborating on this point, let me mention a simple and direct connection between hypothesis tests and confidence intervals. (This connection was first alluded to in Example 8.1.5.) The connection is easiest to understand in the context of pinpoint hypothesis testing, so I will discuss that particular case in detail. The pertinent ideas can be easily extended to more general tests. Let's suppose that we are interested in testing hypotheses of the form H0: θ = θ0 vs. H1: θ ≠ θ0 at α = 0.05 based on the test statistic T, a function of a random sample from a one-parameter family of distributions {Fθ}.
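
    The test/confidence-interval connection mentioned above can be made concrete with a small sketch. It uses the known-variance z case for simplicity, and all of the numbers are invented for illustration: rejecting H0: μ = μ0 at α = 0.05 corresponds exactly to μ0 falling outside the 95% confidence interval.

    ```python
    # Sketch of the test/confidence-interval connection the excerpt alludes to,
    # using the known-sigma z case for concreteness. All numbers are made up.
    from math import sqrt
    from scipy.stats import norm

    mu_0, sigma, n = 10.0, 3.0, 36     # hypothesized mean, known SD, sample size
    x_bar = 11.2                       # observed sample mean (made up)
    alpha = 0.05

    se = sigma / sqrt(n)
    z_crit = norm.ppf(1 - alpha / 2)               # 1.96 for alpha = 0.05

    # 95% confidence interval for mu
    ci = (x_bar - z_crit * se, x_bar + z_crit * se)

    # Two-sided z test of H0: mu = mu_0
    z0 = (x_bar - mu_0) / se
    reject = abs(z0) > z_crit

    # The test rejects H0 exactly when mu_0 falls outside the interval.
    print(f"CI = ({ci[0]:.2f}, {ci[1]:.2f}), reject H0: {reject}")
    print("mu_0 outside CI:", not (ci[0] <= mu_0 <= ci[1]))
    ```
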
  • Probability and Statistics with R
    • Maria Dolores Ugarte, Ana F. Militino, Alan T. Arnholt (Authors)
    • 2015 (Publication Date)
    The interval calculated agrees with our conclusion from step 5 because it contains values that are exclusively greater than zero.

    9.8 Hypothesis Tests for Population Variances. The hypothesis tests for a single variance and a ratio of variances are based on the same pivots as those used for the related confidence intervals. An underlying normal population is required for the hypothesis tests described in this section to give accurate results. If this assumption is violated, nonparametric methods should be used to come to reasonable conclusions based on the data.

    9.8.1 Test for the Population Variance When Sampling from a Normal Distribution. The tests for population means presented up to this point have assumed the sampling distributions for their corresponding statistics follow a normal distribution; however, the tests for means are fairly robust to violations in normality assumptions. In contrast, the normality assumption for testing a hypothesis about variance is not robust to departures from normality. Consequently, one should proceed with caution when testing a hypothesis about the variance, especially since non-normality is difficult to detect when working with small to moderate size samples. As a minimum, one should look at a normal quantile-quantile plot to make sure normality is plausible before testing a hypothesis concerning the population variance. Provided X1, X2, …, Xn is a random sample from a N(μ, σ) distribution, the random variable

    (n − 1)S²/σ² ~ χ²(n − 1)

    The null hypothesis for testing the population variance is H0: σ² = σ₀², and the value for the test statistic is

    χ²_obs = (n − 1)s²/σ₀²

    The three alternative hypotheses and the rejection regions for H0 are in Table 9.12.
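
    A brief sketch of this variance test in Python. The data and the hypothesized variance are invented; the statistic χ²_obs = (n − 1)s²/σ₀² and its chi-square reference distribution follow the excerpt.

    ```python
    # Sketch of the chi-square test for a population variance described above.
    # The sample values and the hypothesized variance are made up; the statistic
    # chi2_obs = (n - 1) * s^2 / sigma0^2 follows the excerpt.
    import numpy as np
    from scipy.stats import chi2

    sigma0_sq = 4.0                                   # hypothesized variance under H0
    sample = np.array([9.1, 10.4, 8.7, 11.2, 9.8, 10.9, 8.5, 10.1, 9.6, 11.0])

    n = sample.size
    s_sq = sample.var(ddof=1)                         # sample variance
    chi2_obs = (n - 1) * s_sq / sigma0_sq

    # Two-sided p-value: double the smaller tail area
    lower = chi2.cdf(chi2_obs, df=n - 1)
    p_value = 2 * min(lower, 1 - lower)

    print(f"chi2_obs = {chi2_obs:.3f}, two-sided p = {p_value:.4f}")
    ```
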
  • Fundamentals of Biostatistics
    Beginning at the "Start" box, we arrive at the one-sample t test box by answering yes to each of the following four questions: (1) one variable of interest? (2) one-sample problem? (3) underlying distribution normal or can central-limit theorem be assumed to hold? and (4) inference concerning μ? and no to question (5) σ known?

    One-Sample z Test. In Equations 7.10 and 7.11 (pp. 224–225), the critical values and p-values for the one-sample t test have been specified in terms of percentiles of the t distribution, assuming the underlying variance is unknown. In some applications, the variance may be assumed known from prior studies. In this case, the test statistic t can be replaced by the test statistic z = (x̄ − μ0)/(σ/√n). Also, the critical values based on the t distribution can be replaced by the corresponding critical values of a standard normal distribution. This leads to the following test procedure:

    Equation 7.13. One-Sample z Test for the Mean of a Normal Distribution with Known Variance (Two-Sided Alternative). To test the hypothesis H0: μ = μ0 versus H1: μ ≠ μ0 with a significance level of α, where the underlying standard deviation σ is known, the best test is based on z = (x̄ − μ0)/(σ/√n). If z < z(α/2) or z > z(1 − α/2), then H0 is rejected; if z(α/2) ≤ z ≤ z(1 − α/2), then H0 is accepted, where z(α/2) and z(1 − α/2) denote the α/2 and 1 − α/2 percentiles of the standard normal distribution. To compute a two-sided p-value, we have p = 2·Φ(z) if z ≤ 0, or p = 2·[1 − Φ(z)] if z > 0.

    Example 7.25, Cardiovascular Disease. Consider the cholesterol data in Example 7.21. Assume that the standard deviation is known to be 40 and the sample size is 200 instead of 100.
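
    A short sketch of Equation 7.13 in Python. The excerpt supplies σ = 40 and n = 200, but the sample mean and hypothesized mean belong to Example 7.21, which is not reproduced here, so those two values are placeholders.

    ```python
    # Sketch of the one-sample z test in Equation 7.13. The excerpt gives
    # sigma = 40 and n = 200, but the sample mean and the null value come from
    # Example 7.21, which is not reproduced here, so those two numbers are
    # placeholders.
    from math import sqrt
    from scipy.stats import norm

    mu_0 = 190.0       # hypothesized mean cholesterol level (placeholder)
    x_bar = 185.0      # observed sample mean (placeholder)
    sigma = 40.0       # known standard deviation, from the excerpt
    n = 200            # sample size, from the excerpt

    z = (x_bar - mu_0) / (sigma / sqrt(n))
    # Two-sided p-value, following Equation 7.13
    p = 2 * norm.cdf(z) if z <= 0 else 2 * (1 - norm.cdf(z))

    print(f"z = {z:.2f}, two-sided p = {p:.4f}")
    ```
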
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.