Mathematics

Quartiles and Interquartile Range

Quartiles are values that divide an ordered data set into four equal parts. The interquartile range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1); it measures the spread of the middle 50% of the data. These measures are used to describe the spread and distribution of a data set, particularly in statistics and data analysis.

Written by Perlego with AI-assistance

10 Key excerpts on "Quartiles and Interquartile Range"

  • Handbook of Statistics for Teaching and Research in Plant and Crop Science
    • Usha Palaniswamy(Author)
    • 2005(Publication Date)
    • CRC Press
      (Publisher)
    Inter Quartile Range. The inter quartile range is the difference between the third quartile (Q3) and first quartile (Q1) values, i.e., Q3 − Q1 = 85.5 − 71.0 = 14.5 (4.15). It contains 50 percent of the set, with 25 percent falling above and 25 percent below the range. We have the central 50 percent of the set of observations, which are free from extreme values.

    Quartile Deviation. Quartile deviation is half the inter quartile range (IQR), i.e., (Q3 − Q1) / 2 = (85.5 − 71.0) / 2 = 14.5 / 2 = 7.25. The IQR is the difference between the 75th and 25th percentiles. A large IQR value indicates highly variable data. It is explained diagrammatically in the following representation and in Figure 4.7.

    Representation of partition values: lowest value, 1st quartile (Q1), 2nd quartile (Q2, the median), 3rd quartile (Q3), highest value. The inter quartile range spans Q1 to Q3.

    [FIGURE 4.7. Graphical representation of the partition values. Horizontal axis: values of the variate, with Q1, M, and Q3 marked.]

    Grouped Data. Usually, we collect many observations and it is advisable to calculate the quartiles from the grouped data instead of raw data. Formulas for computing the three quartile measures are:

    $Q_i = l + \frac{iN/4 - C}{f} \times h, \quad i = 1, 2, 3$

    $P_i = l + \frac{iN/100 - C}{f} \times h, \quad i = 1, 2, 3, \ldots, 99$

    where l = lower limit of the class in which the particular quartile lies, f = frequency of this class, h = width of the class interval, C = cumulative frequency preceding the class, and N = total frequency.

    Example. The number of hours spent in a hospital for treatment by 70 patients is listed below in a frequency table.

    Hours    No. of patients
    1-3      6
    3-5      53
    5-7      85
    7-9      56
    9-11     21
    11-13    16
    13-15    4
    15-17    4

    A cumulative frequency table is formed as follows: Class, f, c.
  • Essentials of Business Statistics
    • Ken Black, Ignacio Castillo, Amy Goldlist, Timothy Edmunds(Authors)
    • 2018(Publication Date)
    • Wiley
      (Publisher)
    Unless the data set was deliberately constructed by a trickster, it’s unlikely that the middle 50% of the data points are actually much more tightly clustered than the interquartile range suggests. The interquartile range is especially useful in situations where data users are more interested in values toward the middle and less interested in extremes. In describing a real estate housing market, real estate agents might use the interquartile range as a measure of housing prices when describing the market for buyers who are interested in houses in the mid-range. In addition, the interquartile range is used in the construction of box and whisker plots.

    Interquartile Range: IQR = Q3 − Q1 (3.4)

    The data in Table 3.4 indicate Canada’s top 10 trading partners by Canadian exports to the country in a recent year according to Innovation, Science and Economic Development Canada.

    TABLE 3.4 Canada’s Top-10 Trading Partners
    Country                                    Exports ($ billions)
    United States                              338.7
    China                                      19.3
    United Kingdom                             18.8
    Japan                                      10.4
    Mexico                                     5.4
    Netherlands                                4.5
    South Korea                                3.7
    Germany                                    3.6
    France (incl. Monaco, French Antilles)     3.2
    Brazil                                     2.6

    What is the interquartile range for these data? The process begins by computing the first and third quartiles as follows. Solving for Q1 = P25 when n = 10, k = (25/100)(10) = 2.5. Because k is not a whole number, it is rounded up, so P25 is found as the third value from the bottom: Q1 = P25 = 3.6 ($ billions). Solving for Q3 = P75, k = (75/100)(10) = 7.5. P75 is found as the eighth value from the bottom: Q3 = P75 = 18.8 ($ billions). The interquartile range is Q3 − Q1 = 18.8 − 3.6 = 15.2. The middle 50% of the exports for the top 10 Canadian trading partners spans a range of 15.2 ($ billions).
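The location method used in this excerpt can be sketched in Python. This is a minimal sketch, not the book's code: the function name `percentile_by_location` is hypothetical, and the branch for a whole-number k (averaging the k-th and (k+1)-th values) is an assumption, since the excerpt only shows the case where k is fractional.

```python
import math

def percentile_by_location(values, p):
    """Return the p-th percentile (0 < p < 100) using the location k = (p/100) * n."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    k = p / 100 * n
    if k != int(k):
        # Fractional location: round up to the next whole position (1-based).
        return ordered[math.ceil(k) - 1]
    # Whole-number location: average positions k and k + 1 (assumed convention).
    k = int(k)
    return (ordered[k - 1] + ordered[k]) / 2

# Exports from Table 3.4, in $ billions.
exports = [338.7, 19.3, 18.8, 10.4, 5.4, 4.5, 3.7, 3.6, 3.2, 2.6]
q1 = percentile_by_location(exports, 25)  # k = 2.5, third value from the bottom
q3 = percentile_by_location(exports, 75)  # k = 7.5, eighth value from the bottom
iqr = round(q3 - q1, 1)
print(q1, q3, iqr)  # 3.6 18.8 15.2
```

Other textbooks and software packages use different quartile conventions, so results on small data sets may not match this method exactly.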
  • Precalculus with Limits
    Box-and-Whisker Plots. Standard deviation is the measure of dispersion that is associated with the mean. Quartiles measure dispersion associated with the median.

    EXAMPLE 7 Finding Quartiles of a Data Set. Find the lower and upper quartiles for the data set below.

    42, 14, 24, 16, 12, 18, 20, 24, 16, 26, 13, 27

    Solution. Begin by ordering the data.

    12, 13, 14, 16, 16, 18, 20, 24, 24, 26, 27, 42 (each successive group of three values is 25% of the data)

    The median of the entire data set is (18 + 20)/2 = 19. The median of the six numbers less than 19 is (14 + 16)/2 = 15. So, the lower quartile is 15. The median of the six numbers greater than 19 is (24 + 26)/2 = 25. So, the upper quartile is 25.

    Checkpoint. Find the lower and upper quartiles for the data set below.

    39, 47, 81, 43, 23, 23, 27, 86, 15, 3, 74, 55

    The interquartile range of a data set is the difference of the upper quartile and the lower quartile. The interquartile range of the data set in Example 7 is 25 − 15 = 10. A value that is widely separated from the rest of the data in a data set is called an outlier. Typically, a data value is considered to be an outlier when it is greater than the upper quartile by more than 1.5 times the interquartile range or when it is less than the lower quartile by more than 1.5 times the interquartile range. Verify that, in Example 7, the value 42 is an outlier.

    Quartiles are represented graphically by a box-and-whisker plot, as shown in the figure below. In the plot, notice that five numbers are listed: the least number, the lower quartile, the median, the upper quartile, and the greatest number. These numbers are the five-number summary of the data set. Also notice that the numbers are spaced proportionally, as though they were on a real number line.

    [Box-and-whisker plot for Example 7, marking 12, 15, 19, 25, and 42.]

    The next example shows how to find quartiles when the number of elements in a data set is not divisible by 4.
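The median-of-halves procedure and the 1.5 × IQR outlier rule from Example 7 can be sketched as follows. The function names are hypothetical, and leaving the middle value out of both halves when the number of values is odd is an assumed convention that the excerpt does not show.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, median, Q3), taking Q1 and Q3 as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, the middle value is excluded from both halves (assumed convention).
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def outliers(values):
    """Return values more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = quartiles_by_halves(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

data = [42, 14, 24, 16, 12, 18, 20, 24, 16, 26, 13, 27]  # Example 7
print(quartiles_by_halves(data))  # (15.0, 19.0, 25.0)
print(outliers(data))             # [42]
```

With an interquartile range of 10, the upper fence is 25 + 15 = 40, which is why 42 is flagged.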
  • Understanding Business Statistics
    • Ned Freed, Stacey Jones, Timothy Bergquist(Authors)
    • 2013(Publication Date)
    • Wiley
      (Publisher)
    On the negative side, the interquartile range shares the same limitation as the range in measuring dispersion: many (most) of the values in the data set play no direct role in its calculation.

    16. The number of fabricated parts that didn’t pass inspection was tracked for 25 recent shifts at Cornwall Manufacturing. The data are shown in the table below. Determine Q1, Q2 and Q3 for the data.

    Rejected Parts
    122  123   21  116  113
     85   96   83  109   90
    148  113  102  114   77
    115   67  111   86   88
     78   94   83   79   91

    Interquartile Range (3.1): IQR = Q3 − Q1

    DEMONSTRATION EXERCISE 3.3 Interquartile Range

    Below is the list of the ages (in years) of the 22 trucks currently operated by Amalgamated Van Lines. (The same data set was used in Demonstration Exercise 3.2.) The data are arranged in ascending order and the 1st and 3rd quartiles are marked. Determine and interpret the interquartile range.

    2.5, 2.7, 3.5, 3.6, 4.5, 4.8 (Q1), 5.2, 5.5, 5.5, 6.0, 6.2, 6.8, 7.8, 8.9, 10.5, 12.1, 13.0 (Q3), 14.2, 15.6, 15.7, 18.2, 19.7

    IQR = (13.0 − 4.8) = 8.2

    Solution: The interquartile range, Q3 − Q1, is 13.0 − 4.8 = 8.2 years. This indicates that the middle 50% of the truck ages spans 8.2 years.

    EXERCISES

    17. Determine and interpret the interquartile range for the data in Exercise 8 showing months of useful cell phone life. The unordered list is reproduced below:

    12.7, 23.5, 26.8, 15.5, 13.6, 28.9, 14.5, 36.0, 27.8, 25.2, 35.7, 62.5, 16.2, 35.5, 28.2, 15.6, 32.7, 22.1

    18. The following table gives the rate of unemployment for the US civilian population for the years 1990 to 2010 (source: Bureau of Labor Statistics, October 2011). Determine and interpret the interquartile range for the data.

    Year  1990  1991  1992  1993  1994  1995
    Rate   5.6   6.8   7.5   6.9   6.1   5.6
    Year  1996  1997  1998  1999  2000  2001
    Rate   5.4   4.9   4.5   4.2   4.0   4.7
    Year  2002  2003  2004  2005  2006  2007
    Rate   5.8   6.0   5.5   5.1   4.6   4.6
    Year  2008  2009  2010
    Rate   5.8   9.3   9.6

    19. Physical fitness test scores for 26 college sophomores are shown below.
  • Mathematics NQF3 SB
    eBook - PDF
    • M Van Rensburg, I Mapaling, M Trollope A Thorne(Authors)
    • 2017(Publication Date)
    • Macmillan
      (Publisher)
    The value of Q2 will be greater than half of the values in the set (50%).

    Measure: Explanation and formula

    Upper quartile (Q3): This is the median of the upper half of the data. Its position is indicated by the formula: Q3’s position = P_Q3 = (3/4)(n + 1). The value of Q3 will be greater than 3/4 of all of the values in the set (75%).

    Inter-quartile range (IQR): The inter-quartile range tells you how spread out the middle values of the data set are. You can calculate the inter-quartile range by finding the difference between Q1 and Q3: IQR = Q3 − Q1

    Five-number summary: The five-number summary uses five essential values to give a complete description of the data under consideration. The five values are:
    1. the minimum value: the smallest value in the data set
    2. the lower quartile (Q1): the median of the lower half of the values
    3. the median (Q2): the value that divides the data into half
    4. the upper quartile (Q3): the median of the upper half of the values
    5. the maximum value: the largest value in the data set.

    Example 11.3
    For the following data set: 8; 13; 6; 10; 11; 16; 14 find the:
    1. median (Q2)
    2. lower quartile (Q1)
    3. upper quartile (Q3).

    Solutions: First arrange the data in ascending order: 6; 8; 10; 11; 13; 14; 16.
    1. Q2’s position = (1/2)(n + 1) = (1/2)(7 + 1) = 4 ∴ Q2 = 11 (the 4th position)
    2. Q1’s position = (1/4)(n + 1) = (1/4)(7 + 1) = 2 ∴ Q1 = 8 (the 2nd position)
    3. Q3’s position = (3/4)(n + 1) = (3/4)(7 + 1) = 6 ∴ Q3 = 14 (the 6th position)

    Example 11.4
    A class of 15 learners obtained the following marks out of 50 for a Mathematics test: 12; 31; 42; 8; 25; 30; 32; 25; 16; 36; 35; 22; 18; 24; 22
    Calculate the:
    1. range
    2. lower quartile
    3. upper quartile.

    Solutions: First arrange the marks in ascending order: 8; 12; 16; 18; 22; 22; 24; 25; 25; 30; 31; 32; 35; 36; 42
    1.
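The position formulas in this excerpt, k(n + 1)/4 for k = 1, 2, 3, can be sketched in Python. The function names are hypothetical. The excerpt only shows whole-number positions, so the linear interpolation used for fractional positions is an assumption rather than the textbook's stated rule.

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based position in already-ordered data."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    # Fractional position: interpolate between the two neighbours (assumed convention).
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])

def quartiles_by_position(values):
    """Return (Q1, Q2, Q3) using positions k * (n + 1) / 4 for k = 1, 2, 3."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three values")
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))

# Example 11.3: positions 2, 4 and 6 in the ordered data.
print(quartiles_by_position([8, 13, 6, 10, 11, 16, 14]))  # (8, 11, 14)
```

Whenever n + 1 is divisible by 4, as with n = 7 or n = 15, every position is a whole number and no interpolation is needed.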
  • Statistical Methods for Communication Science
    • Andrew F. Hayes(Author)
    • 2020(Publication Date)
    • Routledge
      (Publisher)
    A box plot contains information about the median of a distribution, the interquartile range (IQR), the measurement interval that contains the inner 50% of measurements, and the minimum and maximum measurements in a distribution, while at the same time highlighting measurements that are unusual using certain criteria. A box plot of the TV viewing data is displayed in Figure 4.4.

    [Figure 4.4 A box plot of the TV viewing data. Vertical axis: TV Viewing Hours. Labeled features: median; inner 50%; median + 1.5 IQR (or maximum, whichever is smallest); median − 1.5 IQR (or minimum, whichever is largest); outliers; extremes.]

    The figure itself is fairly self-explanatory. The dark line dividing the gray box is the median, while the upper and lower edges of the box define the end points of the ordinal middle 50% of the measurements. From the box plot, you can see that the median measurement is 2, whereas 50% of the measurements reside between 1.5 and 3. By definition, then, the interquartile range is 3 − 1.5 = 1.5. The long horizontal lines above and below the box are set at the median plus and minus 1.5 interquartile ranges. However, if the median plus 1.5 IQRs exceeds the maximum measurement, then the upper line is placed at the maximum. If the median minus 1.5 IQRs is smaller than the minimum measurement, then the lower line is set at the minimum. The box plot also depicts the “unusual” measurements, defined as those with measurements that are more than 1.5 IQRs from the median (in either direction). Different statistical programs will depict unusual cases differently. In SPSS (which generated this figure), “outliers” in a box plot are defined as cases with measurements between 1.5 and 3 IQRs from the median. “Extreme values” are defined by SPSS as measurements more than three IQRs from the median.
  • Fundamental Statistics for the Behavioral Sciences
    5.13 Summary

    In the last three chapters we have learned how to plot data, how to calculate sensible measures of the center of the distribution, and how to calculate measures of dispersion. In the rest of the book you will learn more about plotting data, but you basically have the information you will need about descriptive statistics, such as the mean and standard deviation. We began with the simplest measure of dispersion, the range, which is just the difference between the lowest and highest scores. We built on the range to cover the interquartile range, which is the range after you remove the lowest and highest 25% of the scores. The interquartile range can also be thought of as the range of a 25% trimmed sample, because a 25% trimmed sample would discard the lowest and highest 25%. In fact, there is nothing uniquely important about the interquartile range, and we can take the range of a sample that has been trimmed of any specific percentage of cases. The most commonly used measures of dispersion are the sample variance (s²) and the sample standard deviation (s). The variance is essentially the average of the squared distances between each observation and the sample mean. To compute the variance, we “average” by dividing by one less than the sample size (i.e., N − 1), rather than by the full sample size. This gives us an unbiased estimate of the population variance (σ²). The sample standard deviation (s) is simply the square root of the variance. An unbiased estimator is one whose long-term average is equal to the population parameter that it is intended to estimate.
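The N − 1 divisor described in this summary can be illustrated with a short sketch. The scores are hypothetical values chosen for easy arithmetic, and `sample_variance` is an illustrative name. The result is compared against `statistics.variance` from the Python standard library, which also divides by N − 1.

```python
import math
import statistics

def sample_variance(scores):
    """Sum of squared distances from the sample mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean 5
s2 = sample_variance(scores)        # 32 / 7
s = math.sqrt(s2)                   # sample standard deviation
print(round(s2, 4), round(s, 4))    # 4.5714 2.1381

# The standard library uses the same N - 1 divisor.
assert math.isclose(s2, statistics.variance(scores))
```

Dividing by N instead would give 32 / 8 = 4, which is the quantity `statistics.pvariance` computes.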
  • Statistics and Probability for Engineering Applications
    • William DeCoursey(Author)
    • 2003(Publication Date)
    • Newnes
      (Publisher)
    However, it is not a good indicator if the sample size varies, because the sample range tends to increase with increasing sample size. Its other major drawback is that it depends on only two items in each sample, the smallest and the largest, so it does not make use of all the data. This disadvantage becomes more serious as the sample size increases. Because of its simplicity, the sample range is used frequently in quality control when the sample size is constant; simplicity is particularly desirable in this case so that people do not need much education to apply the test.

    (b) Interquartile Range. The interquartile range is the difference between the upper quartile and the lower quartile, which will be described in section 3.3. It is used fairly frequently as a measure of variability, particularly in the Box Plot, which will be described in the next chapter. It is used less than some alternatives because it is not related to any of the important theoretical distributions.

    (c) Mean Deviation from the Mean. The mean deviation from the mean, defined as $\sum_{i=1}^{N} (x_i - \bar{x}) / N$, where $\bar{x} = \sum_{i=1}^{N} x_i / N$, is useless because it is always zero. This follows from the discussion of the sum of deviations from the mean in section 3.1 (a).

    (d) Mean Absolute Deviation from the Mean. However, the mean absolute deviation from the mean, defined as $\sum_{i=1}^{N} |x_i - \bar{x}| / N$, is used frequently by engineers to show the variability of their data, although it is usually not the best choice. Its advantage is that it is simpler to calculate than the main alternative, the standard deviation, which will be discussed below. For Groups A, B, and C the mean absolute deviation is as follows:
    Group A: (2.25 + 1.25 + 0.25 + 3.75)/4 = 7.5/4 = 1.875.
    Group B: (3.25 + 2.25 + 0.25 + 5.75)/4 = 11.5/4 = 2.875.
    Group C: (4.25 + 3.25 + 0.75 + 6.75)/4 = 15/4 = 3.75.
    Its disadvantage is that it is not simply related to the parameters of theoretical distributions.
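A short sketch makes the contrast between the two definitions concrete. The excerpt does not list the raw values for Groups A, B, and C, so the four values below are hypothetical, chosen only because their absolute deviations (2.25, 1.25, 0.25, 3.75) match those shown for Group A. The function names are illustrative.

```python
def mean_deviation(xs):
    """Average signed deviation from the mean; zero for any data set."""
    mean = sum(xs) / len(xs)
    return sum(x - mean for x in xs) / len(xs)

def mean_absolute_deviation(xs):
    """Average absolute deviation from the mean."""
    mean = sum(xs) / len(xs)
    return sum(abs(x - mean) for x in xs) / len(xs)

group = [1, 2, 3, 7]  # hypothetical values with mean 3.25
print(mean_deviation(group))           # 0.0
print(mean_absolute_deviation(group))  # 1.875
```

The signed deviations cancel because the mean is the balance point of the data, which is why only the absolute version carries information about spread.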
  • Workshop Statistics: Discovery with Data
    eBook - PDF
    • Allan J. Rossman, Beth L. Chance(Authors)
    • 2012(Publication Date)
    • Wiley
      (Publisher)
    You observed that range is not a very useful measure, because it considers only the extreme values in a dataset, and you learned that the standard deviation is not resistant to outliers. You also explored the empirical rule and z-scores as applications of the standard deviation. These tools are particularly useful for enabling you to compare the proverbial apples and oranges, such as scores on SAT and ACT exams or heights of men and women, by converting measurements onto a common scale calculated in number of standard deviations away from the mean. The z-score serves as a “ruler” for measuring distances.

    In Brief

    Some useful definitions to remember and habits to develop from this topic are:

    • Variability is a property of a distribution; standard deviation and interquartile range are two ways to measure variability for a particular dataset. (Similarly, as you saw in the previous topic, center, another property of distributions, can be measured by the mean and median.)
    • In describing the variability in a dataset, focus on the bulk of the data and not on a few extreme values. Also remember that bumpiness and variety are not the same as variability.
    • Standard deviation can be loosely interpreted as the typical deviation of an observation from the mean. Standard deviation is more cumbersome to calculate by hand but is more widely used than the mean absolute deviation.
    • The interquartile range is the difference between the upper quartile and lower quartile, so the interquartile range is the width of the middle 50% of the ordered values in a dataset and is reported as a single number.
    • The mean and standard deviation provide a useful summary for symmetric distributions. The median and IQR are always appropriate measures, but especially with skewed distributions.
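The z-score comparison mentioned in this excerpt amounts to one line of arithmetic. The sketch below uses hypothetical means and standard deviations for the two exams; they are placeholders for illustration, not published SAT or ACT figures.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical exam summaries, for illustration only.
sat_z = z_score(1300, mean=1050, std_dev=200)
act_z = z_score(28, mean=21, std_dev=5)
print(sat_z, act_z)  # 1.25 1.4
```

Under these assumed figures the ACT score sits further above its mean, even though the raw numbers are on very different scales.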
  • Essential Statistics for Economics, Business and Management
    • Teresa Bradley(Author)
    • 2014(Publication Date)
    • Wiley
      (Publisher)
    (b) Estimate each quartile approximately

    Solution

    (a) The calculations based on the data in Table SK1 5.2 are shown in Table 2.5. The lower quartile is the value of the item that is positioned 25% of the way through the ordered data, i.e. at or nearest to the 0.25(49 + 1)th = 12.5th item in the ordered data. From the cumulative frequency column in Table 2.5 the 12.5th item is in the interval (20 < 25). Since there are seven items in the first interval, a further 5.5 items are required to reach the lower quartile value: this would be just less than half of the 12 items in this group, hence the value of Q1 is 22.5 approximately (see Table 2.5).

    The middle quartile or median is the value positioned half way through the ordered data, i.e. at or nearest the item in the 0.5(49 + 1)th = 25th position. The 25th item is in interval (30 < 35). Since there is only one item in this group, use the mid-interval value as an estimate of the median, so 32.5, as indicated in Table 2.5.

    The upper quartile is the value positioned 75% through the ordered data, that is at or nearest to the 0.75(49 + 1)th = 37.5th item. This item is in the interval (40 < 45). To reach the 37.5th item requires the first 3.5 out of the 10 items in this group, approximately 1/3 of the way into the interval (40 < 45). Hence the value of the upper quartile (Q3) is 42 approximately (see Table 2.5).

    Table 2.5 Identifying the intervals containing the median and quartiles
    Intervals (hours)   Frequency: f_i   Less than   Cumulative frequency
    15 < 20             7                < 20        7
    20 < 25             12               < 25        19
    25 < 30             5                < 30        24
    30 < 35             1                < 35        25
    35 < 40             9                < 40        34
    40 < 45             10               < 45        44
    45 < 50             1                < 50        45
    50 < 55             1                < 55        46
    55 < 60             3                < 60        49

    Q1: the 12.5th item is in the interval (20 < 25); the value of Q1 is 22.5 approximately.
    Q2 (median): the 25th item is in the interval (30 < 35); the value of Q2 is 32.5 approximately.
    Q3: the 37.5th item is in the interval (40 < 45); the value of Q3 is 42 approximately.

    Compare the results for the raw data calculated in Worked Example 2.1.
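The reasoning in this worked example can be sketched in Python under stated assumptions. The function name `grouped_quantile` is hypothetical. The position rule fraction × (n + 1) and the mid-interval estimate for a class holding a single item come from the excerpt. Exact linear interpolation within the class is an assumption, since the excerpt only estimates the within-class position roughly.

```python
def grouped_quantile(classes, fraction):
    """Estimate a quantile from (lower, upper, frequency) classes.

    The target item sits at position fraction * (n + 1) in the ordered data.
    """
    n = sum(freq for _, _, freq in classes)
    position = fraction * (n + 1)
    cumulative = 0
    for lower, upper, freq in classes:
        if position <= cumulative + freq:
            if freq == 1:
                # Single item in the class: use the mid-interval value.
                return (lower + upper) / 2
            # Assumed: items are spread evenly across the class interval.
            return lower + (position - cumulative) * (upper - lower) / freq
        cumulative += freq
    raise ValueError("fraction must be strictly between 0 and 1")

# Intervals and frequencies from Table 2.5 (hours).
table_2_5 = [(15, 20, 7), (20, 25, 12), (25, 30, 5), (30, 35, 1), (35, 40, 9),
             (40, 45, 10), (45, 50, 1), (50, 55, 1), (55, 60, 3)]
q1 = grouped_quantile(table_2_5, 0.25)
q2 = grouped_quantile(table_2_5, 0.50)
q3 = grouped_quantile(table_2_5, 0.75)
print(round(q1, 2), q2, round(q3, 2))  # 22.29 32.5 41.75
```

These values are close to the approximate estimates of 22.5, 32.5, and 42 quoted in the excerpt; the small differences come from interpolating exactly instead of rounding to a convenient fraction of the class.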