Stem and Leaf Graph

A stem and leaf plot is a way of organizing and displaying numerical data. The stem holds the leading digit or digits of each value, while the leaf holds the last digit. This type of graph gives a quick view of the distribution of the data and is particularly useful for small to moderately sized data sets.

Written by Perlego with AI-assistance

11 Key excerpts on "Stem and Leaf Graph"

  • Quantitative Geography: Perspectives on Spatial Data Analysis
    • A Stewart Fotheringham, Chris Brunsdon, Martin Charlton(Authors)
    • 2000(Publication Date)
    Table 4.2 are shown in stem and leaf format.
    Table 4.2   Percentage of professional or managerial heads of households by county
    Source: 1981 UK Census of Population
     
    Note that the leaves are usually shown without spaces between them, so that the row labelled 21 represents the values 21.3 and 21.5. Stem and leaf plots are best shown in a fixed width font, such as Courier, as the length of the leaf display will then be proportional to the number of data items in that row. In this way, a stem and leaf plot gives a near-graphical representation of data distribution while still containing information about each data value within the accuracy of the two digits spanned rule.
    A stem and leaf display can be made more compact by combining rows. In the right hand side of Table 4.3 the same information is given, but with only the even numbered stems. Data relating to odd-numbered stems are shown in italics. The result has removed some of the rough character of the left hand stem and leaf plot. The shape of the distribution, with a large mass of observations between 12% and 20% and a positive tail, becomes apparent.
    Table 4.3   Stem and leaf plot of PROFMAN: left hand is standard form, right hand is compact form
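The row-combining idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stem_leaf_rows`, the parameter `stems_per_row`, and the sample values are hypothetical (only 21.3 and 21.5 come from the text), and plain text cannot show the italics the book uses for odd-numbered stems.

```python
from collections import defaultdict

def stem_leaf_rows(values, stems_per_row=1):
    """Return (row label, leaf string) pairs for values with one decimal place.

    The stem is the integer part and the leaf is the tenths digit.
    With stems_per_row=2, each odd stem is folded into the even stem below it.
    Assumes non-negative values.
    """
    rows = defaultdict(str)
    for v in sorted(values):
        stem, leaf = divmod(round(v * 10), 10)
        rows[stem - stem % stems_per_row] += str(leaf)  # leaves written without spaces
    first, last = min(rows), max(rows)
    return [(s, rows.get(s, "")) for s in range(first, last + 1, stems_per_row)]

# Hypothetical percentages, not the PROFMAN data.
data = [21.3, 21.5, 20.1, 18.4, 19.9]
for label, leaves in stem_leaf_rows(data, stems_per_row=2):
    print(f"{label} | {leaves}")
```

Because the values are sorted before they are split, the leaves in a combined row follow the order of the underlying values rather than the order of the leaf digits.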

    4.3 Boxplots

    A useful set of descriptive statistics is the five-number summary. This uses order-based statistics, the median, quartiles and extreme values. The median of a variable is simply the middle value if the variable is tabulated in ascending order and n is odd. If n is even it is the midpoint of the two middle values. Thus, it is effectively the ‘half-way point’ of the distribution. Again, this represents a ‘typical value’ of the variable. The quartiles are defined similarly, being the points one-quarter and three-quarters along the sorted list of a variable. A useful measure of spread is the interquartile range, the difference between the first and third quartiles. The extreme values are just the largest and smallest values of the variable in the data set. The five-number summary is the list (minimum value, first quartile, median, third quartile, maximum). Listing these numbers gives a good impression of the location, spread and extreme values of a data set. The values for PROFMAN are listed in Table 4.4.
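A minimal sketch of the five-number summary follows. The excerpt does not say exactly how the quartiles are located, so this sketch assumes one common convention: each quartile is the median of the lower or upper half of the sorted data, with the middle value left out when n is odd. Other conventions give slightly different quartiles. The function name and sample values are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum).

    Assumption: quartiles are the medians of the lower and upper halves,
    excluding the middle value when the number of observations is odd.
    Needs at least two observations.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + n % 2:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

low, q1, med, q3, high = five_number_summary([1, 2, 3, 4, 5, 6, 7])
print(low, q1, med, q3, high)        # five-number summary
print("interquartile range:", q3 - q1)
```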
  • Introductory Statistics
    • Barbara Illowsky, Susan Dean(Authors)
    • 2016(Publication Date)
    • Openstax
      (Publisher)
    CHAPTER REVIEW
    2.1 Stem-and-Leaf Graphs (Stemplots), Line Graphs, and Bar Graphs
    A stem-and-leaf plot is a way to plot data and look at the distribution. In a stem-and-leaf plot, all data values within a class are visible. The advantage of a stem-and-leaf plot is that all values are listed, unlike a histogram, which gives classes of data values. A line graph is often used to represent a set of data values in which a quantity varies with time. These graphs are useful for finding trends, that is, general patterns in data sets such as temperature, sales, employment, company profit or cost over a period of time. A bar graph is a chart that uses either horizontal or vertical bars to show comparisons among categories. One axis of the chart shows the specific categories being compared, and the other axis represents a discrete value. Some bar graphs present bars clustered in groups of more than one (grouped bar graphs), and others show the bars divided into subparts to show cumulative effect (stacked bar graphs). Bar graphs are especially useful when categorical data is being used.
    2.2 Histograms, Frequency Polygons, and Time Series Graphs
    A histogram is a graphic version of a frequency distribution. The graph consists of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. The heights of the bars correspond to frequency values. Histograms are typically used for large, continuous, quantitative data sets. A frequency polygon can also be used when graphing large data sets with data points that repeat. The data values usually go on the x-axis, with the frequency graphed on the y-axis. Time series graphs can be helpful when looking at large amounts of data for one variable over a period of time.
    2.3 Measures of the Location of the Data
    The values that divide a rank-ordered set of data into 100 equal parts are called percentiles.
  • Introduction to Probability and Statistics Metric Edition
    • William Mendenhall, Robert Beaver, Barbara Beaver(Authors)
    • 2019(Publication Date)
    If you indicate that the leaf unit is 1, the reader will realize that the stem 6 and the leaf 8, for example, represent the number 68, recorded to the nearest dollar.
    [Figure 1.9: Dotplots for small and large data sets, (a) Small Set and (b) Large Set]
    Sometimes the available stem choices result in a plot that contains too few stems and a large number of leaves within each stem. In this situation, you can stretch the stems by dividing each one into several lines, depending on the leaf values assigned to them. Stems are usually divided in one of two ways:
    • Into two lines, with leaves 0–4 in the first line and leaves 5–9 in the second line
    • Into five lines, with leaves 0–1, 2–3, 4–5, 6–7, and 8–9 in the five lines, respectively
    EXAMPLE 1.8
    The data in Table 1.8 are the weights at birth of 30 full-term babies, born at a metropolitan hospital and recorded to the nearest tenth of a pound. Construct a stem and leaf plot to display the distribution of the data.
    Solution The data, though recorded to an accuracy of only one decimal place, are measurements of the continuous variable x = weight, which can take on any positive value. If you scan the data in Table 1.8, you will find that the highest and lowest weights are 9.4 and 5.6, respectively.
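The two ways of stretching stems can be sketched as follows. This is an illustration only: Table 1.8 is not reproduced in the excerpt, so the weights below are hypothetical, and the function name `stretched_stem_plot` and its parameter are assumptions rather than anything from the book.

```python
def stretched_stem_plot(values, lines_per_stem=2):
    """Stem-and-leaf lines for values with one decimal place.

    The stem is the integer part and the leaf is the tenths digit.
    lines_per_stem=2 puts leaves 0-4 and 5-9 on separate lines;
    lines_per_stem=5 uses 0-1, 2-3, 4-5, 6-7 and 8-9.
    """
    if lines_per_stem not in (1, 2, 5):
        raise ValueError("lines_per_stem must be 1, 2 or 5")
    width = 10 // lines_per_stem              # leaf digits covered by each line
    rows = {}
    for v in sorted(values):
        tenths = round(v * 10)
        rows.setdefault(tenths // width, []).append(tenths % 10)
    lines = []
    for idx in range(min(rows), max(rows) + 1):   # keep empty lines inside the range
        leaves = " ".join(str(d) for d in rows.get(idx, []))
        lines.append(f"{idx // lines_per_stem} | {leaves}".rstrip())
    return lines

# Hypothetical birth weights in pounds, not the data of Table 1.8.
weights = [5.6, 5.8, 6.1, 6.4, 6.9, 7.2, 7.2, 7.5]
print("\n".join(stretched_stem_plot(weights, lines_per_stem=2)))
```

Indexing each line by `tenths // width` keeps the lines in order and makes it easy to show lines that have no leaves.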
  • Introductory Statistics
    • Prem S. Mann(Author)
    • 2020(Publication Date)
    • Wiley
      (Publisher)
    FIGURE 2.17 Ranked stem-and-leaf display of test scores.
    5 | 0 2 7
    6 | 1 4 5 8 9
    7 | 1 1 2 2 5 6 7 9 9
    8 | 0 1 3 4 6 7 7
    9 | 2 2 3 5 6 8
    As already mentioned, one advantage of a stem-and-leaf display is that we do not lose information on individual observations. We can rewrite the individual scores of the 30 college students from the stem-and-leaf display of Figure 2.16 or Figure 2.17. By contrast, the information on individual observations is lost when data are grouped into a frequency table.
    EXAMPLE 2.9 Monthly Rents Paid by Households
    Solution Each of the values in the data set contains either three or four digits. We will take the first digit for three-digit numbers and the first two digits for four-digit numbers as stems. Then we will use the last two digits of each number as a leaf. Thus, for the first value, which is 880, the stem is 8 and the leaf is 80. The stems for the entire data set are 6, 7, 8, 9, 10, 11, 12, and 13. They are recorded on the left side of the vertical line in Figure 2.18. The leaves for the numbers are recorded on the right side.
    Sometimes a data set may contain too many stems, with each stem containing only a few leaves. In such cases, we may want to condense the stem-and-leaf display by grouping the stems. Example 2.10 describes this procedure.
    EXAMPLE 2.10 Number of Hours Spent Working on Computers by Students
    Preparing a Grouped Stem-and-Leaf Display
    The following stem-and-leaf display is prepared for the number of hours that 25 students spent working on computers during the past month.
    0 | 6
    1 | 1 7 9
    2 | 2 6
    3 | 2 4 7 8
    4 | 1 5 6 9 9
    5 | 3 6 8
    6 | 2 4 4 5 7
    7 |
    8 | 5 6
    Prepare a new stem-and-leaf display by grouping the stems.
    Solution To condense the given stem-and-leaf display, we can combine the first three rows, the middle three rows, and the last three rows, thus getting the stems 0–2, 3–5, and 6–8. The leaves for
Best practice for a stem-and-leaf plot is to always rank the leaves, that is, to list the leaves of each stem in ascending order.
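The grouping step of Example 2.10 can be sketched in Python. The 25 values below are read off the display in the excerpt, taking stem 7 to have no leaves, which is what makes the count come to 25. The `*` separator between the leaves of different original stems is an assumption made for readability, and the function name is hypothetical.

```python
hours = [6, 11, 17, 19, 22, 26, 32, 34, 37, 38, 41, 45, 46, 49, 49,
         53, 56, 58, 62, 64, 64, 65, 67, 85, 86]

def grouped_stem_plot(values, group_size=3, sep="*"):
    """Condense a stem-and-leaf display by combining consecutive stems.

    Leaves are ranked within each original stem; `sep` marks where one
    original stem ends and the next begins (an assumed convention).
    """
    stems = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems.setdefault(stem, []).append(str(leaf))
    lines = []
    for start in range(0, max(stems) + 1, group_size):
        parts = [" ".join(stems.get(s, [])) for s in range(start, start + group_size)]
        lines.append(f"{start}-{start + group_size - 1} | " + f" {sep} ".join(parts))
    return lines

print("\n".join(grouped_stem_plot(hours)))
```

The empty stem 7 shows up as two adjacent separators in the last row, so the reader can still tell which original stem each leaf belongs to.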
  • An Essential Guide to Business Statistics
    • Dawn A. Willoughby(Author)
    • 2016(Publication Date)
    • Wiley
      (Publisher)
    [Figure: Production of bananas by country]
    Stem and Leaf Diagrams
    A technique for organising quantitative data is to draw a stem and leaf diagram. We can use this type of diagram to observe patterns in our data set and highlight interesting features. Each individual data value is retained when we draw the stem and leaf diagram, but for large data sets it can be a time-consuming process. To draw the diagram, we need to separate the digits in each data value into a stem and a leaf. The stem consists of all digits except the rightmost one, and the leaf is the rightmost digit only. For single digits, use 0 as the stem.
    Data value   Stem   Leaf
    36           3      6
    417          41     7
    9            0      9
    Each leaf should consist of the single rightmost digit only.
    There are three steps involved in drawing the diagram for a data set.
    1. Write the stem digits in ascending order in a vertical column, drawing a vertical line to the right of the column.
    2. For each stem, write all of the corresponding leaves in the data set along the same horizontal line.
    3. Rearrange each line of leaves in ascending order.
    Within the range of the data values, all stems should be shown on the diagram even if the data set does not contain any corresponding leaves. Every stem and leaf diagram should have a key to provide information about how the data in the diagram should be interpreted. The data set in the following example consists of the scores achieved in an aptitude test taken by applicants for a graduate marketing role in a leading food and drink manufacturing company.
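The three steps, the rule about showing every stem in the range, and the key can be sketched as follows. The aptitude scores are not included in the excerpt, so the sample values are hypothetical, as are the function names; the wording of the key is one possible choice.

```python
def split_value(value):
    """Split a non-negative whole number into (stem, leaf).

    The leaf is the rightmost digit; a single-digit value gets stem 0.
    """
    return divmod(value, 10)

def stem_and_leaf(values):
    rows = {}
    for v in values:                                  # step 2: leaves beside their stem
        stem, leaf = split_value(v)
        rows.setdefault(stem, []).append(leaf)
    lines = []
    for stem in range(min(rows), max(rows) + 1):      # step 1: every stem in the range
        leaves = sorted(rows.get(stem, []))           # step 3: ascending leaves
        lines.append((f"{stem} | " + " ".join(map(str, leaves))).rstrip())
    lines.append("Key: 3 | 6 means 36")
    return lines

print("\n".join(stem_and_leaf([36, 31, 52, 9])))
```

Stems 1, 2 and 4 have no leaves in this small sample but still appear, which keeps the vertical scale honest about gaps in the data.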
  • Applied Statistics and Probability for Engineers
    • Douglas C. Montgomery, George C. Runger(Authors)
    • 2018(Publication Date)
    • Wiley
      (Publisher)
    To construct a stem-and-leaf diagram, use the following steps.
    Steps to Construct a Stem-and-Leaf Diagram
    (1) Divide each number x_i into two parts: a stem, consisting of one or more of the leading digits, and a leaf, consisting of the remaining digit.
    (2) List the stem values in a vertical column.
    (3) Record the leaf for each observation beside its stem.
    (4) Write the units for stems and leaves on the display.
    To illustrate, if the data consist of percent defective information between 0 and 100 on lots of semiconductor wafers, we can divide the value 76 into the stem 7 and the leaf 6. In general, we should choose relatively few stems in comparison with the number of observations. It is usually best to choose between 5 and 20 stems.
    EXAMPLE 6.4 Alloy Strength
    To illustrate the construction of a stem-and-leaf diagram, consider the alloy compressive strength data in Table 6.2. We select as stem values the numbers 7, 8, 9, … , 24. The resulting stem-and-leaf diagram is presented in Figure 6.4. The last column in the diagram is a frequency count of the number of leaves associated with each stem. Inspection of this display immediately reveals that most of the compressive strengths lie between 110 and 200 psi and that a central value is somewhere between 150 and 160 psi. Furthermore, the strengths are distributed approximately symmetrically about the central value. The stem-and-leaf diagram enables us to determine quickly some important features of the data that were not immediately obvious in the original display in Table 6.2.
    In some data sets, providing more classes or stems may be desirable. One way to do this would be to modify the original stems as follows: Divide stem 5 into two new stems, 5L and 5U. Stem 5L has leaves 0, 1, 2, 3, and 4, and stem 5U has leaves 5, 6, 7, 8, and 9.
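The frequency column and the guideline of 5 to 20 stems lend themselves to a quick check in code. The sketch below assumes whole-number data with the last digit as the leaf; the function names and the sample values are hypothetical and are not the data of Table 6.2.

```python
from collections import Counter

def stem_frequencies(values):
    """Count the leaves attached to each stem (leaf = last digit)."""
    return Counter(v // 10 for v in values)

def stem_count_ok(values, low=5, high=20):
    """Check whether the stems spanned by the data number between low and high."""
    n_stems = max(values) // 10 - min(values) // 10 + 1
    return low <= n_stems <= high

# Hypothetical strengths in psi.
strengths = [76, 87, 101, 105, 110, 115, 245]
print(stem_frequencies(strengths)[10])   # leaves on stem 10
print(stem_count_ok(strengths))          # stems 7 through 24
```

If the check fails because there are too few stems, splitting each stem into L and U halves doubles the number of rows.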
  • Applied Statistical Designs for the Researcher
    • Daryl S. Paulson(Author)
    • 2003(Publication Date)
    • CRC Press
      (Publisher)
    Finally, although EDA calculations are generally simple, in practice they are tedious and time consuming to perform. The use of statistical software, particularly if large data sets are involved, is nearly mandatory. Fortunately, there are many software packages (e.g., SPSSX®, SAS®, MiniTab®) that contain many useful EDA application subroutines. By far the most user-friendly package is MiniTab, which we will use in tandem with the paper-and-pencil statistical analyses performed in this book. Let us now look at some individual EDA applications.
    I. STEM-AND-LEAF DISPLAYS
    As the applied researcher is aware, and as we have discussed, data sets (known as batches in EDA) come in variously shaped distributions. It is useful to know the shape of the data set(s) to ascertain that the data are normally distributed, etc., prior to conducting a parametric statistical test. A stem-and-leaf display is a data-ordering and presentation procedure that provides a convenient and direct way to become acquainted with the data. When a data set contains only a few values, a stem-and-leaf display is simple to construct with pencil and paper. Larger data sets are more conveniently displayed by using a computer.
    The stem-and-leaf display was first presented by John Tukey [14] and is widely used throughout the statistical field. It is a type of frequency distribution that combines the leftmost digit(s) of each data value (stem) with the next digit to the right (leaf) simultaneously in an ordered manner. Individual data values are also easy to recover from the display* because, unlike the case of histograms, where data are grouped into categories and only the category frequencies are plotted, the numerical values do not disappear.
    *The recovered value may be only a two- or three-digit approximation to the original value if some digits on the right were truncated (see step 1 in the example that follows).
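The footnote about truncation can be made concrete with a small sketch. It assumes whole-number data and a `leaf_unit` that is a power of ten; both function names are hypothetical, and the example values are invented for illustration.

```python
def split_truncated(value, leaf_unit):
    """Drop the digits to the right of the leaf, then split into (stem, leaf)."""
    scaled = value // leaf_unit        # truncation, not rounding
    return divmod(scaled, 10)

def recover(stem, leaf, leaf_unit):
    """Rebuild the (possibly approximate) value shown by a stem and leaf."""
    return (stem * 10 + leaf) * leaf_unit

stem, leaf = split_truncated(1237, leaf_unit=10)
print(stem, leaf)                      # stem and leaf kept in the display
print(recover(stem, leaf, 10))         # approximation to the original 1237
```

With a leaf unit of 1 nothing is truncated and the recovered value equals the original; with a larger leaf unit the display keeps only the leading digits.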
  • Statistical Methods
    • Rudolf J. Freund, William J. Wilson(Authors)
    • 2003(Publication Date)
    • Academic Press
      (Publisher)
    Before the widespread use of automatic data recording equipment and computers, most data were laboriously recorded from laboratory manuals or similar records and then manually entered into calculators where the calculations were usually performed in several stages. During this long and laborious process, it was relatively easy to spot unusual observations and, in general, to get a “feel” for the data and thus recognize the possible need for altering the analysis strategy.
    Certainly the automatic recording and computing equipment available today provide greater speed, convenience, and accuracy, as well as more complete and comprehensive analyses. However, these analyses are performed without the help of human intervention and may consequently result in beautifully executed and handsomely annotated computer output of inappropriate analyses on faulty data.
    Fortunately, the same computers that can so easily produce inappropriate analyses can just as easily be used to perform preliminary data screening to provide an overview of the nature of the data and thus provide information on unusual distributions and/or data anomalies. A variety of such procedures have been developed and many are available on most popularly used computer software. These procedures are called exploratory data analysis techniques or EDA, which was first introduced by Tukey (1977). We present here two of the most frequently used EDA tools: the stem and leaf plot and the box plot.
    The Stem and Leaf Plot
    The stem and leaf plot is a modification of a histogram for a ratio or interval variable that provides additional information about the distribution of the variable. The first one or two digits specify the class interval, called the “stem,” and the next digit (rounded if necessary) is used to construct increments of the bar, which are called the “leaves.” Usually in a stem and leaf plot, the bars are arranged horizontally and the leaf values are arranged in ascending order.
  • Introduction to Statistics and Data Analysis
    • Roxy Peck, Chris Olsen, Tom Short(Authors)
    • 2019(Publication Date)
    Stem-and-leaf displays can be useful to get a sense of a typical value for the data set, as well as a sense of how variable the values in the data set are. It is also easy to spot data values that are unusually far from the rest of the values in the data set. Such values are called outliers. The stem-and-leaf display of the wireless percentage data (Figure 3.11) does not show any outliers.
    DEFINITION Outlier: An unusually small or large data value. A precise rule for deciding when an observation is an outlier is given in Chapter 4.
    Stem-and-Leaf Displays
    When to Use Numerical data sets with a small to moderate number of observations (does not work well for very large data sets)
    How to Construct
    1. Select one or more leading digits for the stem values. The trailing digits (or sometimes just the first one of the trailing digits) become the leaves.
    2. List possible stem values in a vertical column.
    3. Record the leaf for every observation beside the corresponding stem value.
    4. Indicate the units for stems and leaves someplace in the display.
    State   Wireless %
    NJ      25.1
    NY      31.1
    NC      42.9
    OH      45.8
    PA      30.0
    RI      34.6
    SC      49.5
    VA      41.1
    VT      37.2
    WV      37.2
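Step 1 leaves a choice between keeping all trailing digits as the leaf or only the first one. The sketch below shows both options on the ten state percentages that survive in the table fragment above, which is only the tail of the book's table. The function name `display` and its `leaf_digits` parameter are hypothetical, and the code assumes values between 10.0 and 99.9 recorded to one decimal place.

```python
wireless = [25.1, 31.1, 42.9, 45.8, 30.0, 34.6, 49.5, 41.1, 37.2, 37.2]

def display(values, leaf_digits=1):
    """Map each stem (tens digit) to its leaves.

    leaf_digits=1 keeps only the first trailing digit; leaf_digits=2 keeps
    both trailing digits. Assumes 10.0 <= value <= 99.9 with one decimal.
    """
    rows = {}
    for v in sorted(values):
        digits = f"{v:.1f}".replace(".", "")      # 25.1 -> "251"
        stem, trailing = digits[0], digits[1:]
        rows.setdefault(stem, []).append(trailing[:leaf_digits])
    return rows

for stem, leaves in display(wireless).items():
    print(stem, "|", " ".join(leaves))
print("Stem: tens, Leaf: ones")
```

One-digit leaves give the more compact display; two-digit leaves keep every recorded digit at the cost of a wider, harder-to-scan row.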
  • Engineering Statistics
    • Douglas C. Montgomery, George C. Runger, Norma F. Hubele(Authors)
    • 2011(Publication Date)
    • Wiley
      (Publisher)
    In general, we should choose relatively few stems in comparison with the number of observations. It is usually best to choose between 5 and 20 stems. Once a set of stems has been chosen, the stems are listed along the left-hand margin of the diagram. Beside each stem all leaves corresponding to the observed data values are listed in the order in which they are encountered in the data set.
    [Figure: stem-and-leaf diagram of the compressive strength data, with stems 7 through 24 and a frequency column reading 1, 1, 1, 2, 3, 3, 6, 8, 12, 10, 10, 7, 6, 4, 1, 3, 1, 1]
    EXAMPLE 2-5
    Figure 2-5 illustrates the stem-and-leaf diagram for 25 observations on batch yields from a chemical process. In Fig. 2-5a we used 6, 7, 8, and 9 as the stems. This results in too few stems, and the stem-and-leaf diagram does not provide much information about the data. In Fig. 2-5b we divided each stem into two parts, resulting in a display that more adequately displays the data. Figure 2-5c illustrates a stem-and-leaf display with each stem divided into five parts. There are too many stems in this plot, resulting in a display that does not tell us much about the shape of the data.
    [Figure 2-5 Stem-and-leaf displays for Example 2-5: (a) stems 6, 7, 8, 9; (b) stems 6L, 6U through 9L, 9U; (c) stems 6z, 6t, 6f, 6s, 6e through 9z, 9t, 9f, 9s, 9e]
    In some data sets, it may be desirable to provide more classes or stems. One way to do this is to modify the original stems as follows: Divide the stem 5 (say) into two new stems, 5L and 5U. The stem 5L has leaves 0, 1, 2, 3, and 4, and stem 5U has leaves 5, 6, 7, 8, and 9.
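The split-stem labels can be generated mechanically. The excerpt defines L and U explicitly; the five-part labels z, t, f, s, e appear in Figure 2-5(c) without explanation, and this sketch assumes the usual reading (zero and one, two and three, four and five, six and seven, eight and nine). The function name is hypothetical.

```python
def split_stem_label(stem, leaf, parts=2):
    """Return the label of the split stem that a (stem, leaf) pair falls on.

    parts=2 uses L for leaves 0-4 and U for leaves 5-9.
    parts=5 uses z, t, f, s, e for leaves 0-1, 2-3, 4-5, 6-7, 8-9
    (assumed reading of the labels in Figure 2-5(c)).
    """
    if parts == 2:
        return f"{stem}{'L' if leaf < 5 else 'U'}"
    if parts == 5:
        return f"{stem}{'ztfse'[leaf // 2]}"
    raise ValueError("parts must be 2 or 5")

# A yield of 75 falls on stem 7U in a two-part display and 7f in a five-part one.
print(split_stem_label(7, 5, parts=2), split_stem_label(7, 5, parts=5))
```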
  • Beginning Statistics with Data Analysis
    • Frederick Mosteller, Stephen E. Fienberg, Robert E.K. Rourke(Authors)
    • 2013(Publication Date)
    2. For each measurement, we place on the right side of the vertical bar its units’ digit opposite its listed tens’ digit. For example, 78 is entered thus: 7|8.
    3. We may rearrange the units’ digits on the right in ascending order of magnitude to get Fig. 2-2(b).
             Figures 2-2(a) and 2-2(b) are called stem-and-leaf diagrams. Each row is called a stem; the numbers on the left of the vertical bar are stem ends, and the numbers on the right of the bar are called leaves. It is usually a good idea to use one-digit leaves. This may require the use of stem ends that have two or more digits.
        
    Figure 2-2 Stem-and-leaf diagram for areas of counties in Vermont (unit: 10 square miles).
    Source: Constructed from data in the World Almanac & Book of Facts 1980, edited by G. E. Delury, p. 244. New York: Newspaper Enterprise Association.
             The stem-and-leaf diagram is more informative than the tally. For example, we can recover to two digits the actual measurements from the stem-and-leaf diagram by inspection. The measurements in the 70–79 range are seen at once to be 71, 72, 78, 78. Indeed, the stem-and-leaf diagram offers most of the features of a histogram plus detailed information about the measurements.
             To see the full effectiveness of stem-and-leaf diagrams in organizing data, let us look at a larger set of measurements, a historical set that helped create the field of industrial quality control.
    EXAMPLE 2 Cost of electrical insulation. To reduce the cost of electrical insulation, a substitute insulating material was extensively tested for its electrical resistance. The first 50 measurements are given in Table 2-1. Reading from left to right beginning at the top row gives the order in which measurements were made. Use these data to make (a) a tally and (b) a stem-and-leaf diagram.
    TABLE 2-1 Electrical resistances of insulation (unit: 100 megohms)
    Source: W. A. Shewhart (1931). Economic Control of Quality of Manufactured Product
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.