Mathematics

Histograms

Histograms are graphical representations of the distribution of numerical data. They consist of a series of adjacent rectangles, where the area of each rectangle corresponds to the frequency of data within a certain range. Histograms are commonly used to visualize the shape, center, and spread of a dataset, making them a valuable tool for data analysis.

Written by Perlego with AI-assistance

10 Key excerpts on "Histograms"

  • Statistical Reasoning in the Behavioral Sciences
    • Bruce M. King, Patrick J. Rosopa, Edward W. Minium(Authors)
    • 2018(Publication Date)
    • Wiley
      (Publisher)
    We measure width and height from the span of the graphed data rather than from the borders of the graph. It is important that a graph be large enough so that others can easily read the values (both along the horizontal and vertical axes and within the graph itself). The graph of any frequency distribution should also have a succinct and informative title as well as labels on both axes. By now, you can see that some procedures are basically arbitrary and governed by conventions. These conventions are useful because they result in representations that match our expectations. Then we do not have to approach each new graph as a problem-solving exercise before absorbing its meaning. Keep these general procedures in mind as we look at the different types of graphed frequency distributions.

    3.2 The Histogram

    Suppose that after constructing a grouped frequency distribution of your history exam scores (see Table 3.1), your history professor wants to display the results as a graph. There are two basic ways in which she could do this. One of them is the histogram shown in Figure 3.1. The histogram consists of a series of rectangles, each of which represents the scores in one of the class intervals of the distribution. (Margin definition: histogram, a graph that consists of a series of rectangles, the heights of which represent frequency or relative frequency.) The two vertical boundaries of a rectangle coincide with the real limits of the particular interval, whereas its height depends on the frequency of scores for that interval. We can use either raw frequencies or relative frequencies (proportions or percentages) with a histogram. Changing from raw to relative frequencies requires only that we relabel the vertical axis. The steps in constructing a histogram are summarized as follows (see Table 3.1 and Figure 3.1, for examples):

    Step 1: Construct a frequency distribution in table form as described in Section 2.3.
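The excerpt above describes rectangles bounded by the real limits of each class interval, with heights given by raw or relative frequencies. The following is a minimal sketch of that bookkeeping, not the textbook's own procedure. The function name `grouped_frequencies`, the integer-score assumption (real limits half a unit beyond the apparent limits), and the scores themselves are all hypothetical.

```python
def grouped_frequencies(scores, lowest, width, n_classes):
    """Tabulate integer scores into equal-width class intervals.

    Assumes integer scores, so the real limits lie 0.5 beyond the
    apparent limits of each interval (an illustrative assumption).
    """
    rows = []
    for k in range(n_classes):
        lo = lowest + k * width      # apparent lower limit
        hi = lo + width - 1          # apparent upper limit
        count = sum(lo <= s <= hi for s in scores)
        rows.append({
            "real_limits": (lo - 0.5, hi + 0.5),  # vertical edges of the rectangle
            "frequency": count,                   # raw frequency
            "relative": count / len(scores),      # proportion of all scores
        })
    return rows


# Hypothetical exam scores, invented for illustration only.
scores = [62, 67, 71, 74, 75, 78, 80, 81, 83, 85, 88, 90, 91, 94, 97]
table = grouped_frequencies(scores, lowest=60, width=10, n_classes=4)
for row in table:
    print(row)
```

Because each relative frequency is the raw frequency divided by the same total, switching between the two changes only the scale of the vertical axis, which matches the relabeling remark in the excerpt.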
  • Reeds Vol 1: Mathematics for Marine Engineers
    • Kevin Corner, Leslie Jackson, William Embleton(Authors)
    • 2013(Publication Date)
    • Thomas Reed
      (Publisher)
    Histograms

    These are used to represent frequency distributions, and consist of a series of touching rectangles with bases on a horizontal axis. Each rectangle represents a value or class in the distribution and its height is drawn so that the area of the rectangle is proportional to the frequency it represents. Thus the total area of the histogram represents the total frequency in the distribution. Figure 14.1 shows the histogram of the sidereal hour angle data given in Table 14.4. The labelling of the classes is important since this is a discrete frequency distribution and yet the histogram demands a series of touching rectangles drawn on a continuous scale. It is as though each rectangle is extended by ½° in each direction, thus the second one is drawn from 29½° to 39½°. In a continuous distribution there is no need for this extension and it is most usual to label the end of the classes. Hence Figure 14.2 shows the histogram for the ships’ arrival patterns shown in Table 14.3.

    [Figure 14.1: frequency against sidereal hour angle]
    [Figure 14.2: number of ships passing Felixstowe Ledge buoy per hour against time of day]

    Frequency Polygons

    Instead of a histogram, which is rather tedious to draw, a frequency polygon may be used, as shown in Figure 14.3. This is just a straight line graph joining the data points together. The points are plotted at the midpoint of the class, for example, the point for the data representing 13.00 to 14.00 is plotted at 13.30.

    Cumulative Frequency Curves (Ogives)

    This plots the running total of the frequencies against the variable. The data representing the sidereal hour angles (Table 14.4) becomes Table 14.5.
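The excerpt above states that each rectangle's area, not merely its height, is proportional to the frequency it represents. A short sketch of what that implies when class widths differ is given below. The helper `bar_heights`, the class edges, and the frequencies are hypothetical and chosen only to illustrate the idea. The class midpoints used for a frequency polygon are computed as well.

```python
def bar_heights(class_edges, frequencies):
    """Return heights such that height * class width equals the frequency."""
    widths = [hi - lo for lo, hi in zip(class_edges, class_edges[1:])]
    return [f / w for f, w in zip(frequencies, widths)]


# Hypothetical classes of unequal width: 0-10, 10-20, 20-40.
edges = [0, 10, 20, 40]
freqs = [5, 8, 6]

heights = bar_heights(edges, freqs)

# Area of each rectangle recovers the frequency of its class.
areas = [h * (hi - lo) for h, lo, hi in zip(heights, edges, edges[1:])]

# A frequency polygon plots its points at the class midpoints.
midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

print(heights)
print(areas)
print(midpoints)
```

When every class has the same width, dividing by the width rescales all bars equally, so plotting the raw frequencies as heights gives the same picture.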
  • Introduction to Statistics and Data Analysis
    • Roxy Peck, Chris Olsen, Tom Short(Authors)
    • 2019(Publication Date)
    Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

    3.3 Displaying Numerical Data: Frequency Distributions and Histograms

    Also note that the rectangle representing the score range 400 to 499 actually extends from 399.5 to 499.5 on the score scale. This is similar to what happens in histograms for discrete numerical data where there is no grouping. For example, in Figure 3.17 the rectangle representing 2 is centered at 2 but extends from 1.5 to 2.5 on the number of partners scale.

    Frequency Distributions and Histograms for Continuous Numerical Data

    For continuous data, such as observations on reaction time (in seconds) or weight of airline passenger carry-on luggage (in pounds), there are no natural categories. In this case, we define our own categories. For carry-on luggage weight, we might expect weights up to about 30 pounds. One way to group the weights into 5-pound intervals is shown in Figure 3.20. Then each observed data value could be classified into one of these intervals. The intervals used are sometimes called class intervals. The class intervals play the same role that the categories or individual values played in frequency distributions for categorical or discrete numerical data.

    [Figure 3.20: weight scale marked at 0, 5, 10, 15, 20, 25, 30]

    There is one further difficulty that needs to be addressed. Where should we place an observation such as 20, which falls on a boundary between classes? Our convention is to define intervals so that such an observation is placed in the upper rather than the lower class interval.
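The boundary convention in the excerpt above (a value that falls exactly on a class boundary goes into the upper class) corresponds to intervals that are closed on the left and open on the right. The sketch below shows one way to express that. The function `class_index` and the luggage weights are hypothetical, and the 5-pound edges follow the grouping the excerpt describes.

```python
from bisect import bisect_right


def class_index(value, edges):
    """Index of the class [edges[i], edges[i+1]) that contains value.

    A value lying exactly on an interior boundary is assigned to the
    upper class, following the convention in the excerpt.
    """
    if not edges[0] <= value < edges[-1]:
        raise ValueError(f"{value} lies outside the class intervals")
    return bisect_right(edges, value) - 1


edges = [0, 5, 10, 15, 20, 25, 30]       # 5-pound class intervals
weights = [4.9, 5.0, 12.3, 20.0, 29.9]   # hypothetical luggage weights

counts = [0] * (len(edges) - 1)
for w in weights:
    counts[class_index(w, edges)] += 1

print(class_index(20, edges))  # 20 falls in the class 20 to under 25
print(counts)
```

Other conventions exist (for example, assigning boundary values to the lower class), so the choice should be stated whenever a frequency distribution is reported.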
  • Statistics for the Social Sciences: A General Linear Model Approach
    … or wrong answer, although too few bars will result in a histogram that provides little useful information, while too many bars may make the histogram too complex to be useful.

    [Figure 4.6: Histogram representing siblings’ ages from the Waite et al. (2005) study; frequency against sibling's age.]
    [Figure 4.7: Blank histogram with interval size widths of 5.]
    [Figure 4.8: Histogram of siblings’ ages from the Waite et al. (2015) study with interval widths of 5.]

    Describing Shapes of Histograms

    Histograms can take on many shapes, and a special terminology has been developed to describe them. One way to describe distributions is in terms of how symmetrical they are; another is in terms of the height of parts of the distribution. Both ways of describing distributions require a standard to compare against. That standard is the normal distribution.

    The Normal Distribution. The normal distribution is a special distribution, as shown in Figure 4.9. The normal distribution has several characteristics, all of which are apparent in Figure 4.9:
    • The normal distribution is a symmetrical histogram. In other words, the left and right halves of the distribution are mirror images of one another.
    • The peak is in the exact middle of the distribution.
    • Mean = median = mode.
    • The tails of the distribution never touch the x-axis. This means that the normal distribution theoretically extends from –∞ to +∞.

    Figure 4.9 is a histogram. However, it differs from other histograms (e.g., Figures 4.3 and 4.8) because it appears to consist of a smooth curve, instead of boxed intervals.
  • Understandable Statistics: Concepts and Methods, Enhanced
    • Charles Henry Brase, Corrinne Pellillo Brase(Authors)
    • 2016(Publication Date)
    WHAT DO HISTOGRAMS AND RELATIVE FREQUENCY HISTOGRAMS TELL US?

    Histograms and relative frequency histograms show us how the data are distributed. By looking at such graphs, we can tell
    • if the data distribution is more symmetric, skewed, or bimodal;
    • if there are possible outliers;
    • which data intervals contain the most data;
    • how spread out the data are.
    In the next chapter we will look at measures of center and spread of data. Histograms help us visualize such measures.

    PROCEDURE: HOW TO MAKE AN OGIVE
    1. Make a frequency table showing class boundaries and cumulative frequencies.
    2. For each class, make a dot over the upper class boundary at the height of the cumulative class frequency. The coordinates of the dots are (upper class boundary, cumulative class frequency). Connect these dots with line segments.
    3. By convention, an ogive begins on the horizontal axis at the lower class boundary of the first class.

    CUMULATIVE-FREQUENCY TABLE AND OGIVE

    Aspen, Colorado, is a world-famous ski area. If the daily high temperature is above 40°F, the surface of the snow tends to melt. It then freezes again at night. This can result in a snow crust that is icy. It also can increase avalanche danger. Table 2-11 gives a summary of daily high temperatures (°F) in Aspen during the 151-day ski season.
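The three-step ogive procedure in the excerpt above can be sketched as a small function that returns the points to be joined by line segments. The name `ogive_points` and the class boundaries and frequencies below are hypothetical. They are not the Aspen temperature data of Table 2-11, which is not reproduced in the excerpt.

```python
from itertools import accumulate


def ogive_points(first_lower_boundary, upper_boundaries, frequencies):
    """Return (x, cumulative frequency) points for an ogive.

    The curve starts on the horizontal axis at the lower boundary of the
    first class, then has one point per class at its upper boundary.
    """
    cumulative = list(accumulate(frequencies))  # running total of frequencies
    return [(first_lower_boundary, 0)] + list(zip(upper_boundaries, cumulative))


# Hypothetical class boundaries and frequencies, for illustration only.
upper = [20.5, 30.5, 40.5, 50.5]
freqs = [3, 7, 12, 8]

points = ogive_points(10.5, upper, freqs)
print(points)
```

The last point's height equals the total number of observations, which gives a quick check that the cumulative column was built correctly.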
  • Statistics for Anthropology
    Figures 2.2 and 2.3 show bar graphs of the number of children produced by all 100 women considered as a single group, and divided by religion (all of these were obtained by SAS). Figure 2.4 shows yet another option, namely, a horizontal bar chart, which in addition includes a frequency distribution of both samples. Although it is a bit “busy” this graph/chart combines a lot of useful information. Pie charts are another popular choice when it comes to graphing discrete variables, either discontinuous numeric or qualitative. I used the data shown in Table 2.4 (women divided into height categories) to produce the pie chart shown in Figure 2.5.

    2.2.2 Histograms

    Histograms are rather similar to bar graphs, but are used for continuous numeric data. Since such data are in principle infinite (there are infinite numbers of possible heights between 170 and 171 cm), the bars of a histogram do “touch” each other. Most computer programs will allow you to choose how many bars to show and how narrow or broad those bars should be. In addition, both PASW and SAS allow you to superimpose a normal curve on your histogram, which will help you judge visually if your data depart dramatically from a normal distribution. What a normal distribution is, and how we test if a sample is normally distributed will be discussed below. Figure 2.6 is a histogram of weight of the women shown as a single sample, with a normal curve superimposed.

    2.2.3 Polygons

    These graphs are also applied to continuous numerical data, and are frequently preferred over histograms. Some computer packages refer to them as X*Y graphs, in which X represents the variable and Y the frequency of its various outcomes. Figures 2.7 and 2.8 show two polygons of height, for the women in our sample divided by religious affiliation.

    2.2.4 Box plots

    This is a very popular way to convey graphically basic information about the distribution of a sample.
  • Statistics for the Social Sciences: A General Linear Model Approach
    [Figure 3.6: Histogram representing siblings’ ages from the Waite et al. (2005) study; frequency against sibling's age.]

    Describing Shapes of Histograms

    Histograms can take on many shapes, and a special terminology has been developed to describe them. One way to describe distributions is in terms of how symmetrical they are; another is in terms of the height of parts of the distribution. Both ways of describing distributions require a standard to compare against. That standard is the normal distribution.

    The Normal Distribution. The normal distribution is a special distribution, as shown in Figure 3.9. The normal distribution has several characteristics, all of which are apparent in Figure 3.9:
    • The normal distribution is a symmetrical histogram. In other words, the left and right halves of the distribution are mirror images of one another.
    • The peak is in the exact middle of the distribution.
    • Mean = median = mode.
    • The tails of the distribution never touch the x-axis. This means that the normal distribution theoretically extends from –∞ to +∞.

    [Figure 3.7: Blank histogram with interval size of 5.]
    [Figure 3.8: Histogram of siblings’ ages from the Waite et al. (2015) study with intervals that are 5 units wide.]

    Figure 3.9 is a histogram. However, it differs from other histograms (e.g., Figures 3.3 and 3.8) in that it appears to consist of a smooth curve, instead of boxy intervals. This is because the normal distribution is idealized; instead of a limited number of intervals, the normal distribution theoretically has an infinite number of intervals, all of which are extremely narrow. When viewed together, the intervals lose their boxiness and instead appear as a smooth curve.
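The closing remark of the excerpt above, that very narrow intervals make the boxes blend into a smooth curve, can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the textbook: it uses the standard normal distribution from Python's `statistics.NormalDist`, draws each bar with height equal to interval probability divided by interval width, and measures the largest gap between a bar and the curve at the bar's midpoint. The function name `max_gap` and the range of -4 to 4 are arbitrary choices.

```python
from statistics import NormalDist


def max_gap(width, lo=-4.0, hi=4.0):
    """Largest difference between bar height and curve height.

    Each bar covers [a, a + width] and has height probability / width,
    so that its area equals the probability of the interval.
    """
    std_normal = NormalDist()            # mean 0, standard deviation 1
    n_bars = round((hi - lo) / width)
    worst = 0.0
    for k in range(n_bars):
        a = lo + k * width
        b = a + width
        bar_height = (std_normal.cdf(b) - std_normal.cdf(a)) / width
        curve_height = std_normal.pdf((a + b) / 2)
        worst = max(worst, abs(bar_height - curve_height))
    return worst


for width in (1.0, 0.5, 0.1):
    print(width, max_gap(width))
```

The printed gap shrinks as the intervals narrow, which is the sense in which the idealized curve is the limit of ever finer histograms.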
  • Understanding Basic Statistics, International Metric Edition
    • Charles Henry Brase, Corrinne Pellillo Brase(Authors)
    • 2016(Publication Date)
    Chapter 2 ORGANIZING DATA

    Frequency Distributions, Histograms, and Related Topics

    FOCUS POINTS
    • Organize raw data using a frequency table.
    • Construct histograms, relative-frequency histograms, and ogives.
    • Recognize basic distribution shapes: uniform, symmetric, skewed, and bimodal.
    • Interpret graphs in the context of the data setting.

    FREQUENCY TABLES

    When we have a large set of quantitative data, it’s useful to organize it into smaller intervals or classes and count how many data values fall into each class. A frequency table does just that. A frequency table partitions data into classes or intervals of equal width and shows how many data values are in each class. The classes or intervals are constructed so that each data value falls into exactly one class. Constructing a frequency table involves a number of steps. Example 1 demonstrates the steps.

    FREQUENCY TABLE

    A task force to encourage car pooling did a study of one-way commuting distances of workers in the downtown Dallas area. A random sample of 60 of these workers was taken. The commuting distances of the workers in the sample are given in Table 2-1. Make a frequency table for these data.

    SOLUTION: (a) First decide how many classes you want. Five to 15 classes are usually used. If you use fewer than five classes, you risk losing too much information. If you use more than 15 classes, the data may not be sufficiently summarized. Let the …

    SECTION 2.1: Although this section is not conceptually difficult, it may require a little more time.
  • Applied Statistics and Probability for Engineers
    • Douglas C. Montgomery, George C. Runger(Authors)
    • 2018(Publication Date)
    • Wiley
      (Publisher)
    6.3 Frequency Distributions and Histograms

    [Figure 6.8: A histogram of the compressive strength data with 17 bins; frequency against strength.]
    [Figure 6.9: A histogram of the compressive strength data with nine bins; frequency against strength.]

    Figure 6.8 is a histogram of the compressive strength data with 17 bins. We have noted that histograms may be relatively sensitive to the number of bins and their width. For small data sets, histograms may change dramatically in appearance if the number and/or width of the bins changes. Histograms are more stable and thus reliable for larger data sets, preferably of size 75 to 100 or more. Figure 6.9 is a histogram for the compressive strength data with nine bins. This is similar to the original histogram shown in Figure 6.7. Because the number of observations is moderately large (n = 80), the choice of the number of bins is not especially important, and both Figures 6.8 and 6.9 convey similar information.

    Figure 6.10 is a variation of the histogram available in some software packages, the cumulative frequency plot. In this plot, the height of each bar is the total number of observations that are less than or equal to the upper limit of the bin. Cumulative distributions are also useful in data interpretation; for example, we can read directly from Figure 6.10 that approximately 70 observations are less than or equal to 200 psi.

    When the sample size is large, the histogram can provide a reasonably reliable indicator of the general shape of the distribution or population of measurements from which the sample was drawn. See Figure 6.11 for three cases. The median is denoted as x̃. Generally, if the data are symmetric, as in Figure 6.11(b), the mean and median coincide. If, in addition, the data have only one mode (we say the data are unimodal), the mean, median, and mode all coincide.
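The excerpt above notes that a histogram of a small data set can change dramatically when the number of bins changes. A minimal sketch of that effect follows. The function `bin_counts` uses equal-width bins spanning the data range, which is an assumption made here for illustration, and the twelve data values are invented rather than taken from the compressive strength data.

```python
def bin_counts(data, n_bins):
    """Count observations in n_bins equal-width bins spanning the data range.

    Assumes the data are not all identical. The maximum value is placed
    in the last bin so that every observation is counted exactly once.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    return counts


# Hypothetical small sample with a gap in the middle.
data = [1, 2, 2, 3, 3, 3, 7, 8, 8, 9, 9, 9]

print(bin_counts(data, 2))  # two wide bins hide the gap
print(bin_counts(data, 4))  # four bins reveal an empty interval
```

With two bins the sample looks evenly spread, while four bins expose an empty interval, which is the kind of instability the excerpt warns about for small samples.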
  • Finite Mathematics: An Applied Approach
    • Michael Sullivan(Author)
    • 2011(Publication Date)
    • Wiley
      (Publisher)
    The numbers in the cf column are called the cumulative frequencies. NOW WORK PROBLEM 5(f).

    6 Draw an Ogive

    The graph in which the horizontal axis represents class intervals and the vertical axis represents cumulative frequencies is called an ogive (pronounced “Oh jive”).

    EXAMPLE 7 Draw an Ogive

    Use the cumulative frequency table in Table 9 to draw an ogive.

    SOLUTION Draw two axes and put the class intervals on the horizontal axis and the cumulative frequencies on the vertical axis. Plot 0 at the lower class limit of the smallest interval and then plot each cumulative frequency at the upper class limit of each class interval. Finally connect the points with straight line segments. See Figure 7.

    [Figure 7: cumulative frequency against class interval]

    NOW WORK PROBLEM 5(g).

    7 Identify the Shape of a Distribution

    One advantage of using histograms to display data is that they show the shape of a distribution. While many shapes are possible, two shapes are of particular interest. When the data in the frequency table are evenly spaced around a central point, the distribution is called symmetric. Symmetric distributions are either bell shaped or uniform. Common examples of symmetric distributions include IQ scores (bell shaped) and the outcomes when a fair die is repeatedly tossed (uniform). Histograms depicting symmetric distributions are shown in Figures 8(a) and (b).

    (a) Set up a frequency table for the above data.
    (b) Draw a line chart for the data.
    (c) Group the data into class intervals of size 2, beginning with the interval 24–25.9.
    (d) Build the histogram for the data.
    (e) Draw the frequency polygon for this histogram.
    (f) Find the cumulative frequencies.
    (g) Draw the ogive.
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.