Mathematics
Dot Plot
A dot plot is a simple way to display data using dots or points along a number line. Each dot represents a single data point, and the frequency of the data values is shown by the number of dots at each position. Dot plots are useful for visualizing the distribution and clustering of data points in a set.
Written by Perlego with AI-assistance
12 Key excerpts on "Dot Plot"
- eBook - PDF
Visual Statistics
Seeing Data with Dynamic Interactive Graphics
- Forrest W. Young, Pedro M. Valero-Mora, Michael Friendly(Authors)
- 2011(Publication Date)
- Wiley-Interscience(Publisher)
Figure 6.3 Three systematically jittered dotplots.
1. Data gathered by the first author from students in his course for introductory psychological statistics. The data were gathered on the first day of the course over several years. The variables concerned the attitudes of the students toward statistics, their experience with mathematics and computers, and their grade average and scores on the SAT (a nationally standardized test used widely in the United States as part of the university admissions procedure), along with their age, gender, and so on. Data available with ViSta.
6.3.2 Boxplots
Dotplots can be enhanced with the capability of adding or removing schematics, where the schematics provide a context in which we can evaluate the individual observations more completely. These schematics provide information about the center and spread of the data, such as the mean or median, the quartiles, the standard deviation, and so on. We discuss two schematics that are commonly used to enhance dotplots: boxplots and diamond plots. Figure 6.4 shows two boxplots on the left and a diamond plot in the third position. The boxplot and diamond plot are shown superimposed on the right-hand side. We review boxplots in this section and diamond plots in the next.
The boxplot is the most useful plot that results from adding a schematic to a dotplot. The schematic of the boxplot is based on the median and other quantile measures, as is described below. Boxplots were first described by Tukey (1977), who added some important variations shortly thereafter (McGill et al., 1978). Figure 6.4 shows the dotplot in Figure 6.2 with a boxplot drawn on top of it. The elements of the boxplot are the following:
• Box: The horizontal line in the center is located at the median; thus, half the data are above this line, half below.
- eBook - ePub
- Amar Sahay(Author)
- 2016(Publication Date)
- Business Expert Press(Publisher)
Figure 3.16 shows the Dot Plot of the data that represents the spot speed of 100 cars at a 65 mph speed limit zone. The plot shows that most of the cars were at or below the speed limit. There were 13 cars over the speed limit of 65 mph. The shape of the data is approximately symmetrical.
Figure 3.16 Dot Plot of 100 Cars at 65 mph Speed Zone
Figure 3.17 Dot Plot of Number of Cars Sold
Example 3.3
The Dot Plot in Figure 3.17 shows the number of cars sold by a dealership over a period of 100 days. The numbers of cars are the totals sold at four different locations of the same dealership. The horizontal axis shows the number of cars sold and the vertical axis shows the number of days. The first value on the horizontal axis is 2 with three dots above it. This means that exactly two cars were sold on each of three days. The total number of dots is 100, one for each of the 100 days.
Bar Charts
Bar charts are one of the most widely used charts for displaying categorical data. These charts can be used to display monthly or quarterly sales, revenue, and profits for a company. Figure 3.18 shows the monthly sales of a company with two columns of data; the first column is the categorical variable (month) and the second column contains the numerical values (e.g., sales in dollars). Figure 3.19 shows a variation of the bar chart.
Figure 3.18 A Bar Chart of Monthly Sales
Figure 3.19 Connected Line over the Bar Chart of Sales vs. Month
Example 3.4: More Examples of Bar Chart Categorical Data
a) Figure 3.20 shows a bar chart of the gold price from 1975 to 2011.
The above chart is useful in visualizing the trend and also the percent increase and decrease in the value over the years. For example, the percent increase in the price of gold (per ounce) between 1980 and 2011 can be determined as:
- eBook - ePub
- Craig Gygi, Bruce Williams, Neil DeCarlo(Authors)
- 2012(Publication Date)
- For Dummies(Publisher)
histogram takes the data from the Dot Plot and replaces the dots with bars. The following sections show you how to generate these helpful graphics and understand what the graphs are telling you.
Creating your own Dot Plots and histograms
After collecting measurements or data for a characteristic, create a Dot Plot for it by using the following steps:
1. Create a horizontal line that represents the scale of measure for the characteristic. This scale should be in whatever measure best quantifies the aspect of the characteristic you’re interested in: for example, millimeters for length, pounds for weight, minutes for time, or number of defects found on an inspected part.
2. Divide the horizontal scale of measure into equal chunks or “buckets” along its length. Select a bucket width that creates about 10 to 20 equal divisions between the largest and smallest observed values for the characteristic.
3. For each observed measurement of the characteristic, locate its value along the horizontal scale and place a dot for it in its corresponding bucket. If another observed measurement falls into the same “bucket,” stack the second (or third, or fourth) dot above the previous one.
4. Repeat Step 3 until you’ve placed all the observed measurements onto the plot.
To create a histogram (so that you can impress your peers with a graph that has a much more complicated-sounding name), replace each of the stacks of dots with a solid vertical bar of the same height as its corresponding stack of dots.
Note: The vertical dimension on a Dot Plot or histogram is sometimes called frequency or count (refer to the following section).
- Kimmo Vehkalahti, Brian S. Everitt(Authors)
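The four bucketing steps in the Dot Plot recipe above (from the For Dummies excerpt) can be sketched in a few lines of Python. This is a minimal illustration and not code from the book: the function names `bucket_counts` and `render`, the text-based layout (rotated so that each row is one bucket), and the sample measurements are all assumptions made for the example.

```python
from collections import Counter


def bucket_counts(values, n_buckets=10):
    """Count how many values fall into each of n_buckets equal-width buckets."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if every value is identical (hi == lo).
    width = (hi - lo) / n_buckets or 1.0
    counts = Counter()
    for v in values:
        # The largest value would index one past the end, so clamp it
        # into the last bucket.
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return lo, width, [counts[i] for i in range(n_buckets)]


def render(values, n_buckets=10, symbol="."):
    """Draw one row per bucket: the bucket's lower edge, then one symbol per value."""
    lo, width, counts = bucket_counts(values, n_buckets)
    rows = [f"{lo + i * width:8.2f} | {symbol * c}" for i, c in enumerate(counts)]
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical part lengths in millimeters, invented for illustration.
    lengths_mm = [9.8, 10.1, 10.0, 10.3, 9.9, 10.0, 10.2, 10.6, 9.7, 10.1, 10.0, 10.4]
    print(render(lengths_mm, n_buckets=10, symbol="."))  # dot plot
    print(render(lengths_mm, n_buckets=10, symbol="#"))  # same counts drawn as bars
```

Calling `render` with `symbol="#"` mirrors the excerpt's remark that a histogram is the same stacks drawn as solid bars; only the glyph changes, not the counts.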
- 2018(Publication Date)
- CRC Press(Publisher)
The first example compares the pie chart of 10 percentages with an alternative graphic, the Dot Plot. The plots are shown in Figures 2.3 and 2.4. The 10 percentages represented by the two graphics have a bimodal distribution; odd-numbered observations cluster around 8%, and even-numbered observations cluster around 12%. Furthermore, each even value is shifted with respect to the preceding odd value by about 4%. This pattern is far easier to spot in the Dot Plot than in the pie chart.
Figure 2.3 Pie chart for 10 percentages. (Suggested by Cleveland, 1994. Used with permission from Hobart Press.)
Figure 2.4 Dot Plot for 10 percentages.
Dot Plots for the crime data in Table 2.1 are shown in Figure 2.5, and these are also more informative than the corresponding pie charts in Figure 2.1.
Figure 2.5 Dot Plots for drinkers’ and abstainers’ crime percentages.
The second example given by Cleveland begins with the diagram shown in Figure 2.6, which originally appeared in Vetter (1980). The aim of the diagram is to display the percentages of degrees awarded to women in several disciplines of science and technology during three time periods. At first glance the labels on the diagram suggest that the graph is a standard divided bar chart with the length of the bottom division of each bar showing the percentage for doctorates, the length of the middle division showing the percentage for master’s degrees, and the top division showing the percentage for bachelor’s degrees. A little reflection shows that this interpretation is not correct since it would imply that, in most cases, the percentage of bachelor’s degrees given to women is lower than the percentage of doctorates. Closer examination of the diagram reveals that the three values of the data for each discipline during each time period are determined by the three adjacent vertical dotted lines
- eBook - ePub
Statistical Data Analysis Explained
Applied Environmental Statistics with R
- Clemens Reimann, Peter Filzmoser, Robert Garrett, Rudolf Dutter(Authors)
- 2011(Publication Date)
- Wiley(Publisher)
This plot is typically displayed as an elongated rectangle. Each value is plotted at its correct position along the x-axis and at a position selected by chance (according to a random uniform distribution) along the y-axis (Figure 3.2, lower diagram). This simple graphic can provide important insight into structure in the data.
Figure 3.2 Evolution of the one-dimensional scatterplot demonstrated using Sc as measured by instrumental neutron activation analysis (INAA) in the samples of the Kola C-horizon
In Figure 3.2 (stacked and one-dimensional scatterplot) a significant feature is apparent that would be important to consider if this variable were to be used in a more formal statistical analysis. The data were reported in 0.1 mg/kg steps up to a value of 10 mg/kg and then rounded to full 1 mg/kg steps. This causes an artificial “discretisation” of all data above 10 mg/kg.
3.2 The histogram
One of the most frequently used diagrams to depict a data distribution is the histogram. It is constructed in the form of side-by-side bars. Within a bar each data value is represented by an equal amount of area. The histogram shows at a glance whether a distribution is symmetric (i.e. the same shape on either side of a line drawn through the centre of the histogram) or skewed (stretched out on one side, right or left skewed). It is also readily apparent whether the data show just one maximum (unimodal) or several humps (multimodal distribution). The parts far away from the main body of data on either side of the histogram are usually called the tails. The length of the tails can be judged. The existence or non-existence of straggling data (points that appear detached from the main body of data) at one or both extremes of the distribution is also visible at one glance
- eBook - PDF
- Elinor Jones, Simon Harden, Michael J. Crawley(Authors)
- 2022(Publication Date)
- Wiley(Publisher)
With a small dataset, you have the luxury of being able to plot all data points without the graphic becoming cluttered and unreadable, so why not plot everything? For large datasets, use a boxplot or other type of display. A dotplot will be too busy and in extreme cases will show just a dense cloud or line of points where the underlying shape will be invisible.
Figure 5.19 Dotplots of the caterpillar data: (a) default dotplot; (b) dotplot with jitter and different plotting symbol. Horizontal axis: Caterpillar growth.
Figure 5.20 Barplot showing frequency of different hair colour types
5.2.6 Bar charts
Bar charts are a great way of displaying categorical data (or perhaps discrete numeric data with limited unique values). Bar plots come in many different flavours, but the simplest is to place the categories of the variable on the x-axis with the height of the bars representing the frequency of observations that fall into that category. These are simple enough to create, as we demonstrate with the hair and eye colour dataset from R. We’ll plot the hair colour of the participants, but first we need to create a table and then pass this table to the barplot() function. The result is in Figure 5.20.
    library(scales)  # provides hue_pal(); assumed to have been loaded earlier in the book
    hair_eye <- read.table("hair_eye.txt", header = TRUE)
    table_hair <- table(hair_eye$Hair)  # frequency of each hair colour
    table_hair
    # Black Blond Brown   Red
    #   108   127   286    71
    barplot(table_hair, col = hue_pal()(4), ylab = "Frequency")
5.2.7 Pie charts
A word of warning here: Statisticians do not like pie charts, and for good reason. Avoid these as far as possible, and let the following example act as a warning. When creating graphics to represent data, we want to make sure that the main messages are clear to our audience. The problem with pie charts is that we humans aren’t very good at distinguishing, e.g.
- No longer available
- William Mendenhall, Robert Beaver, Barbara Beaver(Authors)
- 2019(Publication Date)
- Cengage Learning EMEA(Publisher)
You need a different way to graph this type of data! The simplest graph for quantitative data is the dotplot. For a small set of measurements (for example, the set 2, 6, 9, 3, 7, 6) you can simply plot the measurements as points on a horizontal axis, as shown in Figure 1.9(a). For a large data set, however, such as the one in Figure 1.9(b), the dotplot can be hard to interpret.
Figure 1.9 Dotplots for small and large data sets: (a) Small Set; (b) Large Set
Stem and Leaf Plots
Another simple way to display the distribution of a quantitative data set is the stem and leaf plot. This plot uses the actual numerical values of each data point.
Need to Know: How to Construct a Stem and Leaf Plot
1. Divide each measurement into two parts: the stem and the leaf.
2. List the stems in a column, with a vertical line to their right.
3. For each measurement, record the leaf portion in the same row as its corresponding stem.
4. Order the leaves from lowest to highest in each stem.
5. Provide a key to your stem and leaf coding so that the reader can re-create the actual measurements if necessary.
EXAMPLE 1.7
Table 1.7 lists the prices (in dollars) of 19 different brands of walking shoes. Use a stem and leaf plot to display the data.
Table 1.7 Prices of Walking Shoes
90 70 70 70 75 70 65 68 60 74
70 95 75 70 68 65 40 65 70
Solution
To create the stem and leaf, divide each observation between the ones and the tens place. The number to the left is the stem; the number to the right is the leaf. Thus, for the shoes that cost $65, the stem is 6 and the leaf is 5. The stems, ranging from 4 to 9, are listed in Figure 1.10, along with the leaves for each of the 19 measurements. If you indicate that the leaf unit is 1, the reader will realize that the stem 6 and the leaf 8, for example, represent the number 68, recorded to the nearest dollar.
Copyright 2020 Cengage Learning.
- eBook - PDF
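The five construction steps for a stem and leaf plot in the Mendenhall excerpt above can be tried directly on the 19 walking-shoe prices from Table 1.7. The sketch below is only an illustration: the function name `stem_and_leaf` and the choice to list stems that have no leaves (here, stem 5) are assumptions, and it handles only non-negative whole numbers split at the tens place.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens and above) to its sorted list of leaves (ones digit)."""
    groups = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)  # e.g. 65 -> stem 6, leaf 5
        groups[stem].append(leaf)
    # Include every stem between the smallest and largest, even if it is empty,
    # so that gaps in the data stay visible.
    return {s: sorted(groups.get(s, [])) for s in range(min(groups), max(groups) + 1)}


# Prices (in dollars) from Table 1.7 of the excerpt.
prices = [90, 70, 70, 70, 75, 70, 65, 68, 60, 74,
          70, 95, 75, 70, 68, 65, 40, 65, 70]

if __name__ == "__main__":
    for stem, leaves in stem_and_leaf(prices).items():
        print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
    print("Key: leaf unit = 1, so 6 | 8 means 68")
```

The final `print` supplies the key asked for in step 5, so a reader can re-create the original prices from the display.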
Data Analysis for the Geosciences
Essentials of Uncertainty, Comparison, and Visualization
- Michael W. Liemohn(Author)
- 2023(Publication Date)
- American Geophysical Union(Publisher)
With the inclusion of the histograms, though, a more complete understanding can be developed for how the two data sets relate to each other.
Figure 2.11 Another illustrative example of a scatterplot between two data sets (axes labeled Data set #1 and Data set #2), this time shown differently. On the left, a second set of values are shown with symbols instead of points, while on the right, error bars have been added to the points in both the x and y directions.
Quick and Easy for Section 2.4.2: The scatter plot is a two-dimensional rendering of one data set against another, and provides an excellent first look at their relationship.
2.4.3 The Box Plot
One way to clearly signal that the distribution is not a Gaussian is to use the box plot to visualize the number set. It is often called a box-and-whisker plot because of its standard format, with a narrow rectangle indicating the location of the bulk of the distribution and then long thin lines demarking some outlier extent of the values. It is nearly always defined from quantiles, which are found by sorting the number set in ascending order and finding the value at a certain percentage of N. That is, a quantile is found as the percentage of the rank order of a value divided by N, the total number count in the set. A box plot is usually made with the 50% quantile (the median) representing the centroid and other quantiles used for the box extent and even more extreme quantiles as the whisker length. Figure 2.13 shows a typical box plot. The box is usually, nearly always, defined by something called the interquartile range, or IQR, with the 25% and 75% percentile x values defining the
Box plot: A method of displaying a number set, often a subset of a larger number set, with the use of quantiles to show the spread and full range.
- eBook - ePub
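The quantile-based description of the box plot in the Liemohn excerpt above can be made concrete with a short calculation. The sketch below is an assumption-laden illustration, not the book's method: the names `quantile` and `box_plot_summary` are invented here, linear interpolation between ranks is only one of several common quantile conventions, and the whiskers are simply set to the minimum and maximum (the 0% and 100% quantiles).

```python
def quantile(sorted_values, p):
    """Quantile for 0 <= p <= 1 on an ascending list, interpolating between ranks."""
    n = len(sorted_values)
    pos = p * (n - 1)              # fractional rank of the requested quantile
    lower = int(pos)               # rank just below (or at) that position
    upper = min(lower + 1, n - 1)  # rank just above, clamped to the last index
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def box_plot_summary(values):
    """Return the numbers needed to draw a basic box plot."""
    s = sorted(values)
    q1, median, q3 = quantile(s, 0.25), quantile(s, 0.50), quantile(s, 0.75)
    return {
        "whisker_low": s[0],    # assumption: whiskers reach the extremes
        "q1": q1,               # lower edge of the box
        "median": median,       # line inside the box
        "q3": q3,               # upper edge of the box
        "whisker_high": s[-1],
        "iqr": q3 - q1,         # interquartile range, the height of the box
    }


if __name__ == "__main__":
    # Hypothetical number set, invented for illustration.
    print(box_plot_summary([7, 1, 4, 9, 3, 5, 2, 8, 6]))
```

Other whisker conventions (for example, a fixed multiple of the IQR beyond the box) would change only the two `whisker_*` entries.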
Statistics for Psychologists
An Intermediate Course
- Brian S. Everitt(Author)
- 2001(Publication Date)
- Taylor & Francis(Publisher)
bimodal distribution); odd-numbered bands lie around the value 8% and even-numbered bands around 12%. Furthermore, the shape of the pattern for the odd values as the band number increases is the same as the shape for the even values; each even value is shifted with respect to the preceding odd value by approximately 4%.
Fig. 2.3. Pie chart for 10 percentages. (Reproduced with permission from Cleveland, 1994.)
Fig. 2.4. Dot Plot for 10 percentages. (Reproduced with permission from Cleveland, 1994.)
Dot Plots for the crime data in Table 2.1 (see Figure 2.5) are also more informative than the pie charts in Figure 2.1, but a more exciting application of the Dot Plot is provided in Carr (1998). The diagram, shown in Figure 2.6, gives a particular contrast of brain mass and body mass for 62 species of animal. Labels and dots are grouped into small units of five to reduce the chances of matching error. In addition, the grouping encourages informative interpretation of the graph. Also note the thought-provoking title.
Fig. 2.5. Dot Plots for (a) drinker and (b) abstainer crime percentages.
Fig. 2.6. Dot Plot with positional linking (taken with permission from Carr, 1998).
2.3. Histograms, Stem-and-Leaf Plots, and Box Plots
The data given in Table 2.2 show the heights and ages of both partners in a sample of 100 married couples. Assessing general features of the data is difficult with the data tabulated in this way, and identifying any potentially interesting patterns is virtually impossible. A number of graphical displays of the data can help. For example, simple histograms of the heights of husbands, the heights of wives, and the difference in husband and wife height may be a good way to begin to understand the data. The three histograms are shown in Figures 2.7 and 2.8. All the height distributions are seen to be roughly symmetrical and bell shaped, perhaps roughly normal? Husbands tend to be taller than their wives, a finding that simply reflects that men are on average taller than women, although there are a few couples in which the wife is taller; see the negative part of the x
- eBook - PDF
- Prem S. Mann(Author)
- 2016(Publication Date)
- Wiley(Publisher)
2.4 Dotplots
Outliers or Extreme Values: Values that are very small or very large relative to the majority of the values in a data set are called outliers or extreme values.
EXAMPLE 2–12 Ages of Students in a Night Class
A statistics class that meets once a week at night from 7:00 PM to 9:45 PM has 33 students. The following data give the ages (in years) of these students. Create a dotplot for these data.
34 21 49 37 23 22 33 23 21 20 19
33 23 38 32 31 22 20 24 27 33 19
23 21 31 31 22 20 34 21 33 27 21
Solution
To make a dotplot, we perform the following steps.
Step 1. The minimum and maximum values in this data set are 19 and 49 years, respectively. First, we draw a horizontal line (let us call this the numbers line) with numbers that cover the given data as shown in Figure 2.22. Note that the numbers line in Figure 2.22 shows the values from 19 to 49.
Figure 2.22 Numbers line.
Step 2. Next we place a dot above the value on the numbers line that represents each of the ages listed above. For example, the age of the first student is 34 years. So, we place a dot above 34 on the numbers line as shown in Figure 2.23. If there are two or more observations with the same value, we stack dots above each other to represent those values. For example, as shown in the data on ages, two students are 19 years old. We stack two dots (one for each student) above 19 on the numbers line, as shown in Figure 2.23. After all the dots are placed, Figure 2.23 gives the complete dotplot.
Figure 2.23 Dotplot for ages of students in a statistics class.
As we examine the dotplot of Figure 2.23, we notice that there are two clusters (groups) of data. Eighteen of the 33 students (which is almost 55%) are 19 to 24 years old, and 10 of the 33 students (which is about 30%) are 31 to 34 years old.
- eBook - ePub
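The two steps in the Mann excerpt above (draw the numbers line, then stack one dot per observation) can be reproduced as a plain-text dotplot using the 33 ages from Example 2–12. This is a rough sketch under stated assumptions: the function names `dot_plot` and `count_in_range` are invented for the example, and asterisks stand in for dots.

```python
from collections import Counter

# Ages (in years) of the 33 students, as listed in the excerpt.
ages = [34, 21, 49, 37, 23, 22, 33, 23, 21, 20, 19,
        33, 23, 38, 32, 31, 22, 20, 24, 27, 33, 19,
        23, 21, 31, 31, 22, 20, 34, 21, 33, 27, 21]


def dot_plot(values):
    """Return a text dotplot: stacked '*' above an integer numbers line."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)  # Step 1: the numbers line
    tallest = max(counts.values())
    rows = []
    # Step 2: build the stacks from the top row down to the row just above the axis.
    for level in range(tallest, 0, -1):
        rows.append("".join(" * " if counts[v] >= level else "   " for v in axis))
    rows.append("".join(f"{v:^3d}" for v in axis))  # axis labels, 3 characters each
    return "\n".join(rows)


def count_in_range(values, low, high):
    """Number of values v with low <= v <= high."""
    return sum(1 for v in values if low <= v <= high)


if __name__ == "__main__":
    print(dot_plot(ages))
    print("Aged 19 to 24:", count_in_range(ages, 19, 24), "of", len(ages))
    print("Aged 31 to 34:", count_in_range(ages, 31, 34), "of", len(ages))
```

The two range counts correspond to the clusters the excerpt reads off Figure 2.23, which gives a quick way to check a plot against the raw data.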
- Robert Andersen, David A. Armstrong II(Authors)
- 2021(Publication Date)
- SAGE Publications Ltd(Publisher)
Some of the techniques covered in this chapter – such as bar charts and Dot Plots for categorical variables and histograms and boxplots for quantitative variables – are well known even to inexperienced researchers. Still, we provide some extensions and variants of these methods that even advanced researchers should find useful. We also discuss some less well-known, but useful, techniques. The quantile comparison plot is often more effective than a histogram for revealing skewed distributions and outliers. Plotting density estimates, which effectively ‘smooths’ the histogram, can also be more insightful than a standard histogram, especially when comparing several distributions or trying to compare an observed distribution to a theoretical one. We will also discuss violin plots, which have some similar features to boxplots but contain more information about the distribution of the variable.
3.1 Displaying Distributions of Categorical Variables
If there are relatively few categories, the distribution of a categorical variable can be effectively revealed in a table that reports the counts of observations (i.e., a frequency distribution) – or even better, percentages or proportions (i.e., a relative frequency distribution) – in each category. As the number of categories increases, however, frequency tables become progressively more difficult to interpret. For variables with many categories, graphical methods such as bar charts or Dot Plots are usually more effective than tables.
We propose three criteria to govern the decision to use a bar chart (or Dot Plot) instead of a table: the number of categories; the relative size of the number of observations in each category; and the amount of detail the reader is expected to understand. If readers are expected to know the precise count or percentage of observations in each category, reporting these values in a table is the best option. If there are more than a few categories, and the differences in the observations within them are difficult to comprehend from a table, a graph should be preferred. If the number of categories is not too unwieldy, the number of observations (or percentage or proportion) in each category can be included at the top of each bar in a bar chart. On the other hand, if one group has a relatively large proportion of the data, the resolution in a bar chart could be quite low and a table may be better.
- Michael Friendly, Howard Wainer(Authors)
- 2021(Publication Date)
- Harvard University Press(Publisher)
6 The Origin and Development of the Scatterplot
As we saw in Chapter 5, most modern forms of data graphics (pie charts, line graphs, and bar charts) can generally be attributed to William Playfair in the period 1785–1805. All of these, even though presented as two-dimensional graphs, were essentially one-dimensional in their view of data. They showed a single quantitative variable (such as land area or value of trade) broken down by a categorical variable, as in a pie chart or bar chart, or plotted over time (perhaps with separate curves for imports and exports), as in a line graph. In the development of a language and taxonomy of graphs, Playfair’s graphs and other visual representations of data in this time can be considered 1.5D: more than just a single variable shown, but not quite enough to qualify for 2D status. In Playfair’s visual understanding, the horizontal axis in his plots was most often bound to time, forcing him to use other means to show relations with other variables.
The next major invention in data graphics, the first fully two-dimensional one, was the scatterplot. Indeed, among all forms of statistical graphics, the scatterplot may be considered the most versatile and generally useful invention in the entire history of statistical graphics.
Essential characteristics of a scatterplot are that two quantitative variables are measured on the same observational units (workers); the values are plotted as points referred to perpendicular axes; and the goal is to show something about the relation between these variables, typically how the ordinate variable, y, varies with the abscissa variable, x. Figure 6.1 shows a typical, if simplistic, modern scatterplot. It relates the number of years of experience of some workers on the horizontal (x) axis to their current annual salary on the vertical (y) axis. The experience and salary
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.











