Mathematics

Comparing Data

Comparing data involves analyzing and evaluating different sets of information to identify similarities, differences, patterns, and trends. This process often includes using mathematical techniques such as calculating measures of central tendency, dispersion, and correlation to make meaningful comparisons between data sets. By comparing data, mathematicians can draw conclusions and make informed decisions based on the information at hand.
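
As a rough illustration of the measures named above, the following Python sketch summarizes two small data sets and computes their correlation. The values in `set_a` and `set_b` and the helper names `summarize` and `pearson` are invented for this example; they are not taken from any of the excerpted books.

```python
import statistics


def summarize(values):
    """Return the mean, median, and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / (spread_x * spread_y) ** 0.5


# Hypothetical data sets, chosen only to illustrate the comparison.
set_a = [2, 4, 6, 8, 10]
set_b = [1, 3, 5, 9, 12]

print("A:", summarize(set_a))
print("B:", summarize(set_b))
print("correlation:", pearson(set_a, set_b))
```

In this made-up case the two sets share the same mean but differ in median and spread, which is why a comparison usually looks at more than one measure.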

Written by Perlego with AI-assistance

3 Key excerpts on "Comparing Data"

  • Elementary Statistics for Effective Library and Information Service Management
    • Leo Egghe, Ronald Rousseau (Authors)
    • 2003 (Publication Date)
    • Routledge (Publisher)
    Part 2 Descriptive Statistics
    In the previous part we discussed “data” and how to collect them. In this part and the next we will refer to these as “raw data”. Indeed, usually one has a large set of numbers and it is important to interpret them: obviously we want to see tendencies, overall properties and so on.
    This can be done in several ways, but basically there are two possible goals, once the data are collected.
    • We want to present the data in a smooth, streamlined way so that conclusions are easy to draw. Our only goal is then to understand and tell something about the data itself. This is the topic of part 2.
    • We consider the data as a sample in a much larger universe. Using this sample we want to draw conclusions about the measured property of the entire universe. This is the topic of part 3 and is not covered here.
    So, in this part of the book we want to present collected data in such a way that they can be located more readily and can be used to facilitate comparisons between different sets. Basically, there are three methods of doing this: using tables, drawing graphs and calculating derived measures. Presenting data in tabular form is relatively straightforward, so we will not cover it in this book but refer the interested reader to Section 1.1.1 of Egghe and Rousseau (1990).
    2.1 Graphical Aspects of Data
    2.1.1 Mathematical Functions
    In general a graph is a two-dimensional figure with two axes at right angles to one another. An axis is a straight line with a direction, an origin and a unit of measurement (cf. Section 1.7). The origins of the two axes coincide. In this way, points in the plane are uniquely determined by a pair of numbers (x, y), x being the score on the horizontal axis (called the abscissa) and y being the score on the vertical axis (called the ordinate). See Fig. 2.1.
    Fig. 2.1: Representation of pairs of numbers as points in the plane.
    In mathematics, one wants to represent many points at once. This can be done by using functions, denoted by y=f(x) (y is a function of x), meaning that f indicates the value of y that corresponds to a variable x
  • SQL for Data Analytics
    Harness the power of SQL to extract insights from data, 3rd Edition
    eBook - ePub
    • Jun Shan, Matt Goldwasser, Upom Malik, Benjamin Johnston (Authors)
    • 2022 (Publication Date)
    • Packt Publishing (Publisher)
    Raw data is a group of values that you can extract from a source. It becomes useful when it is processed to find different patterns in the data that was extracted. These patterns, also referred to as information, help you to interpret the data, make predictions, and identify unexpected changes in the future. This information is then processed into knowledge.
    Knowledge is a large, organized collection of persistent and extensive information and experience that can be used to describe and predict phenomena in the real world. Data analysis is the process by which you convert data into information and, thereafter, knowledge. Data analytics is when data analysis is combined with making predictions.
    There are several data analysis techniques available to make sense of data. One of them is statistics, which uses mathematical techniques on datasets.
    Statistics is the science of collecting and analyzing a large amount of data to identify the characteristics of the data and its subsets. For example, you may want to study the medical history of a country to identify the most common causes of illness-related fatality. You can also dive deeper into some subgroups, such as people from different geographic areas, to identify whether there are specific patterns for people from each area.
    Statistics is performed on datasets. Different data inside datasets have different characteristics and require different methods of processing. Some types of data, such as names and labels, may be qualitative, which means they provide descriptive information. Others, such as counts and amounts, are quantitative, which means you can perform numerical operations, such as addition or multiplication, on these values. For example, the following dataset is a collection of some biomedical information collected across a set of patients:
    Figure 1.1: Healthcare data
    In this case, the unit of observation for the dataset is an individual patient because each row represents an individual observation, which is a unique patient. There are 10 data points, each with 5 variables. Three of the columns, Year of Birth, Height, and Number of Doctor Visits in the Year 2018, are quantitative because they are represented by numbers. Two of the columns, Eye Color and Country of Birth
  • Working With Numbers and Statistics
    A Handbook for Journalists
    eBook - ePub
    • Charles Livingston, Paul S. Voakes (Authors)
    • 2005 (Publication Date)
    • Routledge (Publisher)

    Chapter 3 Describing Data

    Data often arrive in raw form, as long lists of numbers. In this case your job is to summarize the data in a way that captures its essence and conveys its meaning. This can be done numerically, with measures such as the average and standard deviation, or graphically. At other times you find data already in summarized form; in this case you must understand what the summary is telling, and what it is not telling, and then interpret the information for your readers or viewers.
    This chapter focuses on aspects of describing data: finding the center of the data, such as an average; describing how spread out the data are, its distribution; finding relationships between sets of data; and presenting data visually.

    3.1. Averages

    Central tendency is the formal expression for the notion of where data is centered, best understood by most readers as “average.” There is no one way of measuring where data are centered, and different measures provide different insights. Here we discuss three such measures: the mean, median and mode.

    Basic Measures of the Center

    Average, Mean
    The words average and mean are synonymous.
    To compute the mean of a list of numbers, first sum the list and then divide by the number of entries.
    A survey of five gas stations shows that regular unleaded gasoline is selling at the prices of $1.39, $1.43, $1.43, $1.45, $1.70. The mean is found by taking the sum of these prices, $7.40, and dividing by 5, to get $1.48.
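
The gas-station calculation above can be reproduced with a few lines of Python. This is only a sketch: the function name `mean` is chosen here for illustration, and the prices are the five values quoted in the excerpt.

```python
def mean(values):
    """Sum the list, then divide by the number of entries."""
    if not values:
        raise ValueError("cannot take the mean of an empty list")
    return sum(values) / len(values)


# The five regular unleaded prices quoted in the excerpt, in dollars.
prices = [1.39, 1.43, 1.43, 1.45, 1.70]

print(f"sum = ${sum(prices):.2f}, mean = ${mean(prices):.2f}")
# prints: sum = $7.40, mean = $1.48
```

Note that the single high price of $1.70 pulls the mean above four of the five observed prices.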
    Median
    The median of a list of numbers is the middle value.
    To compute the median of a list of numbers, put the list in ascending order and find the entry in the middle. If there is an even number of entries, average the middle two values.
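
The median rule just described can be sketched in Python as follows. The function name `median` is illustrative, the first list reuses the gas prices from the mean example in the excerpt, and the second list is a made-up even-length case.

```python
def median(values):
    """Sort the list and return the middle entry.

    For an even number of entries, average the two middle values.
    """
    if not values:
        raise ValueError("cannot take the median of an empty list")
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Odd number of entries: the gas prices from the excerpt.
print(median([1.39, 1.43, 1.43, 1.45, 1.70]))  # prints 1.43

# Even number of entries (hypothetical values): average of 3 and 5.
print(median([7, 1, 5, 3]))  # prints 4.0
```

For the gas prices the median ($1.43) sits below the mean ($1.48), because the median is not affected by how far the highest price lies from the rest.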