An Introduction to Data Analysis
eBook - ePub

Quantitative, Qualitative and Mixed Methods

  1. 296 pages
  2. English
  3. ePUB (mobile friendly)
About this book

Covering the general process of data analysis to finding, collecting, organizing, and presenting data, this book offers a complete introduction to the fundamentals of data analysis.

Using real-world case studies as illustrations, it helps readers understand theories behind and develop techniques for conducting quantitative, qualitative, and mixed methods data analysis. With an easy-to-follow organization and clear, jargon-free language, it helps readers not only become proficient data analysts, but also develop the critical thinking skills necessary to assess analyses presented by others in both academic research and the popular media.

It includes advice on:

- Data analysis frameworks

- Validity and credibility of data

- Sampling techniques  

- Data management

- The big data phenomenon  

- Data visualisation

- Effective data communication

Whether you are new to data analysis or looking for a quick-reference guide to key principles of the process, this book will help you uncover nuances, complexities, patterns, and relationships among all types of data.

An Introduction to Data Analysis is by Tiffany Bergin and is catalogued under Social Sciences and Social Science Research & Methodology.

1 Introducing Data

Contents

  • 1.1 Chapter Overview
  • 1.2 Data Surrounds Us
  • 1.3 The Power of Data: Fog, Pollution, and Catastrophe in London
  • 1.4 The Lingering Influence of Data: The Work of Alexis de Tocqueville
  • 1.5 The Questions that Drive Data Analysis: The Work of Adolphe Quetelet
  • 1.6 Defining ‘Data’
  • 1.7 From ‘Data’ to ‘Big Data’
  • 1.8 Concluding Thoughts
  • 1.9 Summary
  • 1.10 Further Reading
  • 1.11 Discussion Questions

1.1 Chapter Overview

By the end of this chapter, you will be able to:
  • Describe the impact and lingering influence of data through historical examples
  • Gain a first-hand perspective on what data analysis involves by making your own predictions and comparing these predictions with real-life data
  • Define the term ‘data’
  • Describe the ‘big data’ phenomenon and its potential advantages and risks.

1.2 Data Surrounds Us

Data surrounds us. From stock prices on the morning news to records of calories burned on fitness machines, we all encounter colossal amounts of data – quantitative or qualitative information about ourselves, society, or the universe – every day. Analysing all of this data would be impossible; analysing some of it, however, can tell us a great deal about ourselves, our society, and the world we live in.
Data allows us to make discoveries that intuition or common sense cannot uncover. Data drives policy changes by governments. Data shapes the behaviour of companies and organizations. Data even changes the way people think. Given the power of data, the ability to analyse data is a special skill that is increasingly valued in society. However, because this skill is so potent, it can – like any type of power – be used for good or evil. While many data analysts strive to do good by accurately and clearly presenting their findings, others deliberately misrepresent data for selfish purposes. The goal of this book is to develop your data analysis skills to help you do good things in the world, and recognize when other data analysts are deliberately misrepresenting their findings.

1.3 The Power of Data: Fog, Pollution, and Catastrophe in London

Not yet convinced of the power of data? Let’s look at one example of how data, painstakingly collected and clearly presented to the public, saved countless lives and permanently altered life in one of the world’s great cities.
Imagine you are living in London in December 1952. Pollution emitted by the city’s smokestacks and coal-burning fireplaces mixes with the winter fog, resulting in a thick, choking, and nearly immobilizing smog (MacNee & Donaldson, 2008, p. 121). The smog lasts for five days. Visibility becomes so poor that buses stop running and, even indoors, many theatrical events are cancelled because audiences cannot see the stage (BBC News, 2008; Wallace & Hobbs, 2006, p. 179). Even scarier is the fog’s impact on human health. Between 5 December and 9 December 1952, overall hospital admissions increase by half, with many of these admissions due to respiratory conditions (MacNee & Donaldson, 2008, p. 121).
Although the December 1952 fog was particularly severe, similar air pollution events were alarmingly common in nineteenth- and early twentieth-century London. Several past smog incidents had killed hundreds of people in the late 1800s (Brimblecombe, 1987, p. 124). Yet these past tragedies did not prompt policy changes. Although some observers expressed concern about potential health consequences – in the late nineteenth century, for example, one meteorologist argued that elevated levels of bronchitis in London were partly due to smoky fogs (MacNee & Donaldson, 2008, p. 121; Russell, 1889) – data was not yet generally available to analyse this issue. However, by 1952, this had changed. The ability to accurately collect, analyse, and communicate findings from data about the fog’s consequences changed everything.
For the 1952 fog, the first set of influential data was released by the health minister several weeks after the tragedy, revealing that 2800 extra deaths had occurred in London during the fog as compared with the same week in 1951 (Thorsheim, 2006, p. 165).¹ This dramatic statistic received significant media attention, both in Britain and around the world (‘Week of London fog…’, 1952). It encouraged people to view the 1952 fog as a preventable tragedy that could be stopped through policy action on pollution (Thorsheim, 2004, p. 166). Public concern about the 1952 fog paved the way for adoption of the Clean Air Act of 1956, which regulated industrial emissions and mandated that most homes and businesses stop using coal fires – a massive societal change that had not occurred in response to previous fogs (Thorsheim, 2006, pp. 173–174).
¹ Other reports have suggested that some 4000 people died due to fog-related respiratory difficulties (Wallace & Hobbs, 2006, p. 179), while a more comprehensive recent analysis found that as many as 12,000 additional deaths between December 1952 and February 1953 were related to the fog (Bell & Davis, 2001).
This example illustrates the power of data to sharply alter people’s perceptions – and even to change policy. Such policy change only occurred after the public was given clear proof of the number of people who had been killed in the fog. In other words, accurately collected and clearly presented data was necessary to prompt this transformation.

1.4 The Lingering Influence of Data: The Work of Alexis de Tocqueville

The data that helped prompt action on air pollution was quantitative data – meaning that it consisted of numbers (in this case, a single alarming statistic about the number of people who had died). When we hear the word ‘data’, we often think about statistics like these. But data does not have to consist of numbers; it can, instead, consist of words, actions, behaviours, images, objects, and numerous other features of individual or social life. Data that is not numeric is known as qualitative data and, as with quantitative data, the analysis of qualitative data has similarly shaped policy and society in innumerable ways. Such influence can in fact resonate centuries into the future.
Let’s look at one such example of the lingering influence of data: the landmark qualitative research of Alexis de Tocqueville. In 1831, de Tocqueville and his colleague Gustave de Beaumont undertook a thorough nine-month journey throughout the Eastern, Midwestern, and Southern regions of the United States. Although tasked by the French government with studying the US prison system, de Tocqueville expanded the scope of his analysis to include all elements of social and political life in the young country. The lengthy, groundbreaking work that de Tocqueville produced, Democracy in America (1835), offers a multilayered analysis of democracy (and the appalling contradiction of slavery within a democracy), class, gender, cultural attitudes, and political values (Kurweil, 1999).
Although de Tocqueville’s work is often considered a work of political philosophy, it’s actually an ‘example of productive qualitative inquiry’, since, to produce the volume, de Tocqueville meticulously collected and analysed qualitative data about the attitudes of American citizens and the workings of their democratic government (Lingenfelter, 2016, Chapter 5). Specifically, de Tocqueville’s work is an early example of ‘participant observation’ (Whitley, 2008, p. 98; see also Handler, 2005, p. 22; Kurweil, 1999, p. 153), a qualitative research method in which a researcher conducts extensive research in the field, and participates in activities or daily life routines within a particular area, organization, society, or group of interest. Participant observation allows a researcher to obtain first-hand insight into what it feels like to be a member of a specific group and engage in particular activities. De Tocqueville achieved such insight by travelling throughout the country’s regions, conversing with numerous Americans, and participating broadly in community life. As an ‘outsider’, de Tocqueville was able to bring a uniquely observant perspective to the society he studied, and he is now recognized by some contemporary scholars as ‘the first modern social scientist’ (Kurweil, 1999, p. 153).
Perhaps the clearest illustration of the value of de Tocqueville’s work is its lingering influence today. Democracy in America is still regularly referenced by cultural commentators and is widely assigned to social science students in US universities. A 2015 article in the Washington Post, for example, described de Tocqueville’s work as ‘[t]he book every new American citizen – and every old one, too – should read’ (Lozada, 2015); and a 2017 BBC News article entitled ‘Can democracy survive Facebook?’ referenced de Tocqueville’s work (Rajan, 2017). The fact that de Tocqueville’s findings are still considered relevant to a twenty-first-century discussion about social media and democracy illustrates that well-executed qualitative research – like quantitative research – can exert a long-lasting impact. If you’re interested in learning more about participant observation and other forms of qualitative research, you can look forward to Chapter 6, where we’ll discuss qualitative data analysis in detail.

1.5 The Questions that Drive Data Analysis: The Work of Adolphe Quetelet

Data analysis is driven by questions. Learning to ask interesting and thoughtful questions is the first step in conducting interesting and thoughtful social research. One of this book’s goals is to inspire you to ask such questions about the world, and in each chapter we’ll explore real-life examples of the kinds of fascinating questions data analysts have asked. In fact, let’s start with one example now: the work of Lambert Adolphe Jacques (or simply Adolphe) Quetelet (1796–1874). Quetelet’s pioneering quantitative research illustrates how well-thought-out questions can prompt innovative data analysis, and lead to surprising and fundamental insights about the social world. Quetelet, a brilliant early data analyst from Belgium, produced groundbreaking findings in a wide range of disciplines, including medicine and criminology. We’ll focus on Quetelet’s work in two controversial areas: human height and weight, and weather and crime.

1.5.1 Height and weight

How do height and weight change over the course of individuals’ lives? Do gender, class, and geographic region affect such changes? These questions, which reverberate in today’s discussions about rising obesity levels, also fascinated Quetelet in the 1840s. Yet in contrast to more speculative commentators, Quetelet analysed quantitative data to answer these questions rigorously. Quetelet’s (1842) data was derived from a broad range of sources – including government registers in Belgium, measurements taken from infants at the Foundling Hospital in Brussels, measurements taken from children working in factories in Manchester and Stockport (in England), and measurements taken from undergraduates at the University of Cambridge.
Quetelet (1842) noted that wealthier individuals tended to be taller than average, and that the growth of poorer individuals was often stunted by poverty and deprivation.² He also observed that, for the average person, body weight measured in kilograms tended to be proportional to the square of their height measured in metres – an observation that remains deeply influential (Eknoyan, 2008, p. 48) as it helped form the basis of the Body Mass Index (BMI; Keys, Karvonen, Kimura, & Taylor, 1972). Today, the BMI is one of the most widely used measures for assessing whether an individual is underweight, at a healthy weight, overweight, or obese, and since obesity is now a significant global health concern, the BMI currently receives substantial attention from health professionals and the popular media. For example, the US Centers for Disease Control and Prevention (CDC, 2012) note that: ‘For adults, overweight and obesity ranges are determined by using weight and height to calculate a number called the “body mass index”’. The BMI is also used to trace global trends in obesity, revealing that, from 1980 to 2013, the percentage of men around the world who were overweight or obese rose from 29% to 37%, while the percentage of women rose from 30% to 38% (Ng et al., 2014).
² Although Quetelet’s (1842) overall work was based on the range of data sources already described, he used a more limited dataset to derive these specific findings. This dataset only included data from Brussels and the Belgian province of Brabant; therefore, Quetelet cautioned that his findings were only applicable to these areas, and were not guaranteed to hold true in other regions. Such explicit acknowledgement of the limitations of one’s findings is essential for any data analyst.
The BMI has many limitations. Since it does not distinguish between fat and muscle, it may not be helpful for all individuals (such as athletes with substantial muscle mass), but it can be useful at the population level, as the example of global obesity trends illustrates (see the discussion in Stephenson, 2013). Although more sophisticated methods for measuring obesity now exist, the simplicity and ease with which the BMI can be calculated likely reinforce its appeal. Additionally, as Wells (2014) has described, it is not clear whether some alternative measurements, such as waist size, are actually better indicators of the risk of developing chronic diseases than the BMI.
The BMI’s persistence and continuing value illustrate the potential impact of thoughtful and considered data analysis. Quetelet’s observation about human height and weight that inspired the BMI was not based on speculation or intuition; as we have seen, Quetelet’s (1842) observation was based on his considered analysis of data of the heights and weights of many different individuals. A more speculative observation would likely not have been as accurate or achieved the precision necessary to survive into the twenty-first century, as the BMI has done. Additionally, the BMI’s persistence also highlights the importance of data presentation. Quetelet’s (1842) original observation is just a simple formula, accessible to anyone, and this simplicity has likely helped the BMI to persist despite the availability of more sophisticated techniques for measuring obesity today.
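To make the simplicity of the formula concrete, here is a minimal Python sketch of the calculation described above: weight in kilograms divided by the square of height in metres. The function names, the example person, and the category cut-offs (18.5, 25, and 30, which are commonly cited for adults) are assumptions added for illustration; none of them appear in Quetelet’s work or in this chapter.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return the body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to one of the four categories named in the text.

    The cut-offs used here (18.5, 25, 30) are commonly cited adult
    thresholds and are an assumption; the chapter does not state them.
    """
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "healthy weight"
    if value < 30:
        return "overweight"
    return "obese"


# Hypothetical person, invented for illustration: 70 kg and 1.75 m tall.
example = bmi(70, 1.75)
print(round(example, 1), bmi_category(example))  # prints: 22.9 healthy weight
```

Because the calculation needs only two measurements and one division, it can be applied to very large population datasets with little effort, which is consistent with the point made above about the BMI’s usefulness at the population level. As the discussion of its limitations notes, the result says nothing about the balance of fat and muscle in any one individual.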

1.5.2 Crime and weather

In addition to height and weight, Quetelet explored the contentious question of whether the weather can affect crime rates. This question has continued to interest scholars, and Quetelet...

Table of contents

  1. Cover
  2. Half Title
  3. Publisher Note
  4. Title Page
  5. Copyright Page
  6. Contents
  7. Illustration List
  8. Preface
  9. Acknowledgements
  10. About the Author
  11. 1 Introducing Data
  12. 2 Thinking like a Data Analyst
  13. 3 Finding, Collecting, and Organizing Data
  14. 4 Introducing Quantitative Data Analysis
  15. 5 Applying Quantitative Data Analysis: Correlations, t-Tests, and Chi-Square Tests
  16. 6 Introducing Qualitative Data Analysis
  17. 7 Applying Qualitative Data Analysis
  18. 8 Introducing Mixed Methods: How to Synthesize Quantitative and Qualitative Data Analysis Techniques
  19. 9 Communicating Findings and Visualizing Data
  20. 10 Conclusion: Becoming a Data Analyst
  21. Glossary
  22. References
  23. Index