Univariate, Bivariate, and Multivariate Statistics Using R
eBook - ePub

Quantitative Tools for Data Analysis and Data Science

About this book

A practical source for performing essential statistical analyses and data management tasks in R

Univariate, Bivariate, and Multivariate Statistics Using R offers a practical and very user-friendly introduction to the use of R software that covers a range of statistical methods featured in data analysis and data science. The author, a noted expert in quantitative teaching, has written a quick go-to reference for performing essential statistical analyses and data management tasks in R. Requiring only minimal prior knowledge, the book introduces the concepts needed for an immediate yet clear understanding of the statistical ideas essential to interpreting software output.

The author explores univariate, bivariate, and multivariate statistical methods, as well as select nonparametric tests. Altogether, the book is a hands-on manual on the applied statistics and essential R computing capabilities needed to write theses, dissertations, and research publications. It is comprehensive in its coverage of univariate through to multivariate procedures, while serving as a friendly and gentle introduction to R software for the newcomer. This important resource:

  • Offers an introductory, concise guide to the computational tools that are useful for making sense out of data using R statistical software
  • Provides a resource for students and professionals in the social, behavioral, and natural sciences
  • Puts the emphasis on the computational tools used in the discovery of empirical patterns
  • Features a variety of popular statistical analyses and data management tasks that can be immediately and quickly applied as needed to research projects
  • Shows how to apply statistical analysis using R to data sets in order to get started quickly performing essential tasks in data analysis and data science

Written for students, professionals, and researchers primarily in the social, behavioral, and natural sciences, Univariate, Bivariate, and Multivariate Statistics Using R offers an easy-to-use guide for performing data analysis fast, with an emphasis on drawing conclusions from empirical observations. The book can also serve as a primary or secondary textbook for courses in data analysis or data science, or others in which quantitative methods are featured.

Univariate, Bivariate, and Multivariate Statistics Using R is written by Daniel J. Denis and is categorized under Mathematics, Probability & Statistics.

Information

Publisher: Wiley
Year: 2020
Print ISBN: 9781119549932
eBook ISBN: 9781119549918

1 Introduction to Applied Statistics

LEARNING OBJECTIVES

  • Understand the logic of statistical inference, the purpose of statistical modeling, and where statistical inference fits in the era of “Big Data.”
  • Understand how statistical modeling is used in scientific pursuits.
  • Understand the nature of the p‐value, the differences between p‐values and effect sizes, and why these differences are vital to understand when interpreting scientific evidence.
  • Distinguish between type I and type II errors.
  • Distinguish between point estimates and confidence intervals.
  • Understand the nature of continuous versus discrete variables.
  • Understand the ideas behind statistical power and how they relate to p‐values.
The purpose of this chapter is to give a concise introduction to the world of applied statistics as they are generally used in scientific research. We start from the beginning, and build up what you need to know to understand the rest of the book. Further readings and recommendations are provided where warranted. It is hoped that this chapter will be of value not only to the novice armed with an introductory statistics course at some point in his or her past, but also to the more experienced reader who may have “gaps” in his or her knowledge and may find this chapter useful in unifying some principles that may have previously not been completely grasped. Thus, we launch into the book by introducing and revisiting some “big picture” items in applied statistics and discussing how they relate to scientific research. Understanding these elements is crucial in being able to appreciate how statistics are applied and used in science more generally.

1.1 The Nature of Statistics and Inference

The goal of scientific research can be said to be learning something about populations, whether those populations consist of people, animals, weather patterns, stars in the sky, etc. Populations can range in size from very small to extremely large, and can even be infinite. For example, consider the population of Americans, which is very large, yet not as large as the population of stars in the sky, which can be said to be practically, even if not definitively, infinite. The population of flips of a coin can be regarded as an infinite population. Now, we often collect subsets of these larger populations to study, which we call samples, but you should always remember that the ultimate goal of scientific investigations is usually to learn something about populations, not samples.
So if the goal is to learn about populations, why do researchers bother with collecting much smaller samples? The answer is simple. Since populations can be extremely large, it can often be very expensive or even impossible to collect data on the entire population. However, even if we could, it may not be necessary to study the entire population in the first place. Thanks to the contributions of theoretical statistics, primarily developed in the early twentieth century by the likes of R.A. Fisher and company, inferential statistics allows one to study a sample, compute statistics on that sample, then make a quality inference or “educated guess” about the population. The majority of scientific research conducted on samples seeks to make a generalization of the sort:
If these facts are true about my sample, what can I say about them being true about the population from which my sample was drawn?
In a nutshell, this is what inferential statistics is all about – trying to make an educated guess at population parameters based on characteristics studied in a sample. If all researchers had access to population data, statistical inference would not exist, and though many of the topics in this book would still have their place, others would definitely not have been invented.
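The idea of guessing at a population parameter from a sample can be sketched with a small simulation. The following is only an illustration under stated assumptions: it is written in Python rather than R, the "population" is 100,000 simulated blood pressure readings with an arbitrarily chosen mean and spread, and the function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 100,000 simulated systolic blood pressure readings.
pop_rng = random.Random(42)
population = [pop_rng.gauss(130, 15) for _ in range(100_000)]
population_mean = statistics.fmean(population)

# Each sample of 50 gives one "educated guess" at the population mean.
sample_means = simulate_sample_means(population, sample_size=50, n_samples=1000)

print(f"population mean:          {population_mean:.2f}")
print(f"one sample mean (n = 50): {sample_means[0]:.2f}")
print(f"average of sample means:  {statistics.fmean(sample_means):.2f}")
```

Any single sample mean will miss the population mean by some amount, but the sample means as a group cluster around it, which is what makes an inference from one sample reasonable.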
Don’t forget!
The primary goal of most scientific research is to learn something about populations, not samples. We study samples mostly because our populations are too large or impractical to study. However, the ultimate goal is usually to learn something about populations. If most of our populations were small enough to study completely, most of the fields of inferential statistics would likely not exist.
There are essentially two kinds of statistics: descriptive and inferential. Descriptive statistics are used to, not surprisingly, describe something about a sample or a population. The nature of the description is numerical, in that a formula (i.e. some numerical measure) is being applied to some sample or population to describe a characteristic of that sample or population. The goal of inferential statistics is to obtain an estimate of a parameter using a statistic, and then to assess the goodness of that estimate. For example, we compute a sample mean, then use that estimator to estimate a population parameter. The sample mean is a descriptive statistic because it describes the sample, and also an inferential statistic because we are using it to estimate the population mean.
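The dual role of the sample mean can be made concrete with a short sketch. The data below are hypothetical, the helper `mean_confidence_interval` is an illustrative name, and the interval uses the normal critical value 1.96 as a simplifying assumption (for a sample this small, a t-based critical value would give a somewhat wider interval).

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% interval."""
    mean = statistics.fmean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, mean - z * se, mean + z * se


# Hypothetical sample of systolic blood pressure readings (mmHg).
sample = [128, 135, 142, 119, 131, 138, 125, 147, 133, 129]

mean, lower, upper = mean_confidence_interval(sample)

# Descriptive use: the mean summarizes the sample itself.
print(f"sample mean: {mean:.1f}")
# Inferential use: the same mean, with a margin of error, estimates the population mean.
print(f"approximate 95% interval for the population mean: ({lower:.1f}, {upper:.1f})")
```

The width of the interval is one way to assess the goodness of the estimate: a narrower interval indicates a more precise guess at the population mean.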

1.2 A Motivating Example

Applied statistics is best understood through easy‐to‐understand research examples. As you work through the book, try to apply the concepts to research examples of your own liking and interest, as it is a very powerful way to master the concepts. So if your area of interest is studying the relationship between anxiety and depression, for instance, keep asking yourself as you work through the book how the techniques presented might help you in solving problems in that area of investigation.
Suppose you have a theory that a medication is useful in reducing high blood pressure. The ideal, yet not very pragmatic, course of action would be to give the medication to all American adults suffering from high blood pressure and to assess the proportion of those taking the drug who experience an adequate reduction in blood pressure relative to a control condition. That way, you would tap into the entire population of American adults who suffer from high blood pressure and get an accurate assessment of whether your medication is effective. Of course, you can't do that. Not only would you require the informed consent of all of these Americans to participate in your study (many would refuse), but it would also be literally impossible to recruit all Americans suffering from high blood pressure. Such a project would be terribly impractical and expensive, and would basically take forever to complete. What is more, thanks to inferential statistics, you do not need to do this. You can get a pretty good estimate of the effectiveness of the medication by intelligently selecting a sample of adults with high blood pressure, treating them with the medication, and then comparing their results to those of a control group, where the control group in this case is the group not receiving treatment. This would give you a good idea of the effectiveness of the drug on the sample. Then, you can generalize that result to the population.
For instance, if you find a 20% decrease in high blood pressure in your sample relative to a control group, the relevant question you are then interested in asking is:
What is the probability of finding a result like this in my sample if the true reduction in symptomology in the population is actually equal to 0?
To understand what this question means, suppose instead that you had found a 0.0000001% decrease in high blood pressure in your sample when compared to a control group. Given this sample result, would you be willing to wager that it corresponds to an effect in the population from which these data were drawn? Probably not. That is, the probability of obtaining such a small sample difference between the experimental group and the control group if the true ...
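The question posed above, how probable a sample result is if the true population effect is zero, can be sketched with a permutation test. This is a swapped-in illustration, not necessarily the approach the book goes on to use: the blood pressure reductions are made-up numbers, the function name `permutation_p_value` is hypothetical, and the code is Python rather than R.

```python
import random
import statistics


def permutation_p_value(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(treatment) - statistics.fmean(control))
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        # If the true effect is zero, group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[:n_treat]) - statistics.fmean(pooled[n_treat:])
        )
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical reductions in systolic blood pressure (mmHg) per participant.
control = [2, -1, 3, 0, 1, 4, -2, 2]
large_effect = [18, 22, 25, 19, 21, 24, 20, 23]
negligible_effect = [x + 0.01 for x in control]

p_large = permutation_p_value(large_effect, control)
p_negligible = permutation_p_value(negligible_effect, control)

print(f"p-value, large sample difference:      {p_large:.4f}")
print(f"p-value, negligible sample difference: {p_negligible:.4f}")
```

A large sample difference is rarely matched when the group labels are shuffled, so its p-value is small. A negligible difference is matched or exceeded by most shuffles, so its p-value is large and gives little reason to wager on an effect in the population.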

Table of contents

  1. Cover
  2. Table of Contents
  3. Preface
  4. 1 Introduction to Applied Statistics
  5. 2 Introduction to R and Computational Statistics
  6. 3 Exploring Data with R: Essential Graphics and Visualization
  7. 4 Means, Correlations, Counts: Drawing Inferences Using Easy‐to‐Implement Statistical Tests
  8. 5 Power Analysis and Sample Size Estimation Using R
  9. 6 Analysis of Variance: Fixed Effects, Random Effects, Mixed Models, and Repeated Measures
  10. 7 Simple and Multiple Linear Regression
  11. 8 Logistic Regression and the Generalized Linear Model
  12. 9 Multivariate Analysis of Variance (MANOVA) and Discriminant Analysis
  13. 10 Principal Component Analysis
  14. 11 Exploratory Factor Analysis
  15. 12 Cluster Analysis
  16. 13 Nonparametric Tests
  17. References
  18. Index
  19. End User License Agreement