Probability, Statistics, and Data
A Fresh Approach Using R
Darrin Speegle, Bryan Clair
- 512 pages
- English
About This Book
This book is a fresh approach to a calculus-based first course in probability and statistics, using R throughout to give a central role to data and simulation.
The book introduces probability with Monte Carlo simulation as an essential tool. Simulation makes challenging probability questions quickly accessible and easily understandable. Mathematical approaches are included, using calculus when appropriate, but are always connected to experimental computations. Using R and simulation gives a nuanced understanding of statistical inference. The impact of departure from assumptions in statistical tests is emphasized, quantified using simulations, and demonstrated with real data. The book compares parametric and non-parametric methods through simulation, allowing for a thorough investigation of testing error and power. The text builds R skills from the outset, allowing modern methods of resampling and cross validation to be introduced along with traditional statistical techniques.
Fifty-two data sets are included in the complementary R package fosdata. Most of these data sets are from recently published papers, so that you are working with current, real data, which is often large and messy. Two central chapters use powerful tidyverse tools (dplyr, ggplot2, tidyr, stringr) to wrangle data and produce meaningful visualizations. Preliminary versions of the book have been used for five semesters at Saint Louis University, and the majority of the more than 400 exercises have been classroom tested.
1 Data in R
- R is widely used among statisticians, especially academic statisticians. If there is a new statistical procedure developed somewhere in academia, chances are that the code for it will be made available in R. This distinguishes R from, say, Python.
- R is commonly used for statistical analyses in many disciplines. Other software, such as SPSS or SAS, is also used, and in some disciplines it would be the primary choice for discipline-specific courses, but R is popular and its user base is growing.
- R is free. You can install it and all optional packages on your computer at no cost. This is a big difference between R and SAS, SPSS, MATLAB, and most other statistical software.
- R has been experiencing a renaissance. With the advent of the tidyverse and RStudio, R has a vibrant and growing community, which we have also found to be extremely welcoming. The R ecosystem is one of its strengths.
1.1 Arithmetic and variable assignment
| Operator | Description | Example |
|---|---|---|
| + | addition | 1 + 1 |
| - | subtraction | 4 - 3 |
| * | multiplication | 3 * 7 |
| / | division | 8 / 3 |
| ^ | exponentiation | 2^3 |
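As a minimal sketch, the operators in the table can be tried directly at the R console. The variable name `height` is hypothetical and chosen only for illustration. The `<-` operator is R's standard assignment operator.

```r
# Each operator from the table, evaluated at the console.
1 + 1    # addition: 2
4 - 3    # subtraction: 1
3 * 7    # multiplication: 21
8 / 3    # division: 2.666667 (ordinary division, not integer division)
2^3      # exponentiation: 8

# Store a result in a variable with the assignment operator <-.
# `height` is a hypothetical name used only for this example.
height <- 3 * 7
height / 2   # variables can be used anywhere a number can: 10.5
```

Assigning with `<-` prints nothing. The value appears only when the variable is evaluated on its own or used in a further expression.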
- What type of input does the function accept?
- What does the function do?
- What does the function return as output?
- What are some typical examples of how to use the function?
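One way to answer the questions above is with R's built-in help system. The sketch below uses the base function `round` purely as an illustration. The choice of function is an assumption, not taken from the text.

```r
# Display the documentation for round(); at the console, ?round is shorthand
# for the same call. The help page describes the arguments (input), the
# behavior, the returned value, and ends with an Examples section.
help("round")

# A quick look at the arguments the function accepts.
args(round)

# A typical use: round a number to two decimal places.
rounded <- round(3.14159, digits = 2)
rounded   # 3.14
```

Running `example("round")` executes the Examples section of the help page, which is a convenient way to see typical usage.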
- Type the variable name at the console.
- Look in the Environment pane in the upper right-hand corner of RStudio.
- Use the str command. This command gives other useful information about the variable, in addition to its value.
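A short sketch of the console-based options, using a hypothetical variable `scores` created only for this example:

```r
# `scores` is a hypothetical variable, defined here for illustration.
scores <- c(90, 85, 72)

# Typing the variable name prints its value.
scores        # [1] 90 85 72

# str() reports the type and length along with the values.
str(scores)   #  num [1:3] 90 85 72
```

Here `str` shows that `scores` is a numeric vector with three entries, information that typing the name alone does not state explicitly.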