Probability, Statistics, and Data
eBook - ePub

Probability, Statistics, and Data

A Fresh Approach Using R

Darrin Speegle, Bryan Clair

Partager le livre
  1. 512 pages
  2. English
  3. ePUB (adapté aux mobiles)
  4. Disponible sur iOS et Android
eBook - ePub

Probability, Statistics, and Data

A Fresh Approach Using R

Darrin Speegle, Bryan Clair

DĂ©tails du livre
Aperçu du livre
Table des matiĂšres
Citations

À propos de ce livre

This book is a fresh approach to a calculus based, first course in probability and statistics, using R throughout to give a central role to data and simulation.

The book introduces probability with Monte Carlo simulation as an essential tool. Simulation makes challenging probability questions quickly accessible and easily understandable. Mathematical approaches are included, using calculus when appropriate, but are always connected to experimental computations. Using R and simulation gives a nuanced understanding of statistical inference. The impact of departure from assumptions in statistical tests is emphasized, quantified using simulations, and demonstrated with real data. The book compares parametric and non-parametric methods through simulation, allowing for a thorough investigation of testing error and power. The text builds R skills from the outset, allowing modern methods of resampling and cross validation to be introduced along with traditional statistical techniques.

Fifty-two data sets are included in the complementary R package fosdata. Most of these data sets are from recently published papers, so that you are working with current, real data, which is often large and messy. Two central chapters use powerful tidyverse tools (dplyr, ggplot2, tidyr, stringr) to wrangle data and produce meaningful visualizations. Preliminary versions of the book have been used for five semesters at Saint Louis University, and the majority of the more than 400 exercises have been classroom tested.

Foire aux questions

Comment puis-je résilier mon abonnement ?
Il vous suffit de vous rendre dans la section compte dans paramĂštres et de cliquer sur « RĂ©silier l’abonnement ». C’est aussi simple que cela ! Une fois que vous aurez rĂ©siliĂ© votre abonnement, il restera actif pour le reste de la pĂ©riode pour laquelle vous avez payĂ©. DĂ©couvrez-en plus ici.
Puis-je / comment puis-je télécharger des livres ?
Pour le moment, tous nos livres en format ePub adaptĂ©s aux mobiles peuvent ĂȘtre tĂ©lĂ©chargĂ©s via l’application. La plupart de nos PDF sont Ă©galement disponibles en tĂ©lĂ©chargement et les autres seront tĂ©lĂ©chargeables trĂšs prochainement. DĂ©couvrez-en plus ici.
Quelle est la différence entre les formules tarifaires ?
Les deux abonnements vous donnent un accĂšs complet Ă  la bibliothĂšque et Ă  toutes les fonctionnalitĂ©s de Perlego. Les seules diffĂ©rences sont les tarifs ainsi que la pĂ©riode d’abonnement : avec l’abonnement annuel, vous Ă©conomiserez environ 30 % par rapport Ă  12 mois d’abonnement mensuel.
Qu’est-ce que Perlego ?
Nous sommes un service d’abonnement Ă  des ouvrages universitaires en ligne, oĂč vous pouvez accĂ©der Ă  toute une bibliothĂšque pour un prix infĂ©rieur Ă  celui d’un seul livre par mois. Avec plus d’un million de livres sur plus de 1 000 sujets, nous avons ce qu’il vous faut ! DĂ©couvrez-en plus ici.
Prenez-vous en charge la synthÚse vocale ?
Recherchez le symbole Écouter sur votre prochain livre pour voir si vous pouvez l’écouter. L’outil Écouter lit le texte Ă  haute voix pour vous, en surlignant le passage qui est en cours de lecture. Vous pouvez le mettre sur pause, l’accĂ©lĂ©rer ou le ralentir. DĂ©couvrez-en plus ici.
Est-ce que Probability, Statistics, and Data est un PDF/ePUB en ligne ?
Oui, vous pouvez accĂ©der Ă  Probability, Statistics, and Data par Darrin Speegle, Bryan Clair en format PDF et/ou ePUB ainsi qu’à d’autres livres populaires dans Volkswirtschaftslehre et Statistik fĂŒr Volks- & Betriebswirtschaft. Nous disposons de plus d’un million d’ouvrages Ă  dĂ©couvrir dans notre catalogue.

Informations

Année
2021
ISBN
9781000504514

1Data in R

DOI: 10.1201/9781003004899-1
The R Statistical Programming Language plays a central role in this book. While there are several other programming languages and software packages that do similar things, we chose R for several reasons:
  1. R is widely used among statisticians, especially academic statisticians. If there is a new statistical procedure developed somewhere in academia, chances are that the code for it will be made available in R. This distinguishes R from, say, Python.
  2. R is commonly used for statistical analyses in many disciplines. Other software, such as SPSS or SAS is also used and in some disciplines would be the primary choice for some discipline specific courses, but R is popular and its user base is growing.
  3. R is free. You can install it and all optional packages on your computer at no cost. This is a big difference between R and SAS, SPSS, MATLAB, and most other statistical software.
  4. R has been experiencing a renaissance. With the advent of the tidyverse and RStudio, R is a vibrant and growing community. We also have found the community to be extremely welcoming. The R ecosystem is one of its strengths.
In this chapter, we will begin to see some of the capabilities of R. We point out that R is a fully functional programming language, as well as being a statistical software package. We will only touch on the nuances of R as a programming language in this book.

1.1 Arithmetic and variable assignment

We begin by showing how R can be used as a calculator. Here is a table of commonly used arithmetic operators.
TABLE 1.1: Basic arithmetic operators in R.
Operator
Description
Example
+
addition
1 + 1
-
subtraction
4 - 3
*
multiplication
3 * 7
/
division
8 / 3
^
exponentiation
2^3
The output of the examples in Table 1.1 is given below. Throughout the book, lines that start with ## indicate output from R commands. These will not show up when you type in the commands yourself. The [1] in the lines below indicate that there is one piece of output from the command. These will show up when you type in the commands.
A couple of useful constants in R are pi and exp(1), which are π≈3.141593 and e≈2.718282. Here R a couple of examples of how you can use them.
R is a functional programming language. If you don't know what that means, that's OK, but as you might guess from the name, functions play a large role in R. We will see many, many functions throughout the book. Every time you see a new function, think about the following four questions:
  1. What type of input does the function accept?
  2. What does the function do?
  3. What does the function return as output?
  4. What are some typical examples of how to use the function?
In this section, we focus on functions that do things that you are likely already familiar with from your previous math courses.
We start with exp. The function exp takes one argument named x and returns ex. So, for example, exp(x = 1) will compute e1=e, as we saw above. In R, it is optional as to whether you supply the named version x = 1 or just 1 as the argument. So, it is equivalent to write exp(x = 1) or exp(1). Typically, for functions that are “well-known,” the first argument or two will be given without names, then the rest will be provided with their names. Our advice is that if in doubt, include the name.
Next, we discuss the log function. The function log takes two arguments x and base and returns log⁥bx, where b is the base. The x argument is required. The base argument is optional with a default value of e. In other words, the default logarithm is the natural logarithm. Here are some examples of using exp and log.
You can't get very far without storing results of your computations to variables! The way1 to do so is with the arrow < -. Typing Alt + - is the keyboard shortcut for < -.
The # in inches part of the code above is a comment. These are provided to give the reader information about what is going on in the R code, but are not executed and have no impact on the output.
If you want to see what value is stored in a variable, you can
  1. type the variable name
  2. look in the environment box in the upper right-hand corner of RStudio.
  3. Use the str command. This command gives other useful information about the variable, in addition to its value.
This says that height contains num-eric data, and its current value is 192 (which is ...

Table des matiĂšres