Probability, Statistics, and Data
eBook - ePub

Probability, Statistics, and Data

A Fresh Approach Using R

Darrin Speegle, Bryan Clair

Buch teilen
  1. 512 Seiten
  2. English
  3. ePUB (handyfreundlich)
  4. Über iOS und Android verfügbar
eBook - ePub

Probability, Statistics, and Data

A Fresh Approach Using R

Darrin Speegle, Bryan Clair

Angaben zum Buch
Buchvorschau
Inhaltsverzeichnis
Quellenangaben

Über dieses Buch

This book is a fresh approach to a calculus based, first course in probability and statistics, using R throughout to give a central role to data and simulation.

The book introduces probability with Monte Carlo simulation as an essential tool. Simulation makes challenging probability questions quickly accessible and easily understandable. Mathematical approaches are included, using calculus when appropriate, but are always connected to experimental computations. Using R and simulation gives a nuanced understanding of statistical inference. The impact of departure from assumptions in statistical tests is emphasized, quantified using simulations, and demonstrated with real data. The book compares parametric and non-parametric methods through simulation, allowing for a thorough investigation of testing error and power. The text builds R skills from the outset, allowing modern methods of resampling and cross validation to be introduced along with traditional statistical techniques.

Fifty-two data sets are included in the complementary R package fosdata. Most of these data sets are from recently published papers, so that you are working with current, real data, which is often large and messy. Two central chapters use powerful tidyverse tools (dplyr, ggplot2, tidyr, stringr) to wrangle data and produce meaningful visualizations. Preliminary versions of the book have been used for five semesters at Saint Louis University, and the majority of the more than 400 exercises have been classroom tested.

Häufig gestellte Fragen

Wie kann ich mein Abo kündigen?
Gehe einfach zum Kontobereich in den Einstellungen und klicke auf „Abo kündigen“ – ganz einfach. Nachdem du gekündigt hast, bleibt deine Mitgliedschaft für den verbleibenden Abozeitraum, den du bereits bezahlt hast, aktiv. Mehr Informationen hier.
(Wie) Kann ich Bücher herunterladen?
Derzeit stehen all unsere auf Mobilgeräte reagierenden ePub-Bücher zum Download über die App zur Verfügung. Die meisten unserer PDFs stehen ebenfalls zum Download bereit; wir arbeiten daran, auch die übrigen PDFs zum Download anzubieten, bei denen dies aktuell noch nicht möglich ist. Weitere Informationen hier.
Welcher Unterschied besteht bei den Preisen zwischen den Aboplänen?
Mit beiden Aboplänen erhältst du vollen Zugang zur Bibliothek und allen Funktionen von Perlego. Die einzigen Unterschiede bestehen im Preis und dem Abozeitraum: Mit dem Jahresabo sparst du auf 12 Monate gerechnet im Vergleich zum Monatsabo rund 30 %.
Was ist Perlego?
Wir sind ein Online-Abodienst für Lehrbücher, bei dem du für weniger als den Preis eines einzelnen Buches pro Monat Zugang zu einer ganzen Online-Bibliothek erhältst. Mit über 1 Million Büchern zu über 1.000 verschiedenen Themen haben wir bestimmt alles, was du brauchst! Weitere Informationen hier.
Unterstützt Perlego Text-zu-Sprache?
Achte auf das Symbol zum Vorlesen in deinem nächsten Buch, um zu sehen, ob du es dir auch anhören kannst. Bei diesem Tool wird dir Text laut vorgelesen, wobei der Text beim Vorlesen auch grafisch hervorgehoben wird. Du kannst das Vorlesen jederzeit anhalten, beschleunigen und verlangsamen. Weitere Informationen hier.
Ist Probability, Statistics, and Data als Online-PDF/ePub verfügbar?
Ja, du hast Zugang zu Probability, Statistics, and Data von Darrin Speegle, Bryan Clair im PDF- und/oder ePub-Format sowie zu anderen beliebten Büchern aus Volkswirtschaftslehre & Statistik für Volks- & Betriebswirtschaft. Aus unserem Katalog stehen dir über 1 Million Bücher zur Verfügung.

1Data in R

DOI: 10.1201/9781003004899-1
The R Statistical Programming Language plays a central role in this book. While there are several other programming languages and software packages that do similar things, we chose R for several reasons:
  1. R is widely used among statisticians, especially academic statisticians. If there is a new statistical procedure developed somewhere in academia, chances are that the code for it will be made available in R. This distinguishes R from, say, Python.
  2. R is commonly used for statistical analyses in many disciplines. Other software, such as SPSS or SAS is also used and in some disciplines would be the primary choice for some discipline specific courses, but R is popular and its user base is growing.
  3. R is free. You can install it and all optional packages on your computer at no cost. This is a big difference between R and SAS, SPSS, MATLAB, and most other statistical software.
  4. R has been experiencing a renaissance. With the advent of the tidyverse and RStudio, R is a vibrant and growing community. We also have found the community to be extremely welcoming. The R ecosystem is one of its strengths.
In this chapter, we will begin to see some of the capabilities of R. We point out that R is a fully functional programming language, as well as being a statistical software package. We will only touch on the nuances of R as a programming language in this book.

1.1 Arithmetic and variable assignment

We begin by showing how R can be used as a calculator. Here is a table of commonly used arithmetic operators.
TABLE 1.1: Basic arithmetic operators in R.
Operator
Description
Example
+
addition
1 + 1
-
subtraction
4 - 3
*
multiplication
3 * 7
/
division
8 / 3
^
exponentiation
2^3
The output of the examples in Table 1.1 is given below. Throughout the book, lines that start with ## indicate output from R commands. These will not show up when you type in the commands yourself. The [1] in the lines below indicate that there is one piece of output from the command. These will show up when you type in the commands.
A couple of useful constants in R are pi and exp(1), which are π3.141593 and e2.718282. Here R a couple of examples of how you can use them.
R is a functional programming language. If you don't know what that means, that's OK, but as you might guess from the name, functions play a large role in R. We will see many, many functions throughout the book. Every time you see a new function, think about the following four questions:
  1. What type of input does the function accept?
  2. What does the function do?
  3. What does the function return as output?
  4. What are some typical examples of how to use the function?
In this section, we focus on functions that do things that you are likely already familiar with from your previous math courses.
We start with exp. The function exp takes one argument named x and returns ex. So, for example, exp(x = 1) will compute e1=e, as we saw above. In R, it is optional as to whether you supply the named version x = 1 or just 1 as the argument. So, it is equivalent to write exp(x = 1) or exp(1). Typically, for functions that are “well-known,” the first argument or two will be given without names, then the rest will be provided with their names. Our advice is that if in doubt, include the name.
Next, we discuss the log function. The function log takes two arguments x and base and returns logbx, where b is the base. The x argument is required. The base argument is optional with a default value of e. In other words, the default logarithm is the natural logarithm. Here are some examples of using exp and log.
You can't get very far without storing results of your computations to variables! The way1 to do so is with the arrow < -. Typing Alt + - is the keyboard shortcut for < -.
The # in inches part of the code above is a comment. These are provided to give the reader information about what is going on in the R code, but are not executed and have no impact on the output.
If you want to see what value is stored in a variable, you can
  1. type the variable name
  2. look in the environment box in the upper right-hand corner of RStudio.
  3. Use the str command. This command gives other useful information about the variable, in addition to its value.
This says that height contains num-eric data, and its current value is 192 (which is ...

Inhaltsverzeichnis