Probability and Statistics for Data Science
Math + R + Data

Norman Matloff

  1. 412 pages
  2. English
  3. ePUB

About the book

Probability and Statistics for Data Science: Math + R + Data covers "math stat"—distributions, expected value, estimation etc.—but takes the phrase "Data Science" in the title quite seriously:

* Real datasets are used extensively.

* All data analysis is supported by R coding.

* Includes many Data Science applications, such as PCA, mixture distributions, random graph models, Hidden Markov models, linear and logistic regression, and neural networks.

* Leads the student to think critically about the "how" and "why" of statistics, and to "see the big picture."

* Not "theorem/proof"-oriented, but concepts and models are stated in a mathematically precise manner.

Prerequisites are calculus, some matrix algebra, and some experience in programming.

Norman Matloff is a professor of computer science at the University of California, Davis, and was formerly a statistics professor there. He is on the editorial boards of the Journal of Statistical Software and The R Journal. His book Statistical Regression and Classification: From Linear Models to Machine Learning was the recipient of the Ziegel Award for the best book reviewed in Technometrics in 2017. He is a recipient of his university's Distinguished Teaching Award.

Information

Year
2019
ISBN
9780429687112

Part I

Fundamentals of Probability

Chapter 1

Basic Probability Models

This chapter will introduce the general notions of probability. Most of it will seem intuitive to you, and intuition is indeed crucial in the field of probability and statistics. On the other hand, do not rely on intuition alone; pay careful attention to the general principles which are developed. In more complex settings intuition may not be enough, or may even mislead you. The tools discussed here will be essential, and will be cited frequently throughout the book.
In this book, we will be discussing both “classical” probability examples involving coins, cards and dice, and also examples involving applications in the real world. The latter will involve diverse fields such as data mining, machine learning, computer networks, bioinformatics, document classification, medical fields and so on. Applied problems actually require a bit more work to fully absorb, but needless to say, you will derive the most benefit from those examples rather than ones involving coins, cards and dice.
Let’s start with one concerning transportation.

1.1 Example: Bus Ridership

Consider the following analysis of bus ridership, which (in more complex form) could be used by the bus company/agency to plan the number of buses, frequency of stops and so on. Again, in order to keep things easy, it will be quite oversimplified, but the principles will be clear.
Here is the model:
• At each stop, each passenger alights from the bus, independently of the actions of others, with probability 0.2.
• Either 0, 1 or 2 new passengers get on the bus, with probabilities 0.5, 0.4 and 0.1, respectively. Passengers at successive stops act independently.
• Assume the bus is so large that it never becomes full, so the new passengers can always board.
• Suppose the bus is empty when it arrives at its first stop.
Here and throughout the book, it will be greatly helpful to first name the quantities or events involved. Let $L_i$ denote the number of passengers on the bus as it leaves its $i$th stop, $i = 1, 2, 3, \ldots$ Let $B_i$ denote the number of new passengers who board the bus at the $i$th stop.
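To make the model concrete, here is a minimal simulation sketch of a single bus trip. The book itself works in R; this Python version is only an illustration. The names `simulate_trip`, `ALIGHT_PROB`, `BOARD_COUNTS` and `BOARD_PROBS` are hypothetical, and the code assumes that at each stop passengers alight before new ones board, so that newly boarded passengers do not get off at the same stop.

```python
import random

ALIGHT_PROB = 0.2               # each passenger alights with this probability
BOARD_COUNTS = [0, 1, 2]        # possible numbers of new passengers at a stop
BOARD_PROBS = [0.5, 0.4, 0.1]   # probabilities of those counts

def simulate_trip(n_stops, rng):
    """Simulate one trip; return (B, L), one entry per stop.

    B[i] is the number of passengers boarding at stop i+1 and
    L[i] is the number on the bus as it leaves stop i+1.
    Assumption: alighting happens before boarding at each stop.
    """
    boarded, leaving = [], []
    on_bus = 0  # the bus arrives empty at its first stop
    for _ in range(n_stops):
        # each current passenger independently decides whether to alight
        alighting = sum(rng.random() < ALIGHT_PROB for _ in range(on_bus))
        on_bus -= alighting
        # draw the number of new passengers for this stop
        b = rng.choices(BOARD_COUNTS, weights=BOARD_PROBS)[0]
        on_bus += b
        boarded.append(b)
        leaving.append(on_bus)
    return boarded, leaving

rng = random.Random(2019)  # arbitrary seed, for reproducibility only
B, L = simulate_trip(5, rng)
print("B:", B)
print("L:", L)
```

Because the bus starts empty, the first entry of `L` always equals the first entry of `B` under this ordering assumption.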
We will be interested in various probabilities, such as the probability that no passengers board the bus at the first three stops, i.e.,
$P(B_1 = B_2 = B_3 = 0)$
The reader may correctly guess that the answer is $0.5^3 = 0.125$. But again, we need to do this properly. In order to make such calculations, we must first set up ...
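
As an informal check on the guessed value, the probability that no one boards at the first three stops can be computed directly and compared with a simulation estimate. This is a sketch in Python rather than the book's R, and the function name `estimate_no_boarders`, the seed and the number of simulated trips are arbitrary choices made for illustration.

```python
import random

BOARD_COUNTS = [0, 1, 2]        # possible numbers of new passengers at a stop
BOARD_PROBS = [0.5, 0.4, 0.1]   # probabilities of those counts

# Direct calculation: boarding counts at successive stops are independent,
# so the three probabilities of "no boarders" multiply.
exact = BOARD_PROBS[0] ** 3

def estimate_no_boarders(n_trips, rng):
    """Estimate the probability of no boarders at the first three stops by simulation."""
    hits = 0
    for _ in range(n_trips):
        # boarding counts at the first three stops of one simulated trip
        boards = rng.choices(BOARD_COUNTS, weights=BOARD_PROBS, k=3)
        if all(b == 0 for b in boards):
            hits += 1
    return hits / n_trips

rng = random.Random(1)  # arbitrary seed
estimate = estimate_no_boarders(100_000, rng)
print("exact:", exact)
print("simulated:", estimate)
```

The simulated proportion should land close to the exact value, with the gap shrinking as the number of simulated trips grows.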
