Probability and Statistics for Data Science
eBook - ePub

Math + R + Data

Norman Matloff

  1. 412 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android
About the book

Probability and Statistics for Data Science: Math + R + Data covers "math stat"—distributions, expected value, estimation etc.—but takes the phrase "Data Science" in the title quite seriously:

* Real datasets are used extensively.

* All data analysis is supported by R coding.

* Includes many Data Science applications, such as PCA, mixture distributions, random graph models, Hidden Markov models, linear and logistic regression, and neural networks.

* Leads the student to think critically about the "how" and "why" of statistics, and to "see the big picture."

* Not "theorem/proof"-oriented, but concepts and models are stated in a mathematically precise manner.

Prerequisites are calculus, some matrix algebra, and some experience in programming.

Norman Matloff is a professor of computer science at the University of California, Davis, and was formerly a statistics professor there. He is on the editorial boards of the Journal of Statistical Software and The R Journal. His book Statistical Regression and Classification: From Linear Models to Machine Learning was the recipient of the Ziegel Award for the best book reviewed in Technometrics in 2017. He is a recipient of his university's Distinguished Teaching Award.

Information

Year
2019
ISBN
9780429687112

Part I

Fundamentals of Probability

Chapter 1

Basic Probability Models

This chapter will introduce the general notions of probability. Most of it will seem intuitive to you, and intuition is indeed crucial in the field of probability and statistics. On the other hand, do not rely on intuition alone; pay careful attention to the general principles which are developed. In more complex settings intuition may not be enough, or may even mislead you. The tools discussed here will be essential, and will be cited frequently throughout the book.
In this book, we will be discussing both “classical” probability examples involving coins, cards and dice, and also examples involving applications in the real world. The latter will involve diverse fields such as data mining, machine learning, computer networks, bioinformatics, document classification, medical fields and so on. Applied problems actually require a bit more work to fully absorb, but needless to say, you will derive the most benefit from those examples rather than ones involving coins, cards and dice.
Let’s start with one concerning transportation.

1.1 Example: Bus Ridership

Consider the following analysis of bus ridership, which (in more complex form) could be used by the bus company/agency to plan the number of buses, frequency of stops and so on. Again, in order to keep things easy, it will be quite oversimplified, but the principles will be clear.
Here is the model:
• At each stop, each passenger alights from the bus with probability 0.2, independently of the actions of others.
• Either 0, 1 or 2 new passengers get on the bus, with probabilities 0.5, 0.4 and 0.1, respectively. Passengers at successive stops act independently.
• Assume the bus is so large that it never becomes full, so the new passengers can always board.
• Suppose the bus is empty when it arrives at its first stop.
Here and throughout the book, it will be greatly helpful to first name the quantities or events involved. Let $L_i$ denote the number of passengers on the bus as it leaves its $i$th stop, $i = 1, 2, 3,$… Let $B_i$ denote the number of new passengers who board the bus at the $i$th stop.
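The model can be made concrete with a short simulation. The following is a minimal sketch in Python, not the book's own code: the function name `simulate_bus`, the constants and the seed are illustrative choices. It also assumes that alighting happens before boarding at each stop, so a passenger who has just boarded cannot alight at that same stop; the model as stated does not spell out this ordering.

```python
import random

ALIGHT_PROB = 0.2              # each passenger alights with this probability
BOARD_COUNTS = [0, 1, 2]       # possible numbers of new passengers at a stop
BOARD_PROBS = [0.5, 0.4, 0.1]  # probabilities of those counts


def simulate_bus(n_stops, rng):
    """Simulate one trip; return (boarded, leaving), one entry per stop.

    boarded[i-1] plays the role of B_i, leaving[i-1] the role of L_i.
    Assumption: passengers alight before new ones board.
    """
    boarded, leaving = [], []
    on_bus = 0  # the bus is empty when it arrives at its first stop
    for _ in range(n_stops):
        # each current passenger independently alights with probability 0.2
        alighting = sum(rng.random() < ALIGHT_PROB for _ in range(on_bus))
        # 0, 1 or 2 new passengers board; the bus never fills up
        new = rng.choices(BOARD_COUNTS, weights=BOARD_PROBS)[0]
        on_bus = on_bus - alighting + new
        boarded.append(new)
        leaving.append(on_bus)
    return boarded, leaving


rng = random.Random(2019)  # arbitrary seed, for reproducibility only
B, L = simulate_bus(5, rng)
print("boarded at each stop:", B)
print("on bus leaving each stop:", L)
```

Running many such trips and averaging gives approximate values for probabilities that are awkward to compute by hand.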
We will be interested in various probabilities, such as the probability that no passengers board the bus at the first three stops, i.e.,
$P(B_1 = B_2 = B_3 = 0)$
The reader may correctly guess that the answer is $0.5^3 = 0.125$. But again, we need to do this properly. In order to make such calculations, we must first set up ...
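As a quick numerical check of the guessed value, here is a hedged sketch in Python. It uses only the boarding probabilities from the model and the stated independence across stops. The helper name `estimate_no_boarding`, the number of simulated trips and the seed are illustrative assumptions, not part of the text.

```python
import random

BOARD_COUNTS = [0, 1, 2]
BOARD_PROBS = [0.5, 0.4, 0.1]  # P(B_i = 0), P(B_i = 1), P(B_i = 2)

# Exact value: multiply P(B_i = 0) across three independent stops.
exact = BOARD_PROBS[0] ** 3
print(exact)  # 0.125


def estimate_no_boarding(n_trips, rng):
    """Estimate P(B_1 = B_2 = B_3 = 0) as a long-run proportion of trips."""
    hits = 0
    for _ in range(n_trips):
        # independent boarding counts at the first three stops
        b = rng.choices(BOARD_COUNTS, weights=BOARD_PROBS, k=3)
        if b == [0, 0, 0]:
            hits += 1
    return hits / n_trips


rng = random.Random(1)  # arbitrary seed, for reproducibility only
approx = estimate_no_boarding(100_000, rng)
print(approx)  # should be close to the exact value
```

The simulated proportion only approximates the exact product, and it gets closer as the number of trips grows.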
