
Machine Learning with Spark and Python
Essential Techniques for Predictive Analytics
About this book
Machine Learning with Spark and Python: Essential Techniques for Predictive Analytics, Second Edition simplifies machine learning for practical use by focusing on two key algorithm families. The second edition adds Spark, an ML framework from the Apache Software Foundation. With Spark, machine learning students can process much larger data sets and call the Spark algorithms from ordinary Python code. The book focuses on two algorithm families (linear methods and ensemble methods) that effectively predict outcomes. This type of problem covers many use cases, such as choosing which ad to place on a web page, predicting prices in securities markets, or detecting credit card fraud. Focusing on two families leaves enough room for full descriptions of the mechanisms at work in the algorithms. The code examples then illustrate the workings of the machinery with specific, hackable code.
CHAPTER 1
The Two Essential Algorithms for Making Predictions
Why Are These Two Algorithms So Useful?
- “An Empirical Comparison of Supervised Learning Algorithms,” by Rich Caruana and Alexandru Niculescu-Mizil
- “An Empirical Evaluation of Supervised Learning in High Dimensions,” by Rich Caruana, Nikos Karampatziakis, and Ainur Yessenalina
| DATA SET NAME | NUMBER OF ATTRIBUTES | % OF EXAMPLES THAT ARE POSITIVE |
| --- | --- | --- |
| Adult | 14 | 25 |
| Bact | 11 | 69 |
| Cod | 15 | 50 |
| Calhous | 9 | 52 |
| Cov_Type | 54 | 36 |
| HS | 200 | 24 |
| Letter.p1 | 16 | 3 |
| Letter.p2 | 16 | 53 |
| Medis | 63 | 11 |
| Mg | 124 | 17 |
| Slac | 59 | 50 |
Table of contents
- Cover
- Table of Contents
- Introduction
- CHAPTER 1: The Two Essential Algorithms for Making Predictions
- CHAPTER 2: Understand the Problem by Understanding the Data
- CHAPTER 3: Predictive Model Building: Balancing Performance, Complexity, and Big Data
- CHAPTER 4: Penalized Linear Regression
- CHAPTER 5: Building Predictive Models Using Penalized Linear Methods
- CHAPTER 6: Ensemble Methods
- CHAPTER 7: Building Ensemble Models with Python
- Index
- End User License Agreement