Hands-On Predictive Analytics with Python
Master the complete predictive analytics process, from problem definition to model deployment
Alvaro Fuentes
- 330 pages
- English
About This Book
A step-by-step guide to building high-performing predictive applications
Key Features
- Use the Python data analytics ecosystem to implement end-to-end predictive analytics projects
- Explore advanced predictive modeling algorithms with an emphasis on theory with intuitive explanations
- Learn to deploy a predictive model's results as an interactive application
Book Description
Predictive analytics is an applied field that employs a variety of quantitative methods using data to make predictions. It involves much more than just throwing data onto a computer to build a model. This book provides practical coverage to help you understand the most important concepts of predictive analytics. Using practical, step-by-step examples, we build predictive analytics solutions while using cutting-edge Python tools and packages.
The book's step-by-step approach starts by defining the problem and moves on to identifying relevant data. From there, we prepare the data, explore and visualize relationships, and then build, tune, evaluate, and deploy models.
Each stage has relevant practical examples and efficient Python code. You will work with models such as KNN, Random Forests, and neural networks using the most important libraries in Python's data science stack: NumPy, pandas, Matplotlib, Seaborn, Keras, Dash, and so on. In addition to hands-on code examples, you will find intuitive explanations of the inner workings of the main techniques and algorithms used in predictive analytics.
By the end of this book, you will be all set to build high-performance predictive analytics solutions using Python programming.
What you will learn
- Get to grips with the main concepts and principles of predictive analytics
- Learn about the stages involved in producing complete predictive analytics solutions
- Understand how to define a problem, propose a solution, and prepare a dataset
- Use visualizations to explore relationships and gain insights into the dataset
- Learn to build regression and classification models using scikit-learn
- Use Keras to build powerful neural network models that produce accurate predictions
- Learn to serve a model's predictions as a web application
Who this book is for
This book is for data analysts, data scientists, data engineers, and Python developers who want to learn about predictive modeling and would like to implement predictive analytics solutions using Python's data stack. People from other backgrounds who would like to enter this exciting field will greatly benefit from reading this book. All you need is to be proficient in Python programming and have a basic understanding of statistics and college-level algebra.
Predicting Categories with Machine Learning
- Learn about classification tasks and why classification models are so important
- Review the credit card default dataset
- Learn about the logistic regression model
- Understand the classification trees model
- Learn the random forest model
- Provide a simple example of multiclass classification
- Learn the basics of Naive Bayes classifiers
Technical requirements
- Python 3.6 or higher
- Jupyter Notebook
- Recent versions of the following Python libraries: NumPy, pandas, matplotlib, Seaborn, and scikit-learn
Classification tasks
- Direct marketing: Predict whether a customer will give a positive or a negative response to a campaign
- Medicine: Predict whether a patient is healthy or is sick; or, for example, which kind of cancer the patient has
- Insurance: Classify clients by risk level; for instance, low, average, or high risk
- Telecommunication and other industries: Churn models are classification models that predict which customers will switch to another provider
- Education: Predict which students will drop out from a program
- Email services: Classify emails into destinations such as inbox, spam, social, and promotions
- Binary classification: The target has only two categories, which is the case for our credit card default problem.
- Multiclass classification: When the target has more than two classes.
- Multilabel classification: The problem of assigning more than one category or label to an observation. A popular example is predicting the subject of a news article based on its contents. Many news articles rarely fall into just one category; one article could be simultaneously about the broad topics of World News, Politics, and Finance.
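The three kinds of task differ mainly in the shape of the target. The following is a minimal sketch, not code from the book: the label values and article topics are made up for illustration, and scikit-learn's `MultiLabelBinarizer` is used as one convenient way to encode a multilabel target as an indicator matrix.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Binary classification: one label per observation, two possible values
# (hypothetical encoding: 1 = default, 0 = no default).
y_binary = np.array([0, 1, 0, 0, 1])

# Multiclass classification: one label per observation, more than two classes
# (hypothetical risk levels).
y_multiclass = np.array(["low", "average", "high", "low", "average"])

# Multilabel classification: each observation may carry several labels at once
# (hypothetical news articles and their topics).
article_topics = [
    ["World News", "Politics", "Finance"],
    ["Sports"],
    ["Politics"],
]
mlb = MultiLabelBinarizer()
y_multilabel = mlb.fit_transform(article_topics)  # one column per label

print("Binary classes:    ", np.unique(y_binary))
print("Multiclass classes:", np.unique(y_multiclass))
print("Multilabel columns:", list(mlb.classes_))
print(y_multilabel)
```

In the multilabel case, each row of `y_multilabel` can contain several 1s, whereas a binary or multiclass target always has exactly one label per observation.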
Predicting categories and probabilities
- Predicted classes: For every observation, the model will directly give the prediction of the class.
- Probabilities for each class: For every observation and every class, the model will output the probability of that observation belonging to that class. Say, for example, we have three classes, A, B, and C; then the output of the model would be a triple of numbers such as [0.2, 0.7, 0.1], meaning the probabilities of the observation belonging to A, B, and C, respectively. Note that, since we are dealing with probabilities, the values should add up to 1.
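The two kinds of output can be compared side by side in scikit-learn, where classifiers typically expose `predict` for classes and `predict_proba` for per-class probabilities. The sketch below is only an illustration: it assumes the three-class iris dataset bundled with scikit-learn as a stand-in (not the credit card default data) and a logistic regression with a raised `max_iter`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Stand-in data with three classes (assumption: iris, not the book's dataset).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

sample = X[:5]

# Predicted classes: one label per observation.
predicted_classes = model.predict(sample)

# Probabilities for each class: one row per observation, one column per class.
class_probabilities = model.predict_proba(sample)

# Each row of probabilities should add up to 1.
row_sums = class_probabilities.sum(axis=1)

# The most probable class in each row, for comparison with predict().
most_probable = model.classes_[np.argmax(class_probabilities, axis=1)]

print(predicted_classes)
print(class_probabilities.round(3))
print(row_sums)
```

Working with probabilities instead of predicted classes leaves room to pick a decision threshold other than "most probable class", which can matter when the costs of different errors are unequal.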
Credit card default dataset
- SEX: Gender (1 = male; 2 = female).
- EDUCATION: Education (1 = graduate school; 2 = university; 3 = high school; 4 = others).
- MARRIAGE: Marital status (1 = married; 2 = single; 3 = others).
- AGE: Age (year).
- LIMIT_BAL: Amount of the given credit (New Taiwan dollar); it includes both the individual consumer credit and his/her family (supplementary) credit.
- PAY_1 - PAY_6: History of past payment. We tracked the past monthly payment records (from April 2005 to September 2005) as follows: PAY_1 = the repayment status in September 2005; PAY_2 = the repayment status in August 2005; . . .; PAY_6 = the repayment status in April 2005. The measurement scale for the repayment status is: -1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; . . .; 8 = payment delay for eight months; 9 = payment delay for nine months and ...