Machine Learning with R
eBook - ePub

Machine Learning with R

Expert techniques for predictive modeling, 3rd Edition

  1. 458 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Machine Learning with R

Expert techniques for predictive modeling, 3rd Edition

About this book

Solve real-world data problems with R and machine learningKey Featuresβ€’ Third edition of the bestselling, widely acclaimed R machine learning book, updated and improved for R 3.6 and beyondβ€’ Harness the power of R to build flexible, effective, and transparent machine learning modelsβ€’ Learn quickly with a clear, hands-on guide by experienced machine learning teacher and practitioner, Brett LantzBook DescriptionMachine learning, at its core, is concerned with transforming data into actionable knowledge. R offers a powerful set of machine learning methods to quickly and easily gain insight from your data. Machine Learning with R, Third Edition provides a hands-on, readable guide to applying machine learning to real-world problems. Whether you are an experienced R user or new to the language, Brett Lantz teaches you everything you need to uncover key insights, make new predictions, and visualize your findings. This new 3rd edition updates the classic R data science book to R 3.6 with newer and better libraries, advice on ethical and bias issues in machine learning, and an introduction to deep learning. Find powerful new insights in your data; discover machine learning with R.What you will learnβ€’ Discover the origins of machine learning and how exactly a computer learns by exampleβ€’ Prepare your data for machine learning work with the R programming languageβ€’ Classify important outcomes using nearest neighbor and Bayesian methodsβ€’ Predict future events using decision trees, rules, and support vector machinesβ€’ Forecast numeric data and estimate financial values using regression methodsβ€’ Model complex processes with artificial neural networks β€” the basis of deep learningβ€’ Avoid bias in machine learning modelsβ€’ Evaluate your models and improve their performanceβ€’ Connect R to SQL databases and emerging big data technologies such as Spark, H2O, and TensorFlowWho this book is forData scientists, students, and other practitioners who want a clear, accessible guide to machine learning with R.

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription.
No, books cannot be downloaded as external files, such as PDFs, for use outside of Perlego. However, you can download books within the Perlego app for offline reading on mobile or tablet. Learn more here.
Perlego offers two plans: Essential and Complete
  • Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
  • Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.
Both plans are available with monthly, semester, or annual billing cycles.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Yes! You can use the Perlego app on both iOS or Android devices to read anytime, anywhere β€” even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app.
Yes, you can access Machine Learning with R by Brett Lantz in PDF and/or ePUB format, as well as other popular books in Computer Science & Artificial Intelligence (AI) & Semantics. We have over one million books available in our catalogue for you to explore.

Machine Learning with R - Third Edition


Table of Contents

Machine Learning with R - Third Edition
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewer
Preface
Who this book is for
What this book covers
What you need for this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
1. Introducing Machine Learning
The origins of machine learning
Uses and abuses of machine learning
Machine learning successes
The limits of machine learning
Machine learning ethics
How machines learn
Data storage
Abstraction
Generalization
Evaluation
Machine learning in practice
Types of input data
Types of machine learning algorithms
Matching input data to algorithms
Machine learning with R
Installing R packages
Loading and unloading R packages
Installing RStudio
Summary
2. Managing and Understanding Data
R data structures
Vectors
Factors
Lists
Data frames
Matrices and arrays
Managing data with R
Saving, loading, and removing R data structures
Importing and saving data from CSV files
Exploring and understanding data
Exploring the structure of data
Exploring numeric variables
Measuring the central tendency – mean and median
Measuring spread – quartiles and the five-number summary
Visualizing numeric variables – boxplots
Visualizing numeric variables – histograms
Understanding numeric data – uniform and normal distributions
Measuring spread – variance and standard deviation
Exploring categorical variables
Measuring the central tendency – the mode
Exploring relationships between variables
Visualizing relationships – scatterplots
Examining relationships – two-way cross-tabulations
Summary
3. Lazy Learning – Classification Using Nearest Neighbors
Understanding nearest neighbor classification
The k-NN algorithm
Measuring similarity with distance
Choosing an appropriate k
Preparing data for use with k-NN
Why is the k-NN algorithm lazy?
Example – diagnosing breast cancer with the k-NN algorithm
Step 1 – collecting data
Step 2 – exploring and preparing the data
Transformation – normalizing numeric data
Data preparation – creating training and test datasets
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Transformation – z-score standardization
Testing alternative values of k
Summary
4. Probabilistic Learning – Classification Using Naive Bayes
Understanding Naive Bayes
Basic concepts of Bayesian methods
Understanding probability
Understanding joint probability
Computing conditional probability with Bayes' theorem
The Naive Bayes algorithm
Classification with Naive Bayes
The Laplace estimator
Using numeric features with Naive Bayes
Example – filtering mobile phone spam with the Naive Bayes algorithm
Step 1 – collecting data
Step 2 – exploring and preparing the data
Data preparation – cleaning and standardizing text data
Data preparation – splitting text documents into words
Data preparation – creating training and test datasets
Visualizing text data – word clouds
Data preparation – creating indicator features for frequent words
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Summary
5. Divide and Conquer – Classification Using Decision Trees and Rules
Understanding decision trees
Divide and conquer
The C5.0 decision tree algorithm
Choosing the best split
Pruning the decision tree
Example – identifying risky bank loans using C5.0 decision trees
Step 1 – collecting data
Step 2 – exploring and preparing the data
Data preparation – creating random training and test datasets
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Boosting the accuracy of decision trees
Making some mistakes cost more than others
Understanding classification rules
Separate and conquer
The 1R algorithm
The RIPPER algorithm
Rules from decision trees
What makes trees and rules greedy?
Example – identifying poisonous mushrooms with rule learners
Step 1 – collecting data
Step 2 – exploring and preparing the data
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Summary
6. Forecasting Numeric Data – Regression Methods
Understanding regression
Simple linear regression
Ordinary least squares estimation
Correlations
Multiple linear regression
Example – predicting medical expenses using linear regression
Step 1 – collecting data
Step 2 – exploring and preparing the data
Exploring relationships among features – the correlation matrix
Visualizing relationships among features – the scatterplot matrix
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Model specification – adding nonlinear relationships
Transformation – converting a numeric variable to a binary indicator
Model specification – adding interaction effects
Putting it all together – an improved regression model
Making predictions with a regression model
Understanding regression trees and model trees
Adding regression to trees
Example – estimating the quality of wines with regression trees and model trees
Step 1 – collecting data
Step 2 – exploring and preparing the data
Step 3 – training a model on the data
Visualizing decision trees
Step 4 – evaluating model performance
Measuring performance with the mean absolute error
Step 5 – improving model performance
Summary
7. Black Box Methods – Neural Networks and Support Vector Machines
Understanding neural networks
From biological to artificial neurons
Activation functions
Network topology
The number of layers
The direction of information travel
The number of nodes in each layer
Training neural networks with backpropagation
Example – modeling the strength of concrete with ANNs
Step 1 – collecting data
Step 2 – exploring and preparing the data
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Understanding support vector machines
Classification with hyperplanes
The case of linearly separable data
The case of nonlinearly separable data
Using kernels for nonlinear spaces
Example – performing OCR with SVMs
Step 1 – collecting data
Step 2 – e...

Table of contents

  1. Machine Learning with R - Third Edition