eBook - ePub

Machine Learning with R

Name: Machine Learning with R
Author: Brett Lantz

396 pages
English
ePUB (mobile-friendly)
Available on iOS and Android

About this book

In Detail

Machine learning, at its core, is concerned with transforming data into actionable knowledge. This fact makes machine learning well-suited to the present-day era of "big data" and "data science". Given the growing prominence of R (a cross-platform, zero-cost statistical programming environment), there has never been a better time to start applying machine learning. Whether you are new to data science or a veteran, machine learning with R offers a powerful set of methods for quickly and easily gaining insight from your data.

"Machine Learning with R" is a practical tutorial that uses hands-on examples to step through real-world applications of machine learning. Without shying away from the technical details, we will explore machine learning with R using clear and practical examples. The book is well suited both to machine learning beginners and to those with experience. Explore R to find the answer to all of your questions.

How can we use machine learning to transform data into action? Using practical examples, we will explore how to prepare data for analysis, choose a machine learning method, and measure the success of the process.

We will learn how to apply machine learning methods to a variety of common tasks including classification, prediction, forecasting, market basket analysis, and clustering. By applying the most effective machine learning methods to real-world problems, you will gain hands-on experience that will transform the way you think about data.

"Machine Learning with R" will provide you with the analytical tools you need to quickly gain insight from complex data.

Approach

This book is written as a tutorial for exploring and understanding the power of R for machine learning. It is a practical guide that covers all of the need-to-know topics in a systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply these approaches to your own data science tasks.

Who this book is for

This book is intended for those who want to learn how to use R's machine learning capabilities and gain insight from their data. Perhaps you already know a bit about machine learning but have never used R, or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

Information

Publisher: Packt Publishing
Year: 2013
ISBN: 9781782162148
Subject: Computer Science
Sub-subject: Data Processing

Machine Learning with R

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers and more

Why Subscribe?

Free Access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Introducing Machine Learning

The origins of machine learning

Uses and abuses of machine learning

Ethical considerations

How do machines learn?

Abstraction and knowledge representation

Generalization

Assessing the success of learning

Steps to apply machine learning to your data

Choosing a machine learning algorithm

Thinking about the input data

Thinking about types of machine learning algorithms

Matching your data to an appropriate algorithm

Using R for machine learning

Installing and loading R packages

Installing an R package

Installing a package using the point-and-click interface

Loading an R package

Summary

2. Managing and Understanding Data

R data structures

Vectors

Factors

Lists

Data frames

Matrixes and arrays

Managing data with R

Saving and loading R data structures

Importing and saving data from CSV files

Importing data from SQL databases

Exploring and understanding data

Exploring the structure of data

Exploring numeric variables

Measuring the central tendency – mean and median

Measuring spread – quartiles and the five-number summary

Visualizing numeric variables – boxplots

Visualizing numeric variables – histograms

Understanding numeric data – uniform and normal distributions

Measuring spread – variance and standard deviation

Exploring categorical variables

Measuring the central tendency – the mode

Exploring relationships between variables

Visualizing relationships – scatterplots

Examining relationships – two-way cross-tabulations

Summary

3. Lazy Learning – Classification Using Nearest Neighbors

Understanding classification using nearest neighbors

The kNN algorithm

Calculating distance

Choosing an appropriate k

Preparing data for use with kNN

Why is the kNN algorithm lazy?

Diagnosing breast cancer with the kNN algorithm

Step 1 – collecting data

Step 2 – exploring and preparing the data

Transformation – normalizing numeric data

Data preparation – creating training and test datasets

Step 3 – training a model on the data

Step 4 – evaluating model performance

Step 5 – improving model performance

Transformation – z-score standardization

Testing alternative values of k

Summary

4. Probabilistic Learning – Classification Using Naive Bayes

Understanding naive Bayes

Basic concepts of Bayesian methods

Probability

Joint probability

Conditional probability with Bayes' theorem

The naive Bayes algorithm

The naive Bayes classification

The Laplace estimator

Using numeric features with naive Bayes

Example – filtering mobile phone spam with the naive Bayes algorithm

Step 1 – collecting data

Step 2 – exploring and preparing the data

Data preparation – processing text data for analysis

Data preparation – creating training and test datasets

Visualizing text data – word clouds

Data preparation – creating indicator features for frequent words

Step 3 – training a model on the data

Step 4 – evaluating model performance

Step 5 – improving model performance

Summary

5. Divide and Conquer – Classification Using Decision Trees and Rules

Understanding decision trees

Divide and conquer

The C5.0 decision tree algorithm

Choosing the best split

Pruning the decision tree

Example – identifying risky bank loans using C5.0 decision trees

Step 1 – collecting data

Step 2 – exploring and preparing the data

Data preparation – creating random training and test datasets

Step 3 – training a model on the data

Step 4 – evaluating model performance

Step 5 – improving model performance

Boosting the accuracy of decision trees

Making some mistakes more costly than others

Understanding classification rules

Separate and conquer

The One Rule algorithm

The RIPPER algorithm

Rules from decision trees

Example – identifying poisonous mushrooms with rule learners

Step 1 – collecting data

Step 2 – exploring and preparing the data

Step 3 – training a model on the data

Step 4 – evaluating model performance

Step 5 – improving model performance

Summary

6. Forecasting Numeric Data – Regression Methods

Understanding regression

Simple linear regression

Ordinary least squares estimation

Correlations

Multiple linear regression

Example – predicting medical expenses using linear regression

Step 1 – collecting data

Step 2 – exploring and preparing the data

Exploring relationships among features – the correlation matrix

Visualizing relationships among features – the scatterplot matrix

Step 3 – training a model on the data

Step 4 – evaluating model performance

Step 5 – improving model performance

Model specification – adding non-linear relationships

Transformation – converting a numeric variable to a binary indicator

Model specification – adding interaction effects

Putting it all together – an improved regression model

Understanding regression trees and model trees

Adding regression to trees

Example – estimating the quality of wines with regression trees and model trees

Step 1 – collecting data

Step 2 – exploring and preparing the data

Step 3 – training a model on the data

Visualizing decision trees

Step 4 – evaluating model performance

Measuring performance with mean absolute error

Step 5 – improving model performance

Summary

7. Black Box Methods – Neural Networks and Support Vector Machines

Understanding neural networks

From biological to artificial neurons

Activation functions

Network topology

The number of layers

The direction of information travel

The number of nodes in each layer

Training neural networks with backpropagation

Modeling the strength of concrete with ANNs

Step 1 – collecting data

Step 2 – exploring and preparing the data

Step 3 – training a model on the data

Step 4 – evaluating model performance

Step 5 – improving model performance

Understanding Support Vector Machines

Classification with hyperplanes

Finding the maximum margin

Th...
