Machine Learning with R
eBook - ePub

Machine Learning with R

Brett Lantz

Partager le livre
  1. 396 pages
  2. English
  3. ePUB (adapté aux mobiles)
  4. Disponible sur iOS et Android
eBook - ePub

Machine Learning with R

Brett Lantz

DĂ©tails du livre
Aperçu du livre
Table des matiĂšres
Citations

À propos de ce livre

In Detail

Machine learning, at its core, is concerned with transforming data into actionable knowledge. This fact makes machine learning well-suited to the present-day era of "big data" and "data science". Given the growing prominence of Ra cross-platform, zero-cost statistical programming environmentthere has never been a better time to start applying machine learning. Whether you are new to data science or a veteran, machine learning with R offers a powerful set of methods for quickly and easily gaining insight from your data.

"Machine Learning with R" is a practical tutorial that uses hands-on examples to step through real-world application of machine learning. Without shying away from the technical details, we will explore Machine Learning with R using clear and practical examples. Well-suited to machine learning beginners or those with experience. Explore R to find the answer to all of your questions.

How can we use machine learning to transform data into action? Using practical examples, we will explore how to prepare data for analysis, choose a machine learning method, and measure the success of the process.

We will learn how to apply machine learning methods to a variety of common tasks including classification, prediction, forecasting, market basket analysis, and clustering. By applying the most effective machine learning methods to real-world problems, you will gain hands-on experience that will transform the way you think about data.

"Machine Learning with R" will provide you with the analytical tools you need to quickly gain insight from complex data.

Approach

Written as a tutorial to explore and understand the power of R for machine learning. This practical guide that covers all of the need to know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks.

Who this book is for

Intended for those who want to learn how to use R's machine learning capabilities and gain insight from your data. Perhaps you already know a bit about machine learning, but have never used R; or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

Foire aux questions

Comment puis-je résilier mon abonnement ?
Il vous suffit de vous rendre dans la section compte dans paramĂštres et de cliquer sur « RĂ©silier l’abonnement ». C’est aussi simple que cela ! Une fois que vous aurez rĂ©siliĂ© votre abonnement, il restera actif pour le reste de la pĂ©riode pour laquelle vous avez payĂ©. DĂ©couvrez-en plus ici.
Puis-je / comment puis-je télécharger des livres ?
Pour le moment, tous nos livres en format ePub adaptĂ©s aux mobiles peuvent ĂȘtre tĂ©lĂ©chargĂ©s via l’application. La plupart de nos PDF sont Ă©galement disponibles en tĂ©lĂ©chargement et les autres seront tĂ©lĂ©chargeables trĂšs prochainement. DĂ©couvrez-en plus ici.
Quelle est la différence entre les formules tarifaires ?
Les deux abonnements vous donnent un accĂšs complet Ă  la bibliothĂšque et Ă  toutes les fonctionnalitĂ©s de Perlego. Les seules diffĂ©rences sont les tarifs ainsi que la pĂ©riode d’abonnement : avec l’abonnement annuel, vous Ă©conomiserez environ 30 % par rapport Ă  12 mois d’abonnement mensuel.
Qu’est-ce que Perlego ?
Nous sommes un service d’abonnement Ă  des ouvrages universitaires en ligne, oĂč vous pouvez accĂ©der Ă  toute une bibliothĂšque pour un prix infĂ©rieur Ă  celui d’un seul livre par mois. Avec plus d’un million de livres sur plus de 1 000 sujets, nous avons ce qu’il vous faut ! DĂ©couvrez-en plus ici.
Prenez-vous en charge la synthÚse vocale ?
Recherchez le symbole Écouter sur votre prochain livre pour voir si vous pouvez l’écouter. L’outil Écouter lit le texte Ă  haute voix pour vous, en surlignant le passage qui est en cours de lecture. Vous pouvez le mettre sur pause, l’accĂ©lĂ©rer ou le ralentir. DĂ©couvrez-en plus ici.
Est-ce que Machine Learning with R est un PDF/ePUB en ligne ?
Oui, vous pouvez accĂ©der Ă  Machine Learning with R par Brett Lantz en format PDF et/ou ePUB ainsi qu’à d’autres livres populaires dans Computer Science et Data Processing. Nous disposons de plus d’un million d’ouvrages Ă  dĂ©couvrir dans notre catalogue.

Informations

Année
2013
ISBN
9781782162148

Machine Learning with R


Table of Contents

Machine Learning with R
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Introducing Machine Learning
The origins of machine learning
Uses and abuses of machine learning
Ethical considerations
How do machines learn?
Abstraction and knowledge representation
Generalization
Assessing the success of learning
Steps to apply machine learning to your data
Choosing a machine learning algorithm
Thinking about the input data
Thinking about types of machine learning algorithms
Matching your data to an appropriate algorithm
Using R for machine learning
Installing and loading R packages
Installing an R package
Installing a package using the point-and-click interface
Loading an R package
Summary
2. Managing and Understanding Data
R data structures
Vectors
Factors
Lists
Data frames
Matrixes and arrays
Managing data with R
Saving and loading R data structures
Importing and saving data from CSV files
Importing data from SQL databases
Exploring and understanding data
Exploring the structure of data
Exploring numeric variables
Measuring the central tendency – mean and median
Measuring spread – quartiles and the five-number summary
Visualizing numeric variables – boxplots
Visualizing numeric variables – histograms
Understanding numeric data – uniform and normal distributions
Measuring spread – variance and standard deviation
Exploring categorical variables
Measuring the central tendency – the mode
Exploring relationships between variables
Visualizing relationships – scatterplots
Examining relationships – two-way cross-tabulations
Summary
3. Lazy Learning – Classification Using Nearest Neighbors
Understanding classification using nearest neighbors
The kNN algorithm
Calculating distance
Choosing an appropriate k
Preparing data for use with kNN
Why is the kNN algorithm lazy?
Diagnosing breast cancer with the kNN algorithm
Step 1 – collecting data
Step 2 – exploring and preparing the data
Transformation – normalizing numeric data
Data preparation – creating training and test datasets
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Transformation – z-score standardization
Testing alternative values of k
Summary
4. Probabilistic Learning – Classification Using Naive Bayes
Understanding naive Bayes
Basic concepts of Bayesian methods
Probability
Joint probability
Conditional probability with Bayes' theorem
The naive Bayes algorithm
The naive Bayes classification
The Laplace estimator
Using numeric features with naive Bayes
Example – filtering mobile phone spam with the naive Bayes algorithm
Step 1 – collecting data
Step 2 – exploring and preparing the data
Data preparation – processing text data for analysis
Data preparation – creating training and test datasets
Visualizing text data – word clouds
Data preparation – creating indicator features for frequent words
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Summary
5. Divide and Conquer – Classification Using Decision Trees and Rules
Understanding decision trees
Divide and conquer
The C5.0 decision tree algorithm
Choosing the best split
Pruning the decision tree
Example – identifying risky bank loans using C5.0 decision trees
Step 1 – collecting data
Step 2 – exploring and preparing the data
Data preparation – creating random training and test datasets
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Boosting the accuracy of decision trees
Making some mistakes more costly than others
Understanding classification rules
Separate and conquer
The One Rule algorithm
The RIPPER algorithm
Rules from decision trees
Example – identifying poisonous mushrooms with rule learners
Step 1 – collecting data
Step 2 – exploring and preparing the data
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Summary
6. Forecasting Numeric Data – Regression Methods
Understanding regression
Simple linear regression
Ordinary least squares estimation
Correlations
Multiple linear regression
Example – predicting medical expenses using linear regression
Step 1 – collecting data
Step 2 – exploring and preparing the data
Exploring relationships among features – the correlation matrix
Visualizing relationships among features – the scatterplot matrix
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Model specification – adding non-linear relationships
Transformation – converting a numeric variable to a binary indicator
Model specification – adding interaction effects
Putting it all together – an improved regression model
Understanding regression trees and model trees
Adding regression to trees
Example – estimating the quality of wines with regression trees and model trees
Step 1 – collecting data
Step 2 – exploring and preparing the data
Step 3 – training a model on the data
Visualizing decision trees
Step 4 – evaluating model performance
Measuring performance with mean absolute error
Step 5 – improving model performance
Summary
7. Black Box Methods – Neural Networks and Support Vector Machines
Understanding neural networks
From biological to artificial neurons
Activation functions
Network topology
The number of layers
The direction of information travel
The number of nodes in each layer
Training neural networks with backpropagation
Modeling the strength of concrete with ANNs
Step 1 – collecting data
Step 2 – exploring and preparing the data
Step 3 – training a model on the data
Step 4 – evaluating model performance
Step 5 – improving model performance
Understanding Support Vector Machines
Classification with hyperplanes
Finding the maximum margin
Th...

Table des matiĂšres