eBook - ePub

Building Machine Learning Systems with Python

Author: Willi Richert, Luis Pedro Coelho

290 pages
English
ePUB (mobile-friendly)
Available on iOS and Android

About this book

In Detail

Machine learning, the field of building systems that learn from data, is exploding on the Web and elsewhere. Python is a wonderful language in which to develop machine learning applications. As a dynamic language, it allows for fast exploration and experimentation, and an increasing number of machine learning libraries are being developed for it.

Building Machine Learning Systems with Python shows you how to find patterns in raw data. The book starts by brushing up on your Python machine learning knowledge and introducing the libraries, then moves on to more serious projects covering datasets, modeling, recommendations and how to improve them, and sound and image processing.

Using open-source tools and libraries, you will learn how to apply machine learning methods to text, images, and sounds. You will also learn how to evaluate, compare, and choose machine learning techniques.

Written for Python programmers, Building Machine Learning Systems with Python teaches you how to use open-source libraries to solve real problems with machine learning. The book is based on real-world examples that the user can build on.

Readers will learn how to write programs that classify the quality of StackOverflow answers or decide whether a music file is Jazz or Metal. They will learn regression, demonstrated by recommending movies to users. Advanced topics such as topic modeling (finding a text's most important topics), basket analysis, and cloud computing are covered as well.

Building Machine Learning Systems with Python will give you the tools and understanding required to build your own systems, tailored to your own problems.

Approach

A practical, scenario-based tutorial, this book will help you get to grips with machine learning in Python and start building your own machine learning projects. By the end of the book you will have learned the critical aspects of Python machine learning projects and experienced the power of ML-based systems by working on them yourself.

Who this book is for

This book is for Python programmers who are new to machine learning and want to learn it. Readers are expected to know Python and be able to install and use open-source libraries. They are not expected to know machine learning, although the book can also serve as an introduction to some Python libraries for readers who already do. This book does not go into the details of the mathematics behind the algorithms.

This book primarily targets Python developers who want to learn machine learning and build it into new projects, or who want to add machine learning support to existing projects and see it implemented effectively.

Information

Publisher: Packt Publishing

Year: 2013

ISBN: 9781782161400

Subject: Computer Science

Sub-subject: Programming in Python

Building Machine Learning Systems with Python

Credits

About the Authors

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers and more

Why Subscribe?

Free Access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Getting Started with Python Machine Learning

Machine learning and Python – the dream team

What the book will teach you (and what it will not)

What to do when you are stuck

Getting started

Introduction to NumPy, SciPy, and Matplotlib

Installing Python

Chewing data efficiently with NumPy and intelligently with SciPy

Learning NumPy

Indexing

Handling non-existing values

Comparing runtime behaviors

Learning SciPy

Our first (tiny) machine learning application

Reading in the data

Preprocessing and cleaning the data

Choosing the right model and learning algorithm

Before building our first model

Starting with a simple straight line

Towards some advanced stuff

Stepping back to go forward – another look at our data

Training and testing

Answering our initial question

Summary

2. Learning How to Classify with Real-world Examples

The Iris dataset

The first step is visualization

Building our first classification model

Evaluation – holding out data and cross-validation

Building more complex classifiers

A more complex dataset and a more complex classifier

Learning about the Seeds dataset

Features and feature engineering

Nearest neighbor classification

Binary and multiclass classification

Summary

3. Clustering – Finding Related Posts

Measuring the relatedness of posts

How not to do it

How to do it

Preprocessing – similarity measured as similar number of common words

Converting raw text into a bag-of-words

Counting words

Normalizing the word count vectors

Removing less important words

Stemming

Installing and using NLTK

Extending the vectorizer with NLTK's stemmer

Stop words on steroids

Our achievements and goals

Clustering

KMeans

Getting test data to evaluate our ideas on

Clustering posts

Solving our initial challenge

Another look at noise

Tweaking the parameters

Summary

4. Topic Modeling

Latent Dirichlet allocation (LDA)

Building a topic model

Comparing similarity in topic space

Modeling the whole of Wikipedia

Choosing the number of topics

Summary

5. Classification – Detecting Poor Answers

Sketching our roadmap

Learning to classify classy answers

Tuning the instance

Tuning the classifier

Fetching the data

Slimming the data down to chewable chunks

Preselection and processing of attributes

Defining what is a good answer

Creating our first classifier

Starting with the k-nearest neighbor (kNN) algorithm

Engineering the features

Training the classifier

Measuring the classifier's performance

Designing more features

Deciding how to improve

Bias-variance and its trade-off

Fixing high bias

Fixing high variance

High bias or low bias

Using logistic regression

A bit of math with a small example

Applying logistic regression to our post classification problem

Looking behind accuracy – precision and recall

Slimming the classifier

Ship it!

Summary

6. Classification II – Sentiment Analysis

Sketching our roadmap

Fetching the Twitter data

Introducing the Naive Bayes classifier

Getting to know the Bayes theorem

Being naive

Using Naive Bayes to classify

Accounting for unseen words and other oddities

Accounting for arithmetic underflows

Creating our first classifier and tuning it

Solving an easy problem first

Using all the classes

Tuning the classifier's parameters

Cleaning tweets

Taking the word types into account

Determining the word types

Successfully cheating using SentiWordNet

Our first estimator

Putting everything together

Summary

7. Regression – Recommendations

Predicting house prices with regression

Multidimensional regression

Cross-validation for regression

Penalized regression

L1 and L2 penalties

Using Lasso or Elastic nets in scikit-learn

P greater than N scenarios

An example based on text

Setting hyperparameters in a smart way

Rating prediction and recommendations

Summary

8. Regression – Recommendations Improved

Improved recommendations

Using the binary matrix of recommendations

Looking at the movie neighbors

Combining multiple methods

Basket analysis

Obtaining useful predictions

Analyzing supermarket shopping baskets

Association rule mining

More advanced basket analysis

Summary

9. Classification III – Music Genre Classification

Sketching our roadmap

Fetching the music data

Converting into a wave format

Looking at music

Decomposing music into sine wave components

Using FFT to build our first classifier

Increasing experimentation agility

Training the classifier

Using the confusion matrix to measure accuracy in multiclass problems

An alternate way to measure classifier performance using receiver operator characteristic (ROC)

Improving classification performance with Mel Frequency Cepstral Coefficients

Summary

10. Computer Vision – Pattern Recognition

Introducing image processing

Loading and displaying images

Basic image processing

Thresholding

Gaussian blurring

Filtering for different effects

Adding salt and pepper noise

Putting the center in focus

Pattern recognition

Computing features from images

Writing your own features

Classifying a harder dataset

Local feature representations

Summary

11. Dimensionality Reduction

Sketching our roadmap

Selecting features

Detecting redundant features using filters

Correlation

Mutual information

Asking the model about the features using wrappers

Other feature selection methods

Feature extraction

About principal component analysis (PCA)

Sketching PCA

Applying PCA

Limitations of PCA and how LDA can help

Multidimensional scaling (MDS)

Summary

12. Big(ger) Data

Learning about big data

Using jug to break up your pipeline into tasks

About tasks

Reusing partial results

Looking un...
