Hands-On Ensemble Learning with Python

Build highly optimized ensemble machine learning models using scikit-learn and Keras

George Kyriakides, Konstantinos G. Margaritis

  1. 298 pages
  2. English

About this book

Combine popular machine learning techniques to create ensemble models using Python

Key Features

  • Implement ensemble models using algorithms such as random forests and AdaBoost
  • Apply boosting, bagging, and stacking ensemble methods to improve the prediction accuracy of your model
  • Explore real-world data sets and practical examples coded in scikit-learn and Keras

Book Description

Ensembling is a technique of combining two or more similar or dissimilar machine learning algorithms to create a model that delivers superior predictive power. This book will demonstrate how you can use a variety of weak algorithms to make a strong predictive model.

With its hands-on approach, you'll get up to speed not only on the basic theory but also on the application of various ensemble learning techniques. Using examples and real-world datasets, you'll be able to produce better machine learning models to solve supervised learning problems such as classification and regression. Furthermore, you'll go on to apply ensemble learning techniques to unsupervised problems such as clustering. As you progress, the chapters will cover different machine learning algorithms that are widely used in practice to make predictions and classifications. You'll even get to grips with the use of Python libraries such as scikit-learn and Keras for implementing different ensemble models.

By the end of this book, you will be well-versed in ensemble learning, and have the skills you need to understand which ensemble method is required for which problem, and successfully implement them in real-world scenarios.

What you will learn

  • Implement ensemble methods to generate models with high accuracy
  • Overcome challenges such as bias and variance
  • Explore machine learning algorithms to evaluate model performance
  • Understand how to construct, evaluate, and apply ensemble models
  • Analyze tweets in real time using Twitter's streaming API
  • Use Keras to build an ensemble of neural networks for the MovieLens dataset

Who this book is for

This book is for data analysts, data scientists, machine learning engineers, and other professionals who are looking to generate advanced models using ensemble techniques. An understanding of Python code and basic knowledge of statistics are required to make the most of this book.

Information

Year: 2019
ISBN: 9781789617887

Section 1: Introduction and Required Software Tools

This section is a refresher on basic machine learning concepts and an introduction to ensemble learning. We will review machine learning and the concepts that go with it, such as train and test sets and supervised and unsupervised learning. We will also introduce the concept of ensemble learning.
This section comprises the following chapters:
  • Chapter 1, A Machine Learning Refresher
  • Chapter 2, Getting Started with Ensemble Learning

A Machine Learning Refresher

Machine learning is a subfield of artificial intelligence (AI) focused on developing algorithms and techniques that enable computers to learn from massive amounts of data. Given the increasing rate at which data is produced, machine learning has played a critical role in solving difficult problems in recent years. This success was the main driving force behind the funding and development of many great machine learning libraries that make use of data in order to build predictive models. Furthermore, businesses have started to realize the potential of machine learning, driving the demand for data scientists and machine learning engineers, who can design better-performing predictive models, to new heights.
This chapter serves as a refresher on the main concepts and terminology, as well as an introduction to the frameworks that will be used throughout the book, in order to approach ensemble learning with a solid foundation.
The main topics covered in this chapter are the following:
  • The various machine learning problems and datasets
  • How to evaluate the performance of a predictive model
  • Machine learning algorithms
  • Python environment setup and the required libraries

Technical requirements

You will require basic knowledge of machine learning techniques and algorithms. Furthermore, knowledge of Python conventions and syntax is required. Finally, familiarity with the NumPy library will greatly help the reader to understand some custom algorithm implementations.
The code files of this chapter can be found on GitHub:
https://github.com/PacktPublishing/Hands-On-Ensemble-Learning-with-Python/tree/master/Chapter01
Check out the following video to see the Code in Action: http://bit.ly/30u8sv8.

Learning from data

Data is the raw ingredient of machine learning. Processing data can produce information; for example, measuring the height of a portion of a school's students (data) and calculating their average (processing) can give us an idea of the average height across the whole school (information). If we process the data further, for example, by grouping males and females and calculating one average for each group, we gain more information, as we then have an idea of the average height of the school's males and females. Machine learning strives to produce the most information possible from any given data. In this example, we produced a very basic predictive model: by calculating the two averages, we can predict the height of any student (as the average of their group) just by knowing whether the student is male or female.
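The group-average model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements are invented, and the names fit_group_means and predict are hypothetical helpers, not taken from the book's code files.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements (sex, height in cm), invented for illustration.
measurements = [
    ("male", 112.0),
    ("female", 108.0),
    ("male", 118.0),
    ("female", 110.0),
    ("male", 115.0),
    ("female", 106.0),
]


def fit_group_means(data):
    """Return a dict mapping each group label to the mean height of that group."""
    heights_by_group = defaultdict(list)
    for sex, height in data:
        heights_by_group[sex].append(height)
    return {sex: mean(heights) for sex, heights in heights_by_group.items()}


def predict(model, sex):
    """Predict a student's height as the mean height of the student's group."""
    return model[sex]


# "Processing" the data: one overall average, then one average per group.
overall_mean = mean(height for _, height in measurements)
model = fit_group_means(measurements)

print(overall_mean)              # average over all measured students
print(predict(model, "male"))    # prediction for a male student
print(predict(model, "female"))  # prediction for a female student
```

Storing one number per group is the whole "model" here; anything the grouping does not capture (age, for instance) is simply ignored by the prediction.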
The set of data that a machine learning algorithm is tasked with processing is called the problem's dataset. In our example, the dataset consists of height measurements (in centimeters) and each student's sex (male/female). In machine learning, input variables are called features and output variables are called targets. In this dataset, the features of our predictive model consist solely of the students' sex, while our target is the students' height in centimeters. The predictive model that is produced and maps features to targets will be referred to simply as the model from now on, unless otherwise specified. Each data point is called an instance. In this problem, each student is an instance of the dataset.
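To make the terminology concrete, the following sketch splits a tiny dataset into features and targets. The values and the key names sex and height_cm are assumptions made for illustration only.

```python
# Hypothetical dataset: each dictionary is one instance (one student).
instances = [
    {"sex": "male", "height_cm": 112.0},
    {"sex": "female", "height_cm": 108.0},
    {"sex": "male", "height_cm": 118.0},
    {"sex": "female", "height_cm": 110.0},
]

# Features: the model's inputs, one row per instance (here a single feature).
features = [[instance["sex"]] for instance in instances]

# Targets: the values the model should predict, one per instance.
targets = [instance["height_cm"] for instance in instances]

for row, target in zip(features, targets):
    print(row, "->", target)
```

Keeping features as a list of rows, even with a single column, mirrors the two-dimensional layout that libraries such as NumPy and scikit-learn generally expect for inputs.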
When the target is a continuous variable (a number), it presents a regression problem, as the aim is to regress the target on the features. When the target is a set of categories, it presents a classification problem, as we try to assign each instance to a category or class.
Note that, in classification problems, the target class can be represented by a number; this does not mean that it is a regression problem. The most useful way to determine whether it is a regression problem is to think about whether the instances can be ordered by their targets. In our example, the target is height, so we can order the students from tallest to shortest, as 100 cm is less than 110 cm. As a counterexample, if the target were their favorite color, we could represent each color by a number, but we could not order them. Even if we represented red as one and blue as two, we could not say that red is "before" or "less than" blue. Thus, this counterexample is a classification problem.
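The ordering test above can be illustrated with a short sketch. The heights, the colors, and both integer encodings are invented for this example; the point is only that a numeric code for a category carries no order of its own.

```python
# Regression-style target: heights have a meaningful order.
heights_cm = [110.0, 100.0, 105.0]
sorted_heights = sorted(heights_cm)

# Classification-style target: favorite colors, encoded as integers.
colors = ["red", "blue", "red", "green"]


def encode(labels, mapping):
    """Replace each label with its integer code from the given mapping."""
    return [mapping[label] for label in labels]


# Two equally valid, arbitrary encodings of the same classes.
encoding_a = {"red": 1, "blue": 2, "green": 3}
encoding_b = {"red": 3, "blue": 1, "green": 2}

codes_a = encode(colors, encoding_a)
codes_b = encode(colors, encoding_b)

# Whether red comes "before" blue depends entirely on the chosen encoding,
# so the numeric order says nothing about the colors themselves.
red_before_blue_a = encoding_a["red"] < encoding_a["blue"]
red_before_blue_b = encoding_b["red"] < encoding_b["blue"]

print(sorted_heights)
print(codes_a, red_before_blue_a)
print(codes_b, red_before_blue_b)
```

Sorting the heights gives the same ranking of students however the measurements are stored, whereas swapping the color encoding reverses the apparent order of red and blue.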

Popular machine lear...
