Applied Supervised Learning with Python
eBook - ePub

Use scikit-learn to build predictive models from real-world datasets and prepare yourself for the future of machine learning

Benjamin Johnston, Ishita Mathur

  1. 404 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

About this book

Explore the exciting world of machine learning with the fastest growing technology in the world

Key Features

  • Understand various machine learning concepts with real-world examples
  • Implement a supervised machine learning pipeline from data ingestion to validation
  • Gain insights into how you can use machine learning in everyday life

Book Description

Machine learning—the ability of a machine to give right answers based on input data—has revolutionized the way we do business. Applied Supervised Learning with Python provides a rich understanding of how you can apply machine learning techniques in your data science projects using Python. You'll explore Jupyter Notebooks, the technology used commonly in academic and commercial circles with in-line code running support.

With the help of fun examples, you'll gain experience working on the Python machine learning toolkit—from performing basic data cleaning and processing to working with a range of regression and classification algorithms. Once you've grasped the basics, you'll learn how to build and train your own models using advanced techniques such as decision trees, ensemble modeling, validation, and error metrics. You'll also learn data visualization techniques using powerful Python libraries such as Matplotlib and Seaborn.

This book also covers ensemble modeling and random forest classifiers along with other methods for combining results from multiple models, and concludes by delving into cross-validation to test your algorithm and check how well the model works on unseen data.

By the end of this book, you'll be equipped to not only work with machine learning algorithms, but also be able to create some of your own!

What you will learn

  • Understand the concept of supervised learning and its applications
  • Implement common supervised learning algorithms using machine learning Python libraries
  • Validate models using the k-fold technique
  • Build your models with decision trees to get results effortlessly
  • Use ensemble modeling techniques to improve the performance of your model
  • Apply a variety of metrics to compare machine learning models

Who this book is for

Applied Supervised Learning with Python is for you if you want to gain a solid understanding of machine learning using Python. It'll help if you have some experience in any functional or object-oriented language and a basic understanding of Python libraries and expressions, such as arrays and dictionaries.

Information

Year: 2019
ISBN: 9781789955835
Edition: 1
Subtopic: Databases

Chapter 1

Python Machine Learning Toolkit

Learning Objectives

By the end of this chapter, you will be able to:
  • Explain supervised machine learning and describe common examples of machine learning problems
  • Install and load Python libraries into your development environment for use in analysis and machine learning problems
  • Access and interpret the documentation of a subset of Python libraries, including the powerful pandas library
  • Create an IPython Jupyter notebook and use executable code cells and markdown cells to create a dynamic report
  • Load an external data source using pandas and use a variety of methods to search, filter, and compute descriptive statistics of the data
  • Clean a data source of mediocre quality and gauge the potential impact of various issues within the data source
This chapter introduces supervised learning, Jupyter notebooks, and some of the most common pandas data methods.

Introduction

The study and application of machine learning and artificial intelligence have recently attracted much interest and research in the technology and business communities. Advanced data analytics and machine learning techniques have shown great promise in advancing many sectors, such as personalized healthcare and self-driving cars, as well as in solving some of the world's greatest challenges, such as combating climate change. This book has been designed to help you take advantage of the unique confluence of events in the field of data science and machine learning today. Across the globe, private enterprises and governments are realizing the value and efficiency of data-driven products and services. At the same time, reduced hardware costs and open source software solutions are significantly lowering the barriers to learning and applying machine learning techniques.
Throughout this book, you will develop the skills required to identify, prepare, and build predictive models using supervised machine learning techniques in the Python programming language. The six chapters each cover one aspect of supervised learning. This chapter introduces a subset of the Python machine learning toolkit, as well as some of the things that need to be considered when loading and using data sources. Data exploration is covered further in Chapter 2, Exploratory Data Analysis and Visualization. Chapter 3, Regression Analysis, and Chapter 4, Classification, look at two subsets of machine learning problems, regression and classification analysis, and demonstrate these techniques through examples. Finally, Chapter 5, Ensemble Modeling, covers ensemble models, which combine multiple predictions from different models to boost overall performance, while Chapter 6, Model Evaluation, covers the extremely important concepts of validation and evaluation metrics. These metrics provide a means of estimating the true performance of a model.

Supervised Machine Learning

A machine learning algorithm is commonly thought of as simply the mathematical process (or algorithm) itself, such as a neural network, deep neural network, or random forest algorithm. However, this is only one component of the overall system. First, we must define a problem that can be adequately solved using such techniques. Then, we must specify and procure a clean dataset composed of information that can be mapped from one numerical space (the inputs) to another (the outputs). Once the dataset has been designed and procured, the machine learning model can be specified and designed; for example, a single-layer neural network with 100 hidden nodes that uses a tanh activation function.
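To make "specifying a model" concrete, the following is a minimal sketch of the example network mentioned above: one hidden layer of 100 nodes with a tanh activation. It uses only the Python standard library. The number of inputs and outputs, the random starting weights, and all function names are assumptions made for illustration; nothing here has been trained yet.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

N_INPUTS = 4    # assumed number of input features (not from the text)
N_HIDDEN = 100  # hidden nodes, as in the example above
N_OUTPUTS = 1   # assumed single output value


def init_layer(n_in, n_out):
    """Return (weights, biases) with small random starting weights."""
    weights = [[random.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


hidden_w, hidden_b = init_layer(N_INPUTS, N_HIDDEN)
output_w, output_b = init_layer(N_HIDDEN, N_OUTPUTS)


def forward(x):
    """Map one input vector to the (untrained) model output."""
    # Hidden layer: weighted sum of the inputs followed by tanh
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    # Output layer: weighted sum of the hidden activations
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(output_w, output_b)]


# Number of values that training would later have to determine
n_parameters = (N_INPUTS * N_HIDDEN + N_HIDDEN) + (N_HIDDEN * N_OUTPUTS + N_OUTPUTS)
prediction = forward([0.5, -1.2, 3.0, 0.0])
print(n_parameters, prediction)
```

The architecture fixes how many parameters exist (601 under these assumptions); their useful values are only found during training.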
With the dataset and model well defined, the means of determining the exact values of the model's parameters can be specified. This is a repetitive optimization process that evaluates the output of the model against some existing data and is commonly referred to as training. Once training has been completed and you have your defined model, it is good practice to evaluate it against some reference data to provide a benchmark of overall performance.
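The train-then-evaluate cycle described above can be sketched with a deliberately tiny model. This example assumes a straight-line model `y = w * x + b`, fitted by plain gradient descent on made-up, noise-free data; the data, learning rate, iteration count, and function names are all illustrative choices, not taken from the book.

```python
# Hypothetical data generated from y = 2x + 1 (noise-free, for illustration)
train_data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
reference_data = [(x, 2.0 * x + 1.0) for x in [5.0, 6.0]]  # held back for evaluation


def predict(w, b, x):
    """Model output for a single input value."""
    return w * x + b


def mean_squared_error(w, b, data):
    """Average squared difference between model output and known answers."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)


def train_model(data, learning_rate=0.05, n_iterations=2000):
    """Repeatedly adjust w and b to reduce the error on the training data."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(n_iterations):
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train_model(train_data)
reference_error = mean_squared_error(w, b, reference_data)
print(w, b, reference_error)
```

The key design point is that `reference_data` is never seen during training, so the error computed on it is a fairer benchmark than the training error.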
Considering this general description of a complete machine learning algorithm, the problem definition and data collection stages are often the most critical. What is the problem you are trying to solve? What outcome would you like to achieve? How are you going to achieve it? How you answer these questions will drive and define many of the subsequent decisions or model design choices. It is also in answering these questions that we decide which category of machine learning algorithm to use: supervised or unsupervised methods.
So, what exactly are supervised and unsupervised machine learning problems or methods? Supervised learning techniques center on mapping some set of information to another by providing the training process with the input information and the desired outputs, then checking its ability to provide the correct result. As an example, let's say you are the publisher of a magazine that reviews and ranks hairstyles from various time periods. Your readers frequently send you far more images of their favorite hairstyles for review than you can manually process. To save some time, you would like to automate the sorting of the hairstyle images you receive based on time periods, starting with hairstyles from the 1960s and 1980s:
Figure 1.1: Hairstyle images from different time periods
To create your hairstyle-sorting algorithm, you start by collecting a large sample of hairstyle images and manually labeling each one with its corresponding time period. In such a dataset (known as a labeled dataset), both the input data (hairstyle images) and the desired output information (time period) are known and recorded. This type of problem is a classic supervised learning problem; we are trying to develop an algorithm that takes a set of inputs and learns to return the answers that we have told it are correct.
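As a rough illustration of what a labeled dataset looks like in code, the sketch below pairs each input with its known answer and fits a very simple nearest-centroid classifier. This is a stand-in technique chosen for brevity, not the method used in the book. The two numeric features (a "volume" score and a "curl" score) and their values are invented; a real system would have to derive its features from the images themselves.

```python
# Hypothetical labeled dataset: (features, label) pairs.
# Features are invented scores (volume, curl) standing in for real images.
labeled_data = [
    ((0.9, 0.2), "1960s"),
    ((0.8, 0.3), "1960s"),
    ((0.7, 0.1), "1960s"),
    ((0.3, 0.9), "1980s"),
    ((0.2, 0.8), "1980s"),
    ((0.4, 0.7), "1980s"),
]


def fit_centroids(data):
    """Compute the average feature vector for each label."""
    sums, counts = {}, {}
    for features, label in data:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = fit_centroids(labeled_data)
first_guess = classify(centroids, (0.85, 0.25))
second_guess = classify(centroids, (0.25, 0.85))
print(first_guess, second_guess)
```

Because every training example carries its correct label, the algorithm can learn the mapping from inputs to answers, which is what makes the problem supervised.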

When to Use Supervised Learning

Generally, if you are trying to automate or replicate an existing process, the problem is a supervised learning problem. Supervised learning techniques are both very useful and powerful, and you may have come across them or even helped create labeled datasets for them without realizing it. As an example, a few years ago, Facebook introduced the ability to tag your friends in any image uploaded to the platform. To tag a friend, you would draw a square over your friend's face and then add the name of your friend to notify them of the image. Fast-forward to today and Facebook will automatically identify your friends in the image and tag them for you. This is yet another example of supervised learning. If you ever used the early tagging system and manually identified your friends in an image, you were in fact helping to create Facebook's labeled dataset. A user who uploaded an image of a person's face (the input d...
