Machine Learning Bookcamp
eBook - ePub

Machine Learning Bookcamp

Build a portfolio of real-life projects

Alexey Grigorev

Condividi libro
  1. 472 pagine
  2. English
  3. ePUB (disponibile sull'app)
  4. Disponibile su iOS e Android
eBook - ePub

Machine Learning Bookcamp

Build a portfolio of real-life projects

Alexey Grigorev

Dettagli del libro
Anteprima del libro
Indice dei contenuti
Citazioni

Informazioni sul libro

Time to flex your machine learning muscles! Take on the carefully designed challenges of the Machine Learning Bookcamp and master essential ML techniques through practical application. Summary
In Machine Learning Bookcamp you will: Collect and clean data for training models
Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
Apply ML to complex datasets with images
Deploy ML models to a production-ready environment The only way to learn is to practice! In Machine Learning Bookcamp, you'll create and deploy Python-based machine learning models for a variety of increasingly challenging projects. Taking you from the basics of machine learning to complex applications such as image analysis, each new project builds on what you've learned in previous chapters. You'll build a portfolio of business-relevant machine learning projects that hiring managers will be excited to see. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology
Master key machine learning concepts as you build actual projects! Machine learning is what you need for analyzing customer behavior, predicting price trends, evaluating risk, and much more. To master ML, you need great examples, clear explanations, and lots of practice. This book delivers all three! About the book
Machine Learning Bookcamp presents realistic, practical machine learning scenarios, along with crystal-clear coverage of key concepts. In it, you'll complete engaging projects, such as creating a car price predictor using linear regression and deploying a churn prediction service. You'll go beyond the algorithms and explore important techniques like deploying ML applications on serverless systems and serving models with Kubernetes and Kubeflow. Dig in, get your hands dirty, and have fun building your ML skills! What's inside Collect and clean data for training models
Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
Deploy ML models to a production-ready environmentAbout the reader
Python programming skills assumed. No previous machine learning knowledge is required. About the author
Alexey Grigorev is a principal data scientist at OLX Group. He runs DataTalks.Club, a community of people who love data.Table of Contents 1 Introduction to machine learning
2 Machine learning for regression
3 Machine learning for classification
4 Evaluation metrics for classification
5 Deploying machine learning models
6 Decision trees and ensemble learning
7 Neural networks and deep learning
8 Serverless deep learning
9 Serving models with Kubernetes and Kubeflow

Domande frequenti

Come faccio ad annullare l'abbonamento?
È semplicissimo: basta accedere alla sezione Account nelle Impostazioni e cliccare su "Annulla abbonamento". Dopo la cancellazione, l'abbonamento rimarrà attivo per il periodo rimanente già pagato. Per maggiori informazioni, clicca qui
È possibile scaricare libri? Se sì, come?
Al momento è possibile scaricare tramite l'app tutti i nostri libri ePub mobile-friendly. Anche la maggior parte dei nostri PDF è scaricabile e stiamo lavorando per rendere disponibile quanto prima il download di tutti gli altri file. Per maggiori informazioni, clicca qui
Che differenza c'è tra i piani?
Entrambi i piani ti danno accesso illimitato alla libreria e a tutte le funzionalità di Perlego. Le uniche differenze sono il prezzo e il periodo di abbonamento: con il piano annuale risparmierai circa il 30% rispetto a 12 rate con quello mensile.
Cos'è Perlego?
Perlego è un servizio di abbonamento a testi accademici, che ti permette di accedere a un'intera libreria online a un prezzo inferiore rispetto a quello che pagheresti per acquistare un singolo libro al mese. Con oltre 1 milione di testi suddivisi in più di 1.000 categorie, troverai sicuramente ciò che fa per te! Per maggiori informazioni, clicca qui.
Perlego supporta la sintesi vocale?
Cerca l'icona Sintesi vocale nel prossimo libro che leggerai per verificare se è possibile riprodurre l'audio. Questo strumento permette di leggere il testo a voce alta, evidenziandolo man mano che la lettura procede. Puoi aumentare o diminuire la velocità della sintesi vocale, oppure sospendere la riproduzione. Per maggiori informazioni, clicca qui.
Machine Learning Bookcamp è disponibile online in formato PDF/ePub?
Sì, puoi accedere a Machine Learning Bookcamp di Alexey Grigorev in formato PDF e/o ePub, così come ad altri libri molto apprezzati nelle sezioni relative a Informatique e Traitement des données. Scopri oltre 1 milione di libri disponibili nel nostro catalogo.

Informazioni

Editore
Manning
Anno
2021
ISBN
9781638351054

1 Introduction to machine learning

This chapter covers
  • Understanding machine learning and the problems it can solve
  • Organizing a successful machine learning project
  • Training and selecting machine learning models
  • Performing model validation
In this chapter, we introduce machine learning and describe the cases in which it’s most helpful. We show how machine learning projects are different from traditional software engineering (rule-based solutions) and illustrate the differences by using a spam-detection system as an example.
To use machine learning to solve real-life problems, we need a way to organize machine learning projects. In this chapter, we talk about CRISP-DM: a step-by-step methodology for implementing successful machine learning projects.
Finally, we take a closer look at one of the steps of CRISP-DM—the modeling step. In this step, we train different models and select the one that solves our problem best.

1.1 Machine learning

Machine learning is part of applied mathematics and computer science. It uses tools from mathematical disciplines such as probability, statistics, and optimization theory to extract patterns from data.
The main idea behind machine learning is learning from examples: we prepare a dataset with examples, and a machine learning system “learns” from this dataset. In other words, we give the system the input and the desired output, and the system tries to figure out how to do the conversion automatically, without asking a human.
We can collect a dataset with descriptions of cars and their prices, for example. Then we provide a machine learning model with this dataset and “teach” it by showing it cars and their prices. This process is called training or sometimes fitting (figure 1.1).
Figure 1.1 A machine learning algorithm takes in input data (descriptions of cars) and desired output (the cars’ prices). Based on that data, it produces a model.
When training is done, we can use the model by asking it to predict car prices that we don’t know yet (figure 1.2).
Figure 1.2 When training is done, we have a model that can be applied to new input data (cars without prices) to produce the output (predictions of prices).
All we need for machine learning is a dataset in which for each input item (a car) we have the desired output (the price).
This process is quite different from traditional software engineering. Without machine learning, analysts and developers look at the data they have and try to find patterns manually. After that, they come up with some logic: a set of rules for converting the input data to the desired output. Then they explicitly encode these rules using a programming language such as Java or Python, and the result is called software. So, in contrast with machine learning, a human does all the difficult work (figure 1.3).
Figure 1.3 In traditional software, patterns are discovered manually and then encoded with a programming language. A human does all the work.
In summary, the difference between a traditional software system and a system based on machine learning is shown in figure 1.4. In machine learning, we give the system the input and output data, and the result is a model (code) that can transform the input into the output. The difficult work is done by the machine; we need only supervise the training process to make sure that the model is good (figure 1.4B). In contrast, in traditional systems, we first find the patterns in the data ourselves and then write code that converts the data to the desired outcome, using the manually discovered patterns (figure 1.4A).

(A) In traditional software we discover patterns manually and encode them using a programming language.
(B) A machine learning system discovers patterns automatically by learning from examples. After training, it produces a model that “knows” these patterns, but we still need to supervise it to make sure the model is correct.
Figure 1.4 The difference between a traditional software system and a machine learning system. In traditional software engineering, we do all the work, whereas in machine learning, we delegate pattern discovery to a machine.

1.1.1 Machine learning vs. rule-based systems

To illustrate the difference between these two approaches and to show why machine learning is helpful, let’s consider a concrete case. In this section, we talk about a spam-detection system to show this difference.
Suppose we are running an email service, and the users start complaining about unsolicited emails with advertisements. To solve this problem, we want to create a system that marks the unwanted messages as spam and forwards them to the spam folder.
The obvious way to solve the problem is to look at these emails ourselves to see whether they have any pattern. For example, we can check the sender and the content.
If we find that there’s indeed a pattern in the spam messages, we write down the discovered patterns and come up with following two simple rules to catch these messages:
  • If sender = [email protected], then “spam”
  • If title contains “buy now 50% off” and sender domain is “online.com,” then “spam”
  • Otherwise, “good email”
We write these rules in Python and create a spam-detection service, which we successfully deploy. At the beginning, the system works well and catches all the spam, but after a while, new spam messages start to slip through. The rules we have are no longer successful at marking these messages as spam.
To solve the problem, we analyze the content of the new messages and find that most of them contain the word deposit. So we add a new rule:
  • If sender = “[email protected]” then “spam”
  • If title contains “buy now 50% off” and sender domain is “online.com,” then “spam”
  • If body contains a word “deposit,” then “spam”
  • Otherwise, “good email”
After discovering this rule, we deploy the fix to our Python service and start catching more spam, making the users of our mail system happy.
Some time later, however, users start complaining again: some people use the word deposit with good intentions, but our system fails to recognize that fact and marks the messages as...

Indice dei contenuti