eBook - ePub

Machine Learning Bookcamp

Name: Machine Learning Bookcamp
Author: Alexey Grigorev

Build a portfolio of real-life projects

Alexey Grigorev,

472 pages
English
ePUB (mobile friendly)
Available on iOS & Android

eBook - ePub

Machine Learning Bookcamp

Build a portfolio of real-life projects

Alexey Grigorev,

About this book

Time to flex your machine learning muscles! Take on the carefully designed challenges of the Machine Learning Bookcamp and master essential ML techniques through practical application. Summary
In Machine Learning Bookcamp you will: Collect and clean data for training models
Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
Apply ML to complex datasets with images
Deploy ML models to a production-ready environment The only way to learn is to practice! In Machine Learning Bookcamp, you'll create and deploy Python-based machine learning models for a variety of increasingly challenging projects. Taking you from the basics of machine learning to complex applications such as image analysis, each new project builds on what you've learned in previous chapters. You'll build a portfolio of business-relevant machine learning projects that hiring managers will be excited to see. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology
Master key machine learning concepts as you build actual projects! Machine learning is what you need for analyzing customer behavior, predicting price trends, evaluating risk, and much more. To master ML, you need great examples, clear explanations, and lots of practice. This book delivers all three! About the book
Machine Learning Bookcamp presents realistic, practical machine learning scenarios, along with crystal-clear coverage of key concepts. In it, you'll complete engaging projects, such as creating a car price predictor using linear regression and deploying a churn prediction service. You'll go beyond the algorithms and explore important techniques like deploying ML applications on serverless systems and serving models with Kubernetes and Kubeflow. Dig in, get your hands dirty, and have fun building your ML skills! What's inside Collect and clean data for training models
Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
Deploy ML models to a production-ready environmentAbout the reader
Python programming skills assumed. No previous machine learning knowledge is required. About the author
Alexey Grigorev is a principal data scientist at OLX Group. He runs DataTalks.Club, a community of people who love data.Table of Contents 1 Introduction to machine learning
2 Machine learning for regression
3 Machine learning for classification
4 Evaluation metrics for classification
5 Deploying machine learning models
6 Decision trees and ensemble learning
7 Neural networks and deep learning
8 Serverless deep learning
9 Serving models with Kubernetes and Kubeflow

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription.

At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.

Perlego offers two plans: Essential and Complete

Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.

Both plans are available with monthly, semester, or annual billing cycles.

We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.

Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.

Yes! You can use the Perlego app on both iOS or Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app.

Yes, you can access Machine Learning Bookcamp by Alexey Grigorev in PDF and/or ePUB format, as well as other popular books in Ciencia de la computación & Tratamiento de datos. We have over one million books available in our catalogue for you to explore.

Information

Publisher

Year

Print ISBN

eBook ISBN

Topic

Ciencia de la computación

Subtopic

Tratamiento de datos

1 Introduction to machine learning

This chapter covers

Understanding machine learning and the problems it can solve
Organizing a successful machine learning project
Training and selecting machine learning models
Performing model validation

In this chapter, we introduce machine learning and describe the cases in which it’s most helpful. We show how machine learning projects are different from traditional software engineering (rule-based solutions) and illustrate the differences by using a spam-detection system as an example.

To use machine learning to solve real-life problems, we need a way to organize machine learning projects. In this chapter, we talk about CRISP-DM: a step-by-step methodology for implementing successful machine learning projects.

Finally, we take a closer look at one of the steps of CRISP-DM—the modeling step. In this step, we train different models and select the one that solves our problem best.

1.1 Machine learning

Machine learning is part of applied mathematics and computer science. It uses tools from mathematical disciplines such as probability, statistics, and optimization theory to extract patterns from data.

The main idea behind machine learning is learning from examples: we prepare a dataset with examples, and a machine learning system “learns” from this dataset. In other words, we give the system the input and the desired output, and the system tries to figure out how to do the conversion automatically, without asking a human.

We can collect a dataset with descriptions of cars and their prices, for example. Then we provide a machine learning model with this dataset and “teach” it by showing it cars and their prices. This process is called training or sometimes fitting (figure 1.1).

Figure 1.1 A machine learning algorithm takes in input data (descriptions of cars) and desired output (the cars’ prices). Based on that data, it produces a model.

When training is done, we can use the model by asking it to predict car prices that we don’t know yet (figure 1.2).

Figure 1.2 When training is done, we have a model that can be applied to new input data (cars without prices) to produce the output (predictions of prices).

All we need for machine learning is a dataset in which for each input item (a car) we have the desired output (the price).

This process is quite different from traditional software engineering. Without machine learning, analysts and developers look at the data they have and try to find patterns manually. After that, they come up with some logic: a set of rules for converting the input data to the desired output. Then they explicitly encode these rules using a programming language such as Java or Python, and the result is called software. So, in contrast with machine learning, a human does all the difficult work (figure 1.3).

Figure 1.3 In traditional software, patterns are discovered manually and then encoded with a programming language. A human does all the work.

In summary, the difference between a traditional software system and a system based on machine learning is shown in figure 1.4. In machine learning, we give the system the input and output data, and the result is a model (code) that can transform the input into the output. The difficult work is done by the machine; we need only supervise the training process to make sure that the model is good (figure 1.4B). In contrast, in traditional systems, we first find the patterns in the data ourselves and then write code that converts the data to the desired outcome, using the manually discovered patterns (figure 1.4A).

	(A) In traditional software we discover patterns manually and encode them using a programming language.
	(B) A machine learning system discovers patterns automatically by learning from examples. After training, it produces a model that “knows” these patterns, but we still need to supervise it to make sure the model is correct.

Figure 1.4 The difference between a traditional software system and a machine learning system. In traditional software engineering, we do all the work, whereas in machine learning, we delegate pattern discovery to a machine.

1.1.1 Machine learning vs. rule-based systems

To illustrate the difference between these two approaches and to show why machine learning is helpful, let’s consider a concrete case. In this section, we talk about a spam-detection system to show this difference.

Suppose we are running an email service, and the users start complaining about unsolicited emails with advertisements. To solve this problem, we want to create a system that marks the unwanted messages as spam and forwards them to the spam folder.

The obvious way to solve the problem is to look at these emails ourselves to see whether they have any pattern. For example, we can check the sender and the content.

If we find that there’s indeed a pattern in the spam messages, we write down the discovered patterns and come up with following two simple rules to catch these messages:

If sender = [email protected], then “spam”
If title contains “buy now 50% off” and sender domain is “online.com,” then “spam”
Otherwise, “good email”

We write these rules in Python and create a spam-detection service, which we successfully deploy. At the beginning, the system works well and catches all the spam, but after a while, new spam messages start to slip through. The rules we have are no longer successful at marking these messages as spam.

To solve the problem, we analyze the content of the new messages and find that most of them contain the word deposit. So we add a new rule:

If sender = “[email protected]” then “spam”
If title contains “buy now 50% off” and sender domain is “online.com,” then “spam”
If body contains a word “deposit,” then “spam”
Otherwise, “good email”

After discovering this rule, we deploy the fix to our Python service and start catching more spam, making the users of our mail system happy.

Some time later, however, users start complaining again: some people use the word deposit with good intentions, but our system fails to recognize that fact and marks the messages as...

inside front cover
Machine Learning Bookcamp
Copyright
brief contents
contents
front matter
1 Introduction to machine learning
2 Machine learning for regression
3 Machine learning for classification
4 Evaluation metrics for classification
5 Deploying machine learning models
6 Decision trees and ensemble learning
7 Neural networks and deep learning
8 Serverless deep learning
9 Serving models with Kubernetes and Kubeflow
Appendix A. Preparing the environment
Appendix B. Introduction to Python
Appendix C. Introduction to NumPy
Appendix D. Introduction to Pandas
Appendix E. AWS SageMaker
index

9781789958294,