eBook - ePub

Automated Machine Learning with Microsoft Azure

Name: Automated Machine Learning with Microsoft Azure
Author: Dennis Michael Sawyers

Dennis Michael Sawyers

340 pages
English
ePUB (mobile friendly)

About This Book

A practical, step-by-step guide to using Microsoft's AutoML technology on the Azure Machine Learning service for developers and data scientists working with the Python programming language

Key Features

Create, deploy, productionalize, and scale automated machine learning solutions on Microsoft Azure
Improve the accuracy of your ML models through automatic data featurization and model training
Increase productivity in your organization by using artificial intelligence to solve common problems

Book Description

Automated Machine Learning with Microsoft Azure will teach you how to build high-performing, accurate machine learning models in record time. It will equip you with the knowledge and skills to easily harness the power of artificial intelligence and increase the productivity and profitability of your business.

Graphical user interfaces (GUIs) enable both novices and seasoned data scientists to easily train and deploy machine learning solutions to production. Using a careful, step-by-step approach, this book will teach you how to use Azure AutoML with a GUI as well as with the AzureML Python software development kit (SDK).

First, you'll learn how to prepare data, train models, and register them to your Azure Machine Learning workspace. You'll then discover how to take those models and use them to create both automated batch solutions using machine learning pipelines and real-time scoring solutions using Azure Kubernetes Service (AKS).

Finally, you will be able to use AutoML on your own data to not only train regression, classification, and forecasting models but also use them to solve a wide variety of business problems.

By the end of this Azure book, you'll be able to show your business partners exactly how your ML models are making predictions through automatically generated charts and graphs, earning their trust and respect.

What you will learn

Understand how to train classification, regression, and forecasting ML algorithms with Azure AutoML
Prepare data for Azure AutoML to ensure smooth model training and deployment
Adjust AutoML configuration settings to make your models as accurate as possible
Determine when to use a batch-scoring solution versus a real-time scoring solution
Productionalize your AutoML and discover how to quickly deliver value
Create real-time scoring solutions with AutoML and Azure Kubernetes Service
Train a large number of AutoML models at once using the AzureML Python SDK

Who this book is for

Data scientists, aspiring data scientists, machine learning engineers, or anyone interested in applying artificial intelligence or machine learning in their business will find this machine learning book useful.

You need to have beginner-level knowledge of artificial intelligence and a technical background in computer science, statistics, or information technology before getting started. Familiarity with Python will help you implement the more advanced features found in the chapters, but even data analysts and SQL experts will be able to train ML models after finishing this book.

Information

Publisher

Packt Publishing

Year

2021

ISBN

9781800561977

Topic

Computer Science

Subtopic

Data Modelling & Design

Section 1: AutoML Explained – Why, What, and How

In this first part, you will understand why you should use AutoML and how it solves common industry problems. You will also build an AutoML solution through a UI.

This section comprises the following chapters:

Chapter 1, Introducing AutoML
Chapter 2, Getting Started with Azure Machine Learning Service
Chapter 3, Training Your First AutoML Model

Chapter 1: Introducing AutoML

AI is everywhere. From recommending products on your favorite websites to optimizing the supply chains of Fortune 500 companies to forecasting demand for shops of all sizes, AI has emerged as a dominant force. Yet, as AI becomes more and more prevalent in the workplace, a worrisome trend has emerged: most AI projects fail.

Failure occurs for a variety of technical and non-technical reasons. Sometimes, it's because the AI model performs poorly. Other times, it's due to data issues. Machine learning algorithms require reliable, accurate, timely data, and sometimes your data fails to meet those standards. When data isn't the issue and your model performs well, failure usually occurs because end users simply do not trust AI to guide their decision making.

For every worrisome trend, however, there is a promising solution. Microsoft and a host of other companies have developed automated machine learning (AutoML) to increase the success of your AI projects. In this book, you will learn how to use AutoML on Microsoft's Azure cloud platform. This book will teach you how to boost your productivity if you are a data scientist. If you are not a data scientist, this book will enable you to build machine learning models and harness the power of AI.

In this chapter, we will begin by defining AI and machine learning and explaining why companies have had such trouble seeing a return on their investment in AI. Then, we will take a closer look at how data scientists work and why that workflow is inherently slow and mistake-prone from a project success perspective. Finally, we will conclude the chapter by introducing AutoML as the key to unlocking productivity in machine learning projects.

In this chapter, we will cover the following topics:

Explaining data science's ROI problem
Analyzing why AI projects fail slowly
Solving the ROI problem with AutoML

Explaining data science's ROI problem

Data scientist was consistently ranked the best job in America by Forbes Magazine from 2016 to 2019, yet the best job in America has not produced the best results for the companies that employ data scientists. According to VentureBeat, 87% of data science projects fail to make it into production. This means that most of the work that data scientists perform does not impact their employer in any meaningful way.

By itself, this is not a problem. If data scientists were cheap and plentiful, companies would still see a return on their investment. However, this is simply not the case. According to the 2020 LinkedIn Salary stats, data scientists earn total compensation of around $111,000 across all career levels in the United States. It's also very easy for them to find jobs.

Burtch Works, a United States-based executive recruiting firm, reports that, as of 2018, data scientists stayed at their job for only 2.6 years on average, and 17.6% of all data scientists changed jobs that year. Data scientists are expensive and hard to keep.

Likewise, if data scientists worked fast, a return on investment (ROI) would still be possible even though 87% of their projects fail to have an impact. Failing fast means that a department attempts so many projects that enough of them still make it into production for the department to be successful. Failing slow means that the department fails to deliver.
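The arithmetic behind failing fast versus failing slow can be sketched in a few lines of Python. This is only an illustration: the 87% failure rate is the figure cited above, while the two teams and their project counts per year are hypothetical assumptions chosen to make the contrast visible.

```python
# Share of projects that never reach production (figure cited in the text).
FAILURE_RATE = 0.87


def expected_production_models(projects_per_year: float, years: float = 1.0) -> float:
    """Expected number of projects that make it into production."""
    return projects_per_year * years * (1 - FAILURE_RATE)


# Hypothetical team that fails fast: one project attempt per month.
fast = expected_production_models(projects_per_year=12)

# Hypothetical team that fails slow: one project attempt every six months.
slow = expected_production_models(projects_per_year=2)

print(f"fail fast: {fast:.2f} production models per year")
print(f"fail slow: {slow:.2f} production models per year")
```

Under these assumed numbers, the fast team can expect more than one model in production each year, while the slow team can expect to wait roughly four years for its first.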

Unfortunately, most data science departments fail slow. To understand why, you must first understand what machine learning is, how it differs from traditional software development, and the five steps common to all machine learning projects.

Defining machine learning, data science, and AI

Machine learning is the process of training statistical models to make predictions using data. It is a category within AI. AI is defined as computer programs that perform cognitive tasks such as decision making that would normally be performed by a human. Data science is a career field that combines computer science, machine learning, and other statistical techniques to solve business problems.

Data scientists use a variety of machine learning algorithms to solve business problems. Machine learning algorithms are best thought of as a defined set of mathematical computations to perform on data to make predictions. Common applications of machine learning that you may experience in everyday life include detecting when your credit card was used to make a fraudulent transaction, determining how much money you should be given when applying for a loan, and figuring out which items are suggested to you when shopping online. All of these decisions, big and small, are determined mechanistically through machine learning.

There are many types of algorithms, but it's not important for you to know them all. Random Forest, XGBoost, LightGBM, deep learning, CART decision trees, multilinear regression, naïve Bayes, logistic regression, and k-nearest neighbors are all examples of machine learning algorithms. These algorithms are powerful because they work by learning patterns in data that would be too complex or subtle for any human being to detect on their own.

What is important for you to know is the difference between supervised learning and unsupervised learning. Supervised learning uses historical, labeled data to make future predictions.

Imagine you are a restaurant manager and you want to forecast how much money you will make next month by running an advertising campaign. To accomplish this with machine learning, you would want to collect all of your sales data from previous years, including the results of previous campaigns. Since you have past results and are using them to make predictions, this is an example of supervised learning.

Unsupervised learning simply groups like data points together. It's useful when you have a lot of information about your customers and would like to group them into buckets so that you can advertise to them in a more targeted fashion. Azure AutoML, however, is strictly for supervised learning tasks. Thus, you always need to have past results available in your data when creating new AutoML models.
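The difference between the two approaches can be made concrete with a minimal sketch in plain Python. It does not use Azure AutoML or any machine learning library; the data values and the helper names fit_line and two_means are hypothetical, and the two hand-written routines (a least-squares line and a two-group clustering loop) merely stand in for the far more capable algorithms listed earlier.

```python
from statistics import mean

# --- Supervised learning: historical inputs WITH known outcomes (labels) ---
# Hypothetical restaurant data: advertising spend and the sales that followed.
ad_spend = [1000, 2000, 3000, 4000]
sales = [12000, 14000, 16000, 18000]


def fit_line(xs, ys):
    """Ordinary least squares for a single input; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(ad_spend, sales)
forecast = slope * 5000 + intercept  # predicted sales for a new campaign budget

# --- Unsupervised learning: no labels, just group similar data points ---
# Hypothetical monthly spend per customer.
customer_spend = [20, 25, 30, 200, 220, 250]


def two_means(values, iterations=10):
    """Split values into two buckets around two repeatedly updated centers."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its bucket (keep it if the bucket is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


low_spenders, high_spenders = two_means(customer_spend)

print(f"forecast sales: {forecast:.0f}")
print(f"buckets: {low_spenders} and {high_spenders}")
```

The supervised half needs the sales column to learn from, which is why past results must be present in your data; the unsupervised half only ever sees the spend values.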

Machine learning versus traditional software

Traditional software development and machine learning development differ tremendously. Programmers are used to creating software that takes in input and delivers output based on explicitly defined rules. Data scientists, on the other hand, collect the desired output first before making a program. They then use this output data along with input data to create a program that learns how to predict output from input.

For example, maybe you would like to build an algorithm predicting how many car accidents would occur in a given city on a given day. First, you would begin by collecting historical data such as the number of car crashes (the desired output) and any data that you guess would be useful in predicting that number (input data). Weather data, day of the week, amount of traffic, and data related to city events can all be used as input.

Once you collect the data, your next step is to create a statistical program that finds hidden patterns between the input and output data; this is called model training. After you train your model, your next step is to set up an inference program that uses new input data to predict how many car accidents will happen that day using your trained model.
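As a rough sketch of this workflow, the following Python contrasts an explicitly coded rule with a model that is trained on historical data and then used for inference. Everything here is hypothetical: the crash counts, the function names, and the "model" itself, which simply averages past outcomes for each input combination rather than using a real machine learning algorithm.

```python
from collections import defaultdict
from statistics import mean


# Traditional software: the programmer writes the rule explicitly.
def rule_based_estimate(weather: str) -> float:
    return 8.0 if weather == "rain" else 3.0  # hand-picked, hypothetical numbers


# Machine learning: collect the desired output first, then learn the mapping.
# Hypothetical historical records: (weather, day type, crashes observed).
history = [
    ("rain", "weekday", 9),
    ("rain", "weekday", 11),
    ("rain", "weekend", 6),
    ("clear", "weekday", 4),
    ("clear", "weekday", 2),
    ("clear", "weekend", 1),
]


def train(records):
    """Model training: learn the average crash count per input combination."""
    grouped = defaultdict(list)
    for weather, day_type, crashes in records:
        grouped[(weather, day_type)].append(crashes)
    averages = {key: mean(values) for key, values in grouped.items()}
    overall = mean(crashes for _, _, crashes in records)
    return averages, overall


def predict(model, weather, day_type):
    """Inference: apply the trained model to new input data."""
    averages, overall = model
    # Fall back to the overall average for combinations never seen in training.
    return averages.get((weather, day_type), overall)


model = train(history)
print(predict(model, "rain", "weekday"))
print(predict(model, "snow", "weekday"))
```

Note that the rule-based function never changes unless a programmer edits it, whereas the trained model changes whenever it is retrained on new historical records.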

Another major difference is that, with machine learning, you never know what data you're going to need to create your solution before you try it out, and you never know what you're going to get until you build a solution. Since data scientists never know what data they need to solve any given problem, they need to ask for advice from business experts and use their intuition to identify the right data to collect.

These differences are important because successful machine learning projects look very different from successful traditional software projects; confusing the two leads to failed projects. Managers with an IT background but lacking a data science background often try to follow methods and timelines inappropriate for a machine learning project.

Frankly, it's unrealistic to assign hard timelines to a process where you don't know what data you will need or what algorithms will work, and many data science projects fail simply because they weren't given adequate time or support. There is, however, a recipe for success.

The five steps to machine learning success

Now that we know what machine learning is and how it differs from traditional software development, the next step is to learn how a typical machine learning project is structured. There are many ways you could divide the process, but there are roughly five par...