Section 1: Introduction to Azure Machine Learning
In this section, we will learn about the history of Machine Learning (ML), the scenarios in which to apply ML, the statistical knowledge necessary, and the steps and components required for running a custom end-to-end ML project. We will have a look at the available Azure services for ML and we will learn about the scenarios they are best suited for. Finally, we will introduce Azure Machine Learning, the main service we will utilize throughout the rest of the book. We will understand how to deploy this service and use it to run our first ML experiments in the cloud.
This section comprises the following chapters:
- Chapter 1, Understanding the End-to-End Machine Learning Process
- Chapter 2, Choosing the Right Machine Learning Service in Azure
- Chapter 3, Preparing the Azure Machine Learning Workspace
Chapter 1: Understanding the End-to-End Machine Learning Process
Welcome to the second edition of Mastering Azure Machine Learning. In this first chapter, we want to give you an understanding of what kinds of problems require the use of machine learning (ML), how the full ML process unfolds, and what knowledge is required to navigate this vast terrain. You can view it as an introduction to ML and an overview of the book itself, where for most topics we will provide you with a reference to upcoming chapters so that you can easily find your way around the book.
In the first section, we will ask ourselves what ML is, when we should use it, and where it comes from. In addition, we will reflect on how ML is just another form of programming.
In the second section, we will lay the mathematical groundwork you require to process data, and we will understand that the data you work with probably cannot be fully trusted. Further, we will look at different classes of ML algorithms, how they are defined, and how we can measure the performance of a trained model.
Finally, in the third section, we will have a look at the end-to-end process of an ML project. We will understand where to get data from, how to preprocess data, how to choose a fitting model, and how to deploy this model into production environments. This will also get us into the topic of ML operations, known as MLOps.
In this chapter, we will cover the following topics:
- Grasping the idea behind ML
- Understanding the mathematical basis for statistical analysis and ML modeling
- Discovering the end-to-end ML process
Grasping the idea behind ML
The terms artificial intelligence (AI) and, to a lesser extent, ML are omnipresent in today's world. However, a lot of what is sold under the term AI is nothing more than a containerized ML solution, and to make matters worse, ML is sometimes used unnecessarily to solve something extremely simple.
Therefore, in this first section, let's understand the class of problems ML tries to solve, in which scenarios to use ML, and when not to use it.
Problems and scenarios requiring ML
If you look for a definition of ML, you will often find a description such as this: It is the study of self-improving machine algorithms using data. ML is basically described as an algorithm we are trying to evolve, which in turn can be seen as one complex mathematical function.
Any computer process today follows the simple structure of the input-process-output (IPO) model. We define allowed inputs, we define a process working with those inputs, and we define an output through the type of results the process will show us. A simple example would be a word processing application, where every keystroke will result in a letter shown as the output on the screen. A completely different process might run in parallel to that one, having a time-based trigger to store the text file periodically to a hard disk.
All these processes or algorithms have one thing in common: they were manually written by someone using a high-level programming language. It is clear which actions need to happen when someone presses a key in a word processing application. Therefore, we can easily build a process in which we specify which input values should produce which output values.
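To make the IPO idea concrete, here is a minimal sketch of such a hand-written process. The function name `process_keystroke` and the key names `BACKSPACE` and `ENTER` are assumptions made for this illustration and do not belong to any real word processor.

```python
def process_keystroke(document: str, key: str) -> str:
    """Process step: turn one keystroke (input) into the new document text (output)."""
    # Every rule below was decided and written by a programmer in advance.
    if key == "BACKSPACE":
        return document[:-1]      # remove the last character, if any
    if key == "ENTER":
        return document + "\n"    # start a new line
    return document + key         # any other key is shown as a letter


# Input: a sequence of keystrokes; output: the text shown on the screen.
text = ""
for key in ["H", "i", "!", "BACKSPACE"]:
    text = process_keystroke(text, key)
```

Because every mapping from input to output is known beforehand, a few explicit rules are enough, and nothing has to be learned from data.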
Now, let's look at a more complex problem. Imagine we have a picture of a dog and want an application to just say: "This is a dog." This sounds simple enough, as we know the input (a picture of a dog) and the output value (dog). Unfortunately, our brain (our own machine) is far superior to the machines we have built, especially when it comes to pattern recognition. For a computer, a picture is just a grid of pixels, each containing three color channels defined by an 8-bit or 10-bit value. Therefore, to the computer, an image is just a collection of pixel vectors, so in essence, a lot of numbers.
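As a small illustration of what "a lot of numbers" means, the sketch below builds a made-up 2 x 2 image using 8-bit color channels. The pixel values are invented for this example and only plain Python lists are used.

```python
# A hypothetical 2 x 2 RGB image: each pixel is a (red, green, blue) vector,
# and each 8-bit channel holds a value between 0 and 255.
image = [
    [(255, 0, 0), (0, 255, 0)],        # first row: a red and a green pixel
    [(0, 0, 255), (255, 255, 255)],    # second row: a blue and a white pixel
]

height = len(image)
width = len(image[0])

# Flatten the image into the plain list of numbers the computer works with.
flat = [channel for row in image for pixel in row for channel in pixel]

print(f"{width} x {height} pixels -> {len(flat)} numbers")
```

Even this tiny image turns into 12 numbers. A typical photo contains millions of them, and none of them says "dog" on its own.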
We could start writing an algorithm by hand that clusters groups of pixels and looks for edges and points of interest, and eventually, with a lot of effort, we might end up with an algorithm that finds dogs in pictures. That is when we get a picture of a cat.
It should be clear to you by now that we might run into a problem. Therefore, let's define one problem that ML solves, as follows:
Building the desired algorithm for a required solution programmatically is either extremely time-consuming, completely unfeasible, or impossible.
Taking this description, we can surely define good scenarios to use ML, be it finding objects in images and videos or understanding voices and extracting their intent from audio files. We will further understand what building ML solutions entails throughout this chapter (and the rest of the book, for that matter), but to make a simple statement, let's just acknowledge that building an ML model is also a time-consuming matter.
In that vein, it should be of utmost importance to avoid ML if we have the chance to do so. This might be an obvious statement, but as we (the authors) can attest, it is not obvious to a lot of people. We have seen projects realized with ML where the output could be defined with a simple combination of if statements given some input vectors. In such scenarios, a solution could be obtained with a couple of hundred lines of code. Instead, months were spent training and testing an ML algorithm, costing a lot of time and resources.
An example of this would be a company wanting to predict fraud (stolen money) committed by their own employees in a retail store. You might have heard that predicting fraud is a typical scenario for ML. Here, it was not necessary to use ML, as the company already knew the influencing factors (length of time the cash register was open, error codes on return receipts, and so on) and therefore wanted to be alerted when certain combinations of these factors occurred. As they knew the factors already, they could have just written the code and been done with it. But what does this scenario tell us about ML?
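A hand-written rule of the kind described above could look roughly like the following sketch. The threshold, the error codes, and all names (`ReturnReceipt`, `should_alert`, and so on) are hypothetical and chosen only for illustration; the actual rules of the company are not given here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values; in practice these would come from the people who know the business.
MAX_REGISTER_OPEN_SECONDS = 60.0
SUSPICIOUS_RETURN_CODES = {"E17", "E42"}


@dataclass
class ReturnReceipt:
    seconds_register_open: float   # how long the register stayed open
    error_code: Optional[str]      # error code on the return receipt, if any


def should_alert(receipt: ReturnReceipt) -> bool:
    """Raise an alert when a known combination of factors occurs."""
    open_too_long = receipt.seconds_register_open > MAX_REGISTER_OPEN_SECONDS
    suspicious_code = receipt.error_code in SUSPICIOUS_RETURN_CODES
    return open_too_long and suspicious_code


# Usage: only the first receipt matches the combination of factors.
receipts = [ReturnReceipt(95.0, "E42"), ReturnReceipt(95.0, None), ReturnReceipt(10.0, "E42")]
alerts = [should_alert(receipt) for receipt in receipts]
```

Since the influencing factors are already known, the whole decision fits into two comparisons, and no model has to be trained.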
So far, we have looked at ML as a solution to solve a problem that, in essen...