Chapter 1: Introduction to Machine Learning
Introduction
Supervised Learning
Unsupervised Learning
Semisupervised Learning and Reinforcement Learning
Supervised Learning Predictions
Decision Prediction
Ranking Prediction
Estimation Prediction
Model Building and Selection
Model Complexity
Introducing Model Studio
Demo 1.1: Creating a Project and Loading Data
Model Studio: Analysis Elements
Demo 1.2: Building a Pipeline from a Basic Template
Quiz
Introduction
There are two main types of machine learning methods: supervised learning and unsupervised learning.
Supervised Learning
Supervised learning (also known as predictive modeling) starts with a training data set. The observations in a training data set are known as training cases (also called examples, instances, or records). The variables are called inputs (also called predictors, features, explanatory variables, or independent variables) and targets (also called responses, outcomes, or dependent variables). The learning algorithm receives a set of inputs along with the corresponding correct outputs, or targets. It learns by comparing its actual output with the correct outputs to find errors, and then modifies the model accordingly. Using methods such as classification, regression, and gradient boosting, supervised learning applies the patterns that it has learned to predict the target (label) values for additional, unlabeled data. In other words, the purpose of the training data is to generate a predictive model. The predictive model is a concise representation of the association between the inputs and the target variables.
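The learning loop described above (produce an output, compare it with the correct target, adjust the model) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not how Model Studio trains models: the training cases, the `train` and `predict` helpers, the learning rate, and the choice of a one-input linear model fitted by gradient descent are all assumptions made for the example.

```python
# Hypothetical training data: each training case pairs one input with its known target.
# The targets follow target = 2 * input + 1 exactly, so the fitted model can be checked.
training_cases = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def train(cases, learning_rate=0.05, epochs=2000):
    """Fit target ~ weight * input + bias by repeatedly correcting errors."""
    weight, bias = 0.0, 0.0
    n = len(cases)
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for x, target in cases:
            # Error = the model's actual output minus the correct output.
            error = (weight * x + bias) - target
            grad_w += error * x
            grad_b += error
        # Modify the model in the direction that reduces the average squared error.
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def predict(model, x):
    """Score a case with the fitted model."""
    weight, bias = model
    return weight * x + bias


model = train(training_cases)
prediction = predict(model, 10.0)  # a new case whose target is not known
print(model, prediction)
```

After training, the pair `(weight, bias)` is the predictive model: a compact summary of the association between the input and the target that can be applied to cases the algorithm has never seen.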
Supervised learning is commonly used in applications where historical data predicts likely future events. For example, it can anticipate when credit card transactions are likely to be fraudulent or which insurance customer is likely to file a claim.
Unsupervised Learning
Unsupervised learning is used against data that has no historical labels. In other words, the system is not told the "right answer": there is no target data, so the algorithm must figure out what is being shown. The goal is to explore the data and find some structure or pattern. Unsupervised learning works well on transactional data. For example, it can identify segments of customers with similar attributes who can then be treated similarly in marketing campaigns. Or it can find the main attributes that separate customer segments from each other. Popular techniques include self-organizing maps, nearest-neighbor mapping, k-means clustering, and singular value decomposition. These algorithms are also used to segment text topics, recommend items, and identify data outliers.
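As a rough illustration of the customer-segmentation idea, the following sketch runs a bare-bones k-means clustering in plain Python. The customer records, the `k_means` helper, and the hand-picked starting centroids are all hypothetical; production implementations choose starting points and stopping rules more carefully.

```python
import math

# Hypothetical customer records: (annual spend in thousands, store visits per month).
# There is no target column; the algorithm sees only these inputs.
customers = [
    (1.0, 2.0), (1.5, 1.8), (1.2, 2.2),   # low spend, few visits
    (8.0, 9.0), (8.5, 9.5), (9.0, 8.8),   # high spend, many visits
]


def k_means(points, centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if not members:
                # Leave the centroid of an empty cluster where it is.
                new_centroids.append(centroids[c])
                continue
            new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
        centroids = new_centroids
    return assignments, centroids


# The starting centroids are picked by hand so that the run is deterministic.
segments, centers = k_means(customers, centroids=[customers[0], customers[3]])
print(segments, centers)
```

Each entry in `segments` is the cluster index assigned to the corresponding customer. No labels were supplied, so the grouping comes entirely from the similarity of the inputs.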
Semisupervised Learning and Reinforcement Learning
Other common methods include semisupervised learning and reinforcement learning. Semisupervised learning is used for the same kinds of applications as supervised learning, but it uses both labeled and unlabeled data for training. Typically, this is a small amount of labeled data combined with a large amount of unlabeled data (because unlabeled data is less expensive and takes less effort to acquire). This type of learning can be...