eBook - ePub

Machine Learning with Scala Quick Start Guide

Name: Machine Learning with Scala Quick Start Guide
ISBN: 9781789345414

Leverage popular machine learning algorithms and techniques and implement them in Scala

Md. Rezaul Karim

220 pages
English
ePUB (mobile friendly)
Available on iOS & Android

About this book

Supervised and unsupervised machine learning made easy in Scala with this quick-start guide.

Key Features

Construct and deploy machine learning systems that learn from your data and give accurate predictions
Unleash the power of Spark ML along with popular machine learning algorithms to solve complex tasks in Scala.
Solve hands-on problems by combining popular neural network architectures such as LSTM and CNN using Scala with the Deeplearning4j library

Book Description

Scala integrates object-oriented and functional programming concepts in a way that makes it easy to build scalable and complex big data applications. This book is a handy guide for machine learning developers and data scientists who want to develop and train effective machine learning models in Scala.

The book starts with an introduction to machine learning, while covering deep learning and machine learning basics. It then explains how to use Scala-based ML libraries to solve classification and regression problems using linear regression, generalized linear regression, logistic regression, support vector machine, and Naïve Bayes algorithms.

It also covers tree-based ensemble techniques for solving both classification and regression problems. Moving ahead, it covers unsupervised learning techniques, such as dimensionality reduction, clustering, and recommender systems. Finally, it provides a brief overview of deep learning using a real-life example in Scala.

What you will learn

Get acquainted with JVM-based machine learning libraries for Scala such as Spark ML and Deeplearning4j
Learn RDDs, DataFrame, and Spark SQL for analyzing structured and unstructured data
Understand supervised and unsupervised learning techniques with best practices and pitfalls
Learn classification and regression analysis with linear regression, logistic regression, Naïve Bayes, support vector machine, and tree-based ensemble techniques
Learn effective ways of clustering analysis with dimensionality reduction techniques
Learn recommender systems with the collaborative filtering approach
Delve into deep learning and neural network architectures

Who this book is for

This book is for machine learning developers looking to train machine learning models in Scala without spending too much time and effort. Some fundamental knowledge of Scala programming and some basics of statistics and linear algebra are all you need to get started with this book.

Information

Topic: Computer Science
Subtopic: Computer Science General

Introduction to Deep Learning with Scala

From Chapter 2, Scala for Regression Analysis, through Chapter 6, Scala for Recommender System, we learned about linear and classic machine learning (ML) algorithms through real-life examples. In this chapter, we will explain some basic concepts of deep learning (DL), one of the emerging branches of ML. We will briefly discuss some of the most well-known and widely used neural network architectures and DL frameworks and libraries.

Finally, we will use the Long Short-Term Memory (LSTM) architecture for cancer type classification from a very high-dimensional dataset curated from The Cancer Genome Atlas (TCGA). The following topics will be covered in this chapter:

DL versus ML
DL and neural networks
Deep neural network architectures
DL frameworks
Getting started with learning

Technical requirements

Make sure Scala 2.11.x and Java 1.8.x are installed and configured on your machine.
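One quick way to confirm these requirements is to print the versions that the running program actually sees. The following is a minimal sketch, not part of the book's code; the object name EnvCheck and the helper matches are hypothetical names introduced here for illustration.

```scala
object EnvCheck {
  // Version of the Scala library on the classpath, for example "2.11.12"
  def scalaVersion: String = scala.util.Properties.versionNumberString

  // Version of the running JVM, for example "1.8.0_292"
  def javaVersion: String = System.getProperty("java.version")

  // True when the version string starts with the expected prefix
  def matches(version: String, prefix: String): Boolean =
    version.startsWith(prefix)

  def main(args: Array[String]): Unit = {
    val scalaOk = matches(scalaVersion, "2.11.")
    val javaOk = matches(javaVersion, "1.8.")
    println("Scala " + scalaVersion + " (expected 2.11.x): " + scalaOk)
    println("Java " + javaVersion + " (expected 1.8.x): " + javaOk)
  }
}
```

Running EnvCheck inside the same build or IDE configuration that will run the chapter's code shows whether that environment picks up the expected Scala and Java versions.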

The code files of this chapter can be found on GitHub:

https://github.com/PacktPublishing/Machine-Learning-with-Scala-Quick-Start-Guide/tree/master/Chapter07

Check out the following video to see the Code in Action:
http://bit.ly/2vwrxzb

DL versus ML

Simple ML methods that worked for small-scale data analysis become less effective as datasets grow large and high-dimensional. This is where DL comes in: a branch of ML based on a set of algorithms that attempt to model high-level abstractions in data. Ian Goodfellow et al. (Deep Learning, MIT Press, 2016) defined DL as follows:

"Deep learning is a particular kind of machine learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones."

Like an ML model, a DL model takes in an input, X, and learns high-level abstractions or patterns from it to predict an output, Y. For example, based on the stock prices of the past week, a DL model can predict the stock price for the next day. During training on such historical stock data, a DL model tries to minimize the difference between its predictions and the actual values. In this way, it tries to generalize to inputs that it hasn't seen before and make predictions on test data.
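The idea of minimizing the difference between predictions and actual values can be sketched in a few lines of plain Scala. This is only an illustration under simplifying assumptions: it fits a single linear unit (y = w * x + b) with batch gradient descent on the mean squared error, whereas a real DL model stacks many nonlinear layers. The object name MinimizeError, its methods, and the learning rate are hypothetical choices, not taken from the book.

```scala
object MinimizeError {
  // Mean squared error between predictions w * x + b and targets y
  def mse(w: Double, b: Double, xs: Array[Double], ys: Array[Double]): Double =
    xs.zip(ys).map { case (x, y) => math.pow(w * x + b - y, 2) }.sum / xs.length

  // One gradient-descent step on the MSE; returns the updated (w, b)
  def step(w: Double, b: Double, xs: Array[Double], ys: Array[Double],
           lr: Double): (Double, Double) = {
    val n = xs.length
    // Signed prediction errors for every training example
    val errors = xs.zip(ys).map { case (x, y) => w * x + b - y }
    val gradW = 2.0 / n * errors.zip(xs).map { case (e, x) => e * x }.sum
    val gradB = 2.0 / n * errors.sum
    (w - lr * gradW, b - lr * gradB)
  }

  // Repeats the update for a fixed number of epochs, starting from w = 0, b = 0
  def train(xs: Array[Double], ys: Array[Double],
            lr: Double, epochs: Int): (Double, Double) =
    (1 to epochs).foldLeft((0.0, 0.0)) { case ((w, b), _) =>
      step(w, b, xs, ys, lr)
    }
}
```

For instance, calling MinimizeError.train on inputs 0 to 4 with targets generated by y = 2x + 1 should drive w toward 2 and b toward 1, and the value returned by mse should shrink as training proceeds.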

Now, you might be wondering: if an ML model can do the same tasks, why do we need DL? Well, DL models tend to perform well with large amounts of data, whereas older ML models stop improving after a certain point. The core concept of DL, the artificial neural network (ANN), is inspired by the structure and function of the brain. Being at the core of DL, ANNs help you learn the associations between sets of inputs and outputs in order to make more robust and accurate predictions. However, DL is not limited to ANNs; there have been many theoretical advances, software stacks, and hardware improvements that bring DL to the masses. Let's look at an example. Suppose we want to develop a predictive analytics model, such as an animal recognizer, where our system has to resolve two problems:

To classify whether an image represents a cat or a dog
To cluster images of dogs and cats

If we solve the first problem using a typical ML method, we must define the facial features (ears, eyes, whiskers, and so on) and write a method to identify which features (typically nonlinear) are more important when classifying a particular animal.

At the same time, however, we cannot address the second problem, because classical ML algorithms for clustering images (such as k-means) cannot handle nonlinear features. Take a look at the following diagram, which shows the workflow that we would follow to classify whether a given image is of a cat:

DL algorithms take these two problems one step further: they determine which features are most important for classification or clustering and extract them automatically. In contrast, when using a classical ML algorithm, we would have to provide the features manually.

A DL algorithm would take more sophisticated steps instead. For example, it would first identify the edges that are most relevant when clustering cats or dogs. It would then try to find various combinations of shapes and edges hierarchically. This step is called extract, transform, and load (ETL). Then, after several iterations, hierarchical identification of complex concepts and features would be carried out. Based on the identified features, the DL algorithm would then decide which of these features are most significant for classifying the animal. This step is known as feature extraction. Finally, it would take out the label column and perform unsupervised training using autoencoders (AEs) to extract the latent features, which are passed to k-means for clustering. The clustering assignment hardening loss (CAH loss) and the reconstruction loss are then jointly optimized toward an optimal clustering assignment.
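The text names the two losses without spelling them out. As one possible formulation, borrowed from DEC-style deep clustering and not confirmed by the source, the joint objective can be written as follows. Here f and g denote the encoder and decoder of the autoencoder, z_i is the latent feature of input x_i, mu_j is the j-th cluster centroid, and gamma is a hypothetical weight that balances the two terms; all of these symbols are assumptions introduced for illustration.

```latex
\begin{aligned}
L_{\text{rec}} &= \frac{1}{n}\sum_{i=1}^{n} \lVert x_i - g(f(x_i)) \rVert^2,
  \qquad z_i = f(x_i) \\
q_{ij} &= \frac{\left(1+\lVert z_i-\mu_j\rVert^2\right)^{-1}}
               {\sum_{j'}\left(1+\lVert z_i-\mu_{j'}\rVert^2\right)^{-1}} \\
p_{ij} &= \frac{q_{ij}^2 / \sum_{i'} q_{i'j}}
               {\sum_{j'} \left( q_{ij'}^2 / \sum_{i'} q_{i'j'} \right)} \\
L_{\text{CAH}} &= \mathrm{KL}(P \,\|\, Q)
  = \sum_{i}\sum_{j} p_{ij}\,\log\frac{p_{ij}}{q_{ij}} \\
L &= L_{\text{rec}} + \gamma\, L_{\text{CAH}}
\end{aligned}
```

In this sketch, q_ij is a soft assignment of point i to cluster j, and p_ij is a sharpened target distribution. Minimizing the KL term pushes the soft assignments toward harder ones, while the reconstruction term keeps the latent features faithful to the input.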

In practice, however, a DL algorithm is fed a raw image representation. It doesn't see an image as we see it, because it only knows the position of each pixel and its color. The image is divided into various layers of analysis. At a lower level, the software analyzes, for example, a grid of a few pixels with the task of detecting a type of color or various nuances. If it finds something, it informs the next level, which then checks whether or not that color belongs to a larger form, such as a line.
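The low-level step described above, in which a small grid of pixels is inspected for a local pattern, resembles what a convolutional filter does. The following is a minimal sketch in plain Scala, assuming a grayscale image stored as a 2D array of intensities. The object name EdgeFilter and the hand-written 3x3 kernel are hypothetical; in a trained network, the kernel values would be learned rather than fixed.

```scala
object EdgeFilter {
  type Image = Array[Array[Double]]

  // Slides a square kernel over the image (no padding, stride 1) and returns
  // the weighted sum at every position. The kernel is not flipped, which is
  // the convention DL libraries usually call "convolution".
  def convolve(image: Image, kernel: Image): Image = {
    val k = kernel.length
    val outRows = image.length - k + 1
    val outCols = image(0).length - k + 1
    Array.tabulate(outRows, outCols) { (r, c) =>
      (for (i <- 0 until k; j <- 0 until k)
        yield image(r + i)(c + j) * kernel(i)(j)).sum
    }
  }

  // Hypothetical 3x3 kernel that responds to vertical dark-to-bright edges
  val verticalEdge: Image = Array(
    Array(-1.0, 0.0, 1.0),
    Array(-1.0, 0.0, 1.0),
    Array(-1.0, 0.0, 1.0)
  )
}
```

Applied to an image whose left columns are dark (0.0) and right columns are bright (1.0), the filter gives nonzero responses where the window straddles the boundary and zero where the window is uniform. A higher layer could then combine such responses into larger forms, such as lines.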

The process continues to the upper levels until the algorithm understands what is shown in the following diagram:

Although dog versus cat is an example of a very simple classifier, software t...

Table of contents

Title Page
Copyright and Credits
About Packt
Contributors
Preface
Introduction to Machine Learning with Scala
Scala for Regression Analysis
Scala for Learning Classification
Scala for Tree-Based Ensemble Techniques
Scala for Dimensionality Reduction and Clustering
Scala for Recommender System
Introduction to Deep Learning with Scala
Other Books You May Enjoy