1 Deep Learning: A State-of-the-Art Approach to Artificial Intelligence
Soumyajit Goswami IBM India Private Limited, Salt Lake, Sector V, Kolkata, West Bengal, India
Abstract
This chapter presents the cloud platforms currently offered by different vendors. IBM's machine learning (ML) platform, "IBM Watson Studio" (formerly "Data Science Experience"), is taken as the case study. The chapter gives an overview of artificial intelligence, ML, and deep learning (DL) and explains how they relate to one another. Popular DL architectures are also discussed, with an elementary comparison.
Keywords: artificial intelligence, machine learning, deep learning, IBM Watson Studio (formerly Data Science Experience, or DSX)
1.1 Introduction
Deep learning (DL), a subfield of artificial intelligence (AI), is among the most promising areas for both research and industry. Although DL is a recent topic, several technology giants already use it to meet their needs. Examples include Google's voice and image recognition algorithms [1]; Netflix and Amazon, which use it to decide which video a person is likely to watch or which product to purchase in the near future [2]; MIT researchers, who use it for forecasting [3]; and Facebook, which uses it to predict future actions for advertisers [4]. UCLA researchers have built an advanced microscope that produces a high-dimensional dataset used to train a DL network to identify cancer cells in tissue samples [5]. In short, DL is now used almost everywhere automation is involved.
The next section of this chapter discusses the relationship between DL, machine learning (ML), and AI. Section 1.3 gives a brief introduction to the artificial neural network (ANN), its classification, and its learning techniques. Under classification, feedforward neural networks (FFNNs) and recurrent neural networks (RNNs) are discussed along with their uses; under learning techniques, supervised, unsupervised, and reinforcement learning are briefly considered. Section 1.4 is devoted to DL: it explains why the term "deep" is used and identifies the points that make DL state of the art. In Section 1.6, the sigmoid, hyperbolic tangent, rectified linear unit (ReLU), and softmax activation functions are described in detail. Many DL architectures are available in the literature; a few of them have become very popular and offer high accuracy, resulting in better performance. The concepts of restricted Boltzmann machine (RBM), deep belief network (DBN), autoencoder (AE), and convolutional neural network (CNN) are discussed in this section. In Section 1.6, ML platforms from different organizations are presented. All of them provide cloud infrastructure with high-performance graphics processing units (GPUs) to speed up the training of DL networks on huge volumes of data, which reduces training time from weeks to hours. The last section of this chapter describes the steps for using the IBM ML platform, IBM Watson Studio (formerly Data Science Experience, or DSX).
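As a preview of the four activation functions named in this overview, the sketch below writes each one out using their standard textbook definitions. It uses only the Python standard library; the function names are illustrative choices, not taken from the chapter or from any particular DL framework.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real number into the interval (0, 1)."""
    # Branch on the sign of x so that math.exp never receives a large
    # positive argument, which would overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent: maps any real number into the interval (-1, 1)."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, clips negatives to 0."""
    return max(0.0, x)


def softmax(xs: List[float]) -> List[float]:
    """Softmax: turns a list of scores into probabilities that sum to 1."""
    # Subtracting the maximum score leaves the result unchanged but keeps
    # the exponentials in a numerically safe range.
    m = max(xs)
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    for value in (-2.0, 0.0, 2.0):
        print(value, sigmoid(value), tanh(value), relu(value))
    print(softmax([1.0, 2.0, 3.0]))
```

Sigmoid, tanh, and ReLU act on one value at a time, whereas softmax acts on a whole vector of scores, which is why it is typically used at the output layer of a classifier.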
1.2 AI versus ML versus DL
AI is a subfield of computer science that deals with the simulation of intelligent behavior in computers. An AI system is a computer system that can accomplish tasks that usually require human intelligence. Generally, a rule engine drives the AI system, and a good AI system should have an intelligent rule engine based on a series of meaningful IF-THEN statements. Since the 1950s, AI has been used successfully in visual perception, speech recognition, decision-making, and translation between languages. The terms AI and ML are often used interchangeably, especially in the realm of big data.
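The idea of a rule engine built from IF-THEN statements can be illustrated with a minimal sketch. Everything here is hypothetical and chosen only for illustration: the `Rule` class, the `infer` function, and the thermostat-style facts do not come from the chapter or from any real rule-engine product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, float]


@dataclass
class Rule:
    """One IF-THEN rule: IF condition(facts) holds THEN emit the action."""
    name: str
    condition: Callable[[Facts], bool]  # the IF part
    action: str                         # the THEN part


def infer(facts: Facts, rules: List[Rule]) -> List[str]:
    """Return the actions of every rule whose condition holds for the facts."""
    return [rule.action for rule in rules if rule.condition(facts)]


# Hypothetical rules for a simple climate controller.
RULES = [
    Rule("too_cold", lambda f: f["temperature_c"] < 18.0, "turn on heating"),
    Rule("too_hot", lambda f: f["temperature_c"] > 26.0, "turn on cooling"),
    Rule("too_humid", lambda f: f["humidity_pct"] > 70.0, "turn on dehumidifier"),
]

if __name__ == "__main__":
    print(infer({"temperature_c": 15.0, "humidity_pct": 80.0}, RULES))
```

In such a system the knowledge is written by hand as explicit rules. This is the contrast with ML, where the mapping from inputs to outputs is learned from data rather than hand-coded.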
As shown in Figure 1.1, DL is a subcategory of ML, and ML is in turn a subcategory of AI. In other words, all DL i...