eBook - ePub

Grokking Deep Reinforcement Learning

Name: Grokking Deep Reinforcement Learning
Author: Miguel Morales

Miguel Morales,

472 pages
English
ePUB (mobile friendly)
Available on iOS & Android

eBook - ePub

Grokking Deep Reinforcement Learning

Miguel Morales,

About this book

Grokking Deep Reinforcement Learning uses engaging exercises to teach you how to build deep learning systems. This book combines annotated Python code with intuitive explanations to explore DRL techniques. You'll see how algorithms function and learn to develop your own DRL agents using evaluative feedback. Summary
We all learn through trial and error. We avoid the things that cause us to experience pain and failure. We embrace and build on the things that give us reward and success. This common pattern is the foundation of deep reinforcement learning: building machine learning systems that explore and learn based on the responses of the environment. Grokking Deep Reinforcement Learning introduces this powerful machine learning approach, using examples, illustrations, exercises, and crystal-clear teaching. You'll love the perfectly paced teaching and the clever, engaging writing style as you dig into this awesome exploration of reinforcement learning fundamentals, effective deep learning techniques, and practical applications in this emerging field. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the technology
We learn by interacting with our environment, and the rewards or punishments we experience guide our future behavior. Deep reinforcement learning brings that same natural process to artificial intelligence, analyzing results to uncover the most efficient ways forward. DRL agents can improve marketing campaigns, predict stock performance, and beat grand masters in Go and chess. About the book
Grokking Deep Reinforcement Learning uses engaging exercises to teach you how to build deep learning systems. This book combines annotated Python code with intuitive explanations to explore DRL techniques. You'll see how algorithms function and learn to develop your own DRL agents using evaluative feedback. What's inside
An introduction to reinforcement learning
DRL agents with human-like behaviors
Applying DRL to complex situations About the reader
For developers with basic deep learning experience. About the author
Miguel Morales works on reinforcement learning at Lockheed Martin and is an instructor for the Georgia Institute of Technology's Reinforcement Learning and Decision Making course. Table of Contents 1 Introduction to deep reinforcement learning 2 Mathematical foundations of reinforcement learning 3 Balancing immediate and long-term goals 4 Balancing the gathering and use of information 5 Evaluating agents' behaviors 6 Improving agents' behaviors 7 Achieving goals more effectively and efficiently 8 Introduction to value-based deep reinforcement learning 9 More stable value-based methods 10 Sample-efficient value-based methods 11 Policy-gradient and actor-critic methods 12 Advanced actor-critic methods 13 Toward artificial general intelligence

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription.

At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.

Perlego offers two plans: Essential and Complete

Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.

Both plans are available with monthly, semester, or annual billing cycles.

We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.

Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.

Yes! You can use the Perlego app on both iOS or Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app.

Yes, you can access Grokking Deep Reinforcement Learning by Miguel Morales in PDF and/or ePUB format, as well as other popular books in Computer Science & Artificial Intelligence (AI) & Semantics. We have over one million books available in our catalogue for you to explore.

Information

Publisher

Year

Print ISBN

eBook ISBN

Topic

Subtopic

Artificial Intelligence (AI) & Semantics

Index

Computer Science

1 Introduction to deep reinforcement learning

In this chapter

You will learn what deep reinforcement learning is and how it is different from other machine learning approaches.
You will learn about the recent progress in deep reinforcement learning and what it can do for a variety of problems.
You will know what to expect from this book and how to get the most out of it.

I visualize a time when we will be to robots what dogs are to humans, and I’m rooting for the machines.

— Claude Shannon Father of the information age and contributor to the field of artificial intelligence

Humans naturally pursue feelings of happiness. From picking out our meals to advancing our careers, every action we choose is derived from our drive to experience rewarding moments in life. Whether these moments are self-centered pleasures or the more generous of goals, whether they bring us immediate gratification or long-term success, they’re still our perception of how important and valuable they are. And to some extent, these moments are the reason for our existence.

Our ability to achieve these precious moments seems to be correlated with intelligence; “intelligence” is defined as the ability to acquire and apply knowledge and skills. People who are deemed by society as intelligent are capable of trading not only immediate satisfaction for long-term goals, but also a good, certain future for a possibly better, yet uncertain, one. Goals that take longer to materialize and that have unknown long-term value are usually the hardest to achieve, and those who can withstand the challenges along the way are the exception, the leaders, the intellectuals of society.

In this book, you learn about an approach, known as deep reinforcement learning, involved with creating computer programs that can achieve goals that require intelligence. In this chapter, I introduce deep reinforcement learning and give suggestions to get the most out of this book.

What is deep reinforcement learning?

Deep reinforcement learning (DRL) is a machine learning approach to artificial intelligence concerned with creating computer programs that can solve problems requiring intelligence. The distinct property of DRL programs is learning through trial and error from feedback that’s simultaneously sequential, evaluative, and sampled by leveraging powerful non-linear function approximation.

I want to unpack this definition for you one bit at a time. But, don’t get too caught up with the details because it’ll take me the whole book to get you grokking deep reinforcement learning. The following is the introduction to what you learn about in this book. As such, it’s repeated and explained in detail in the chapters ahead.

If I succeed with my goal for this book, after you complete it, you should understand this definition precisely. You should be able to tell why I used the words that I used, and why I didn’t use more or fewer words. But, for this chapter, simply sit back and plow through it.

Deep reinforcement learning is a machine learning approach to artificial intelligence

artificial intelligence (AI) is a branch of computer science involved in the creation of computer programs capable of demonstrating intelligence. Traditionally, any piece of software that displays cognitive abilities such as perception, search, planning, and learning is considered part of AI. Several examples of functionality produced by AI software are

The pages returned by a search engine
The route produced by a GPS app
The voice recognition and the synthetic voice of smart-assistant software
The products recommended by e-commerce sites
The follow-me feature in drones

Subfields of artificial intelligence

All computer programs that display intelligence are considered AI, but not all examples of AI can learn. machine learning (ML) is the area of AI concerned with creating computer programs that can solve problems requiring intelligence by learning from data. There are three main branches of ML: supervised, unsupervised, and reinforcement learning.

Main branches of machine learning

supervised learning (SL) is the task of learning from labeled data. In SL, a human decides which data to collect and how to label it. The goal in SL is to generalize. A classic example of SL is a handwritten-digit-recognition application: a human gathers images with handwritten digits, labels those images, and trains a model to recognize and classify digits in images correctly. The trained model is expected to generalize and correctly classify handwritten digits in new images.

unsupervised learning (UL) is the task of learning from unlabeled data. Even though data no longer needs labeling, the methods used by the computer to gather data still need to be designed by a human. The goal in UL is to compress. A classic example of UL is a customer segmentation application; a human collects customer data and trains a model to group customers into clusters. These clusters compress the information, uncovering underlying relationships in customers.

reinforcement learning (RL) is the task of learning through trial and error. In this type of task, no human labels data, and no human collects or explicitly designs the collection of data. The goal in RL is to act. A classic example of RL is a Pong-playing agent; the agent repeatedly interacts with a Pong emulator and learns by taking actions and observing their effects. The trained agent is expected to act in such a way that it successfully plays Pong.

A powerful recent approach to ML, called deep learning (DL), involves using multi-layered non-linear function approximation, typically neural networks. DL isn’t a separate branch of ML, so it’s not a different task than those described previously. DL is a collection of techniques and methods for using neural networks to solve ML tasks, whether SL, UL, or RL. DRL is simply the use of DL to solve RL tasks.

Deep learning is a powerful toolbox

The bottom line is that DRL is an approach to a problem. The field of AI defines the problem: creating intelligent machines. One of the approaches to solving that problem is DRL. Throughout the book, will you find comparisons between RL and other ML approaches, but only in this chapter will you find definitions and a historical overview of AI in general. It’s important to note that the field of RL includes the field of DRL, so although I make a distinction when necessary, when I refer to RL, remember that DRL is included.

Deep reinforcement learning is concerned with creating computer programs

At its core, DRL is about complex sequential decision-making problems under uncertainty. But, this is a topic of interest in many fields; for instance, control theory (CT) studies ways to control complex known dynamic systems. In CT, the dynamics of the systems we try to control are usually known in advance. Operations research (OR), another instance, also studies decision-making under uncertainty, but problems in this field often have much larger action spaces than those commonly seen in DRL. psychology studies human behavior, which is partly the same “complex sequential decision-making under uncertainty” problem.

The synergy between similar fields

The bottom line is that you have come to a field that’s influenced by a variety of others. Although this is a good thing, it also brings inconsistencies in terminologies, notations, and so on. My take is the computer science approach to this problem, so this book is about building computer programs that solve complex decision-making problems under uncertainty, and as such, you can find code examples throughout the book.

In DRL, these computer programs are called agents. An agent is a decision maker Only and nothing else. That means if you’re training a robot to pick up objects, the robot arm isn’t part of the agent. Only the code that makes decisions is referred to as the agent.

Deep reinforcement learning agents can solve problems that require intelligence

On the other side of the agent is the environment. The environment is everything outside the agent; everything the agent has no total control over. Again, imagine you’re training a robot to pick up objects. The objects to be picked up, the tray where the objects lay, the wind, and everything outside the decision maker are part of the environment. That means the robot arm is also part of the environment because it isn’t part of the agent. And even though the agent can decide to move the arm, the actual arm movement is noisy, and thus the arm is part of the environment.

This strict boundary between the agent and the environment is counterintuitive at first, but the decision maker, the agent, can only have a single role: making decisions. Everything that comes after the decision gets bundled into the environment.

Boundary between agent and environment

Chapter 2 provides an in-depth survey of all the components of DRL. The following is a preview of what you’ll learn in chapter 2.

The environment is represented by a set of variables related to the problem. For instance, in the robotic arm example, the location and velocities of the arm would be part of the variables that make up the environment. This set of variables and all the possible values that they can take are referred to as the state space. A state is an instantiation of the state space, a set of values the variables take.

Interestingly, often, ag...

Grokking Deep Reinforcement Learning
Copyright
dedication
contents
front matter
1 Introduction to deep reinforcement learning
2 Mathematical foundations of reinforcement learning
3 Balancing immediate and long-term goals
4 Balancing the gathering and use of information
5 Evaluating agents’ behaviors
6 Improving agents’ behaviors
7 Achieving goals more effectively and efficiently
8 Introduction to value-based deep reinforcement learning
9 More stable value-based methods
10 Sample-efficient value-based methods
11 Policy-gradient and actor-critic methods
12 Advanced actor-critic methods
13 Toward artificial general intelligence
index