Deep Reinforcement Learning in Action

Brandon Brown, Alexander Zai

  1. 384 pages
  2. English
About This Book

Summary

Humans learn best from feedback: we are encouraged to take actions that lead to positive results while deterred by decisions with negative consequences. This reinforcement process can be applied to computer programs allowing them to solve more complex problems that classical programming cannot. Deep Reinforcement Learning in Action teaches you the fundamental concepts and terminology of deep reinforcement learning, along with the practical skills and techniques you'll need to implement it into your own projects. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology

Deep reinforcement learning AI systems rapidly adapt to new environments, a vast improvement over standard neural networks. A DRL agent learns like people do, taking in raw data such as sensor input and refining its responses and predictions through trial and error.

About the book

Deep Reinforcement Learning in Action teaches you how to program AI agents that adapt and improve based on direct feedback from their environment. In this example-rich tutorial, you'll master foundational and advanced DRL techniques by taking on interesting challenges like navigating a maze and playing video games. Along the way, you'll work with core algorithms, including deep Q-networks and policy gradients, along with industry-standard tools like PyTorch and OpenAI Gym.

What's inside
  • Building and training DRL networks
  • The most popular DRL algorithms for learning and problem solving
  • Evolutionary algorithms for curiosity and multi-agent learning
  • All examples available as Jupyter Notebooks

About the reader

For readers with intermediate skills in Python and deep learning.

About the author

Alexander Zai is a machine learning engineer at Amazon AI. Brandon Brown is a machine learning and data analysis blogger.

Table of Contents

PART 1 - FOUNDATIONS
  1. What is reinforcement learning?
  2. Modeling reinforcement learning problems: Markov decision processes
  3. Predicting the best states and actions: Deep Q-networks
  4. Learning to pick the best policy: Policy gradient methods
  5. Tackling more complex problems with actor-critic methods

PART 2 - ABOVE AND BEYOND
  6. Alternative optimization methods: Evolutionary algorithms
  7. Distributional DQN: Getting the full story
  8. Curiosity-driven exploration
  9. Multi-agent reinforcement learning
  10. Interpretable reinforcement learning: Attention and relational models
  11. In conclusion: A review and roadmap


Part 1. Foundations

Part 1 consists of five chapters that teach the most fundamental aspects of deep reinforcement learning. After reading part 1, you’ll be able to understand the chapters in part 2 in any order.
Chapter 1 begins with a high-level introduction to deep reinforcement learning, explaining its main concepts and its utility. In chapter 2 we’ll start building practical projects that illustrate the basic ideas of reinforcement learning. In chapter 3 we’ll implement a deep Q-network—the same kind of algorithm that DeepMind famously used to play Atari games at superhuman levels.
Chapters 4 and 5 round out the most common reinforcement learning algorithms, namely policy gradient methods and actor-critic methods. We’ll look at the pros and cons of these approaches compared to deep Q-networks.

Chapter 1. What is reinforcement learning?

This chapter covers
  • A brief review of machine learning
  • Introducing reinforcement learning as a subfield
  • The basic framework of reinforcement learning
Computer languages of the future will be more concerned with goals and less with procedures specified by the programmer.
Marvin Minsky, 1970 ACM Turing Lecture
If you’re reading this book, you are probably familiar with how deep neural networks are used for things like image classification or prediction (and if not, just keep reading; we also have a crash course in deep learning in the appendix). Deep reinforcement learning (DRL) is a subfield of machine learning that utilizes deep learning models (i.e., neural networks) in reinforcement learning (RL) tasks (to be defined in section 1.2). In image classification we have a bunch of images that correspond to a set of discrete categories, such as images of different kinds of animals, and we want a machine learning model to interpret an image and classify the kind of animal in the image, as in figure 1.1.
Figure 1.1. An image classifier is a function or learning algorithm that takes in an image and returns a class label, classifying the image into one of a finite number of possible categories or classes.

1.1. The “deep” in deep reinforcement learning

Deep learning models are just one of many kinds of machine learning models we can use to classify images. In general, we just need some sort of function that takes in an image and returns a class label (in this case, the label identifying which kind of animal is depicted in the image). Usually this function has a fixed set of adjustable parameters; we call these kinds of models parametric models. We start with a parametric model whose parameters are initialized to random values, so at first it produces random class labels for the input images. Then we use a training procedure to adjust the parameters so the function iteratively gets better and better at correctly classifying the images. At some point, the parameters will be at an optimal set of values, meaning that the model cannot get any better at the classification task. Parametric models can also be used for regression, where we try to fit a model to a set of data so we can make predictions for unseen data (figure 1.2). A more sophisticated approach might perform even better if it had more parameters or a better internal architecture.
Figure 1.2. Perhaps the simplest machine learning model is a simple linear function of the form f(x) = mx + b, with parameters m (the slope) and b (the intercept). Since it has adjustable parameters, we call it a parametric function or model. If we have some 2-dimensional data, we can start with a randomly initialized set of parameters, such as [m = 3.4, b = 0.3], and then use a training algorithm to optimize the parameters to fit the training data, in which case the optimal set of parameters is close to [m = 2, b = 1].
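To make the idea of training a parametric model concrete, here is a minimal sketch in plain Python. It assumes gradient descent on mean squared error as the training algorithm (the caption does not name one), and the data points, learning rate, and step count are hypothetical values chosen for illustration. The starting parameters are the ones from the caption.

```python
def fit_line(xs, ys, m=3.4, b=0.3, lr=0.01, steps=5000):
    """Fit f(x) = m*x + b by gradient descent on mean squared error.

    The starting values of m and b come from figure 1.2; lr and steps
    are illustrative assumptions, not values from the book.
    """
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to m and b
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step against its gradient
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b


# Hypothetical noise-free training data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

m, b = fit_line(xs, ys)
print(f"m = {m:.3f}, b = {b:.3f}")
```

With this data the fitted parameters end up very close to m = 2 and b = 1, the optimum mentioned in the caption. Real data would be noisy, so the fit would only approximate the underlying line.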
Deep neural networks are popular because they are in many cases the most accurate parametric machine learning models for a given task, like image classification. This is largely due to the way they represent data. Deep neural networks have many layers (hence the “deep”), which induces the model to learn layered representations of input data. This layered representation is a form of compositionality, meaning that a complex piece of data is represented as the combination of more elementary components, and those components can be further broken down into even simpler components, and so on, until you get to atomic units.
Human language is compositional (figure 1.3). For example, a book is composed of chapters, chapters are composed of paragraphs, paragraphs are composed of sentences, and so on, until you get to individual words, which are the smallest units of meaning. Yet each individual level conveys meaning—an entire book is meant to convey meaning, and its individual paragraphs are meant to convey smaller points. Deep neural networks can likewise learn a compositional representation of data—for example, they can represent an image as the composition of primitive contours and textures, which are composed into elementary shapes, and so on, until you get the complete, complex image. This ability to handle complexity with compositional representations is largely what makes deep learning so powerful.
Figure 1.3. A sentence like “John hit the ball” can be decomposed into simpler and simpler parts until we get the individual words. In this case, we can decompose the sentence (denoted S) into a subject noun (N) and a verb phrase (VP). The VP can be further decomposed into a verb, “hit,” and a noun phrase (NP). The NP can then be decomposed into the individual words “the” and “ball.”
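The decomposition in figure 1.3 can be written down as a nested data structure. The sketch below is one possible encoding, not something taken from the book: each node is a hypothetical `(label, children)` tuple and each word is a plain string. The helper names `leaves` and `depth` are also made up for this illustration.

```python
# Each node is (label, children); a leaf is a plain string (a word).
tree = ("S", [
    ("N", ["John"]),
    ("VP", [
        ("V", ["hit"]),
        ("NP", ["the", "ball"]),
    ]),
])


def leaves(node):
    """Return the words under a node in left-to-right order."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def depth(node):
    """Count the labeled levels between a node and its deepest word."""
    if isinstance(node, str):
        return 0
    _label, children = node
    return 1 + max(depth(child) for child in children)


sentence = " ".join(leaves(tree))
print(sentence)      # John hit the ball
print(depth(tree))   # 3
```

Reading the leaves back in order recovers the original sentence, which is the sense in which the whole is a composition of its parts.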

1.2. Reinforcement learning

It is important to distinguish between problems and their solutions, or in other words, between the tasks we wish to solve and the algorithms we design to solve them. Deep learning algorithms can be applied to many problem types and tasks. Image classification and prediction tasks are common applications of deep learning because automated image processing before deep learning was very limited, given the complexity of images. But there are many other kinds of tasks we might wish to automate, such as driving a car or balancing a portfolio of stocks and other assets. Driving a car includes some amount of image processing, but more importantly the algorithm needs to learn how to act, not merely to classify or predict. These kinds of problems, where decisions must be made or some behavior must be enacted, are collectively called control tasks.
Reinforcement learning is a generic framework for representing and solving control tasks, but within this framework we are free to choose which algorithms we want to apply to a particular control task (figure 1.4). Deep learning algorithms are a natural choice as they are able to process complex data efficiently, and this is why we’ll focus on deep reinforcement learning, but much of what you’ll learn in this book is the general reinforcement learning framework for control tasks (see figure 1.5). Then we’ll look at how you can design an appropriate deep learning model to fit the framework and solve a task. This means you will learn a lot about reinforcement learning, and you’ll probably learn some things about deep learning that you didn’t know as well.
Figure 1.4. As opposed to an image classifier, a reinforcement learning algorithm dynamically interacts with data. It continually consumes data and decides what actions to take, and those actions change the subsequent data presented to it. A video game screen might be input data for an RL algorithm, which then decides which action to take using the game controller, and this causes the game to update (e.g., the player moves or fires a weapon).
Figure 1.5. Deep learning is a subfield of machine learning. Deep learning algorithms can be used to power RL approaches to solving control tasks.
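The interaction loop described in figure 1.4 can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `ToyGame` is a made-up one-dimensional "game", and `policy` is a fixed hand-written rule rather than a learned model. The feedback signal that would drive learning is left out, since the point is only to show that each action changes the data the algorithm sees next.

```python
class ToyGame:
    """Hypothetical stand-in for a video game: the player sits at an integer position."""

    def __init__(self):
        self.position = 0

    def observe(self):
        # The "screen": the data currently presented to the algorithm
        return self.position

    def step(self, action):
        # The action updates the game state and therefore the next observation
        if action == "left":
            self.position -= 1
        elif action == "right":
            self.position += 1
        return self.observe()


def policy(observation):
    """A fixed rule, not a learned one: move right until position 3, otherwise move left."""
    return "right" if observation < 3 else "left"


game = ToyGame()
observation = game.observe()
history = [observation]

for _ in range(5):
    action = policy(observation)      # decide what to do from the current data
    observation = game.step(action)   # acting changes the data seen next
    history.append(observation)

print(history)  # [0, 1, 2, 3, 2, 3]
```

In a reinforcement learning setting, the hand-written `policy` would be replaced by a model whose parameters are adjusted from experience, for example a deep neural network.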
One added complexity of moving from image processing to the domain of control tasks is the additional element of time. With image processing, we usually train a deep learning algorithm on a fixed data set of images. After a sufficient amount of training, we typically get a high-performance algorithm that we can deploy to some new, unseen images. We can think of the data set as a “space” of data, where similar images are closer together in this abstract space ...
