Deep Reinforcement Learning in Action

Brandon Brown, Alexander Zai

  1. 384 pages
  2. English

About this book

Summary
Humans learn best from feedback: we are encouraged to take actions that lead to positive results while deterred by decisions with negative consequences. This reinforcement process can be applied to computer programs allowing them to solve more complex problems that classical programming cannot. Deep Reinforcement Learning in Action teaches you the fundamental concepts and terminology of deep reinforcement learning, along with the practical skills and techniques you'll need to implement it into your own projects. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Deep reinforcement learning AI systems rapidly adapt to new environments, a vast improvement over standard neural networks. A DRL agent learns like people do, taking in raw data such as sensor input and refining its responses and predictions through trial and error.

About the book
Deep Reinforcement Learning in Action teaches you how to program AI agents that adapt and improve based on direct feedback from their environment. In this example-rich tutorial, you'll master foundational and advanced DRL techniques by taking on interesting challenges like navigating a maze and playing video games. Along the way, you'll work with core algorithms, including deep Q-networks and policy gradients, along with industry-standard tools like PyTorch and OpenAI Gym.

What's inside
  • Building and training DRL networks
  • The most popular DRL algorithms for learning and problem solving
  • Evolutionary algorithms for curiosity and multi-agent learning
  • All examples available as Jupyter Notebooks

About the reader
For readers with intermediate skills in Python and deep learning.

About the author
Alexander Zai is a machine learning engineer at Amazon AI. Brandon Brown is a machine learning and data analysis blogger.

Table of Contents
PART 1 - FOUNDATIONS
  1. What is reinforcement learning?
  2. Modeling reinforcement learning problems: Markov decision processes
  3. Predicting the best states and actions: Deep Q-networks
  4. Learning to pick the best policy: Policy gradient methods
  5. Tackling more complex problems with actor-critic methods
PART 2 - ABOVE AND BEYOND
  6. Alternative optimization methods: Evolutionary algorithms
  7. Distributional DQN: Getting the full story
  8. Curiosity-driven exploration
  9. Multi-agent reinforcement learning
  10. Interpretable reinforcement learning: Attention and relational models
  11. In conclusion: A review and roadmap


Information

Publisher: Manning
Year: 2020
ISBN: 9781638350507

Part 1. Foundations

Part 1 consists of five chapters that teach the most fundamental aspects of deep reinforcement learning. After reading part 1, you’ll be able to understand the chapters in part 2 in any order.
Chapter 1 begins with a high-level introduction to deep reinforcement learning, explaining its main concepts and its utility. In chapter 2 we’ll start building practical projects that illustrate the basic ideas of reinforcement learning. In chapter 3 we’ll implement a deep Q-network—the same kind of algorithm that DeepMind famously used to play Atari games at superhuman levels.
Chapters 4 and 5 round out the most common reinforcement learning algorithms, namely policy gradient methods and actor-critic methods. We’ll look at the pros and cons of these approaches compared to deep Q-networks.

Chapter 1. What is reinforcement learning?

This chapter covers
  • A brief review of machine learning
  • Introducing reinforcement learning as a subfield
  • The basic framework of reinforcement learning
Computer languages of the future will be more concerned with goals and less with procedures specified by the programmer.
Marvin Minsky, 1970 ACM Turing Lecture
If you’re reading this book, you are probably familiar with how deep neural networks are used for things like image classification or prediction (and if not, just keep reading; we also have a crash course in deep learning in the appendix). Deep reinforcement learning (DRL) is a subfield of machine learning that utilizes deep learning models (i.e., neural networks) in reinforcement learning (RL) tasks (to be defined in section 1.2). In image classification we have a bunch of images that correspond to a set of discrete categories, such as images of different kinds of animals, and we want a machine learning model to interpret an image and classify the kind of animal in the image, as in figure 1.1.
Figure 1.1. An image classifier is a function or learning algorithm that takes in an image and returns a class label, classifying the image into one of a finite number of possible categories or classes.

1.1. The “deep” in deep reinforcement learning

Deep learning models are just one of many kinds of machine learning models we can use to classify images. In general, we just need some sort of function that takes in an image and returns a class label (in this case, the label identifying which kind of animal is depicted in the image). Usually this function has a fixed set of adjustable parameters; we call these kinds of models parametric models. We start with a parametric model whose parameters are initialized to random values, so at first it produces random class labels for the input images. Then we use a training procedure to adjust the parameters so the function iteratively gets better at correctly classifying the images. At some point, the parameters will be at an optimal set of values, meaning that this particular model cannot get any better at the classification task; a more sophisticated model, with more parameters or a better internal architecture, might still perform better. Parametric models can also be used for regression, where we try to fit a model to a set of data so we can make predictions for unseen data (figure 1.2).
Figure 1.2. Perhaps the simplest machine learning model is a simple linear function of the form f(x) = mx + b, with parameters m (the slope) and b (the intercept). Since it has adjustable parameters, we call it a parametric function or model. If we have some 2-dimensional data, we can start with a randomly initialized set of parameters, such as [m = 3.4, b = 0.3], and then use a training algorithm to optimize the parameters to fit the training data, in which case the optimal set of parameters is close to [m = 2, b = 1].
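As a rough illustration of the training idea described for figure 1.2, the following sketch fits f(x) = mx + b with plain gradient descent on mean squared error. The training algorithm, the learning rate, the step count, the function name fit_line, and the noise-free data generated from y = 2x + 1 are all assumptions made for this example; the text does not say which training algorithm or data set the figure uses.

```python
def fit_line(xs, ys, m=3.4, b=0.3, lr=0.05, steps=2000):
    """Fit f(x) = m*x + b by gradient descent on mean squared error.

    The defaults m=3.4 and b=0.3 mirror the starting parameters named in
    the figure caption; lr and steps are illustrative choices.
    """
    n = len(xs)
    for _ in range(steps):
        # Prediction error of the current line on every training point.
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to m and b.
        grad_m = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move each parameter a small step against its gradient.
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b


# Hypothetical training data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

m, b = fit_line(xs, ys)
print(f"m = {m:.3f}, b = {b:.3f}")
```

With this data the parameters should drift from the starting point toward the values that generated it, which is the sense in which the caption calls [m = 2, b = 1] optimal.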
Deep neural networks are popular because they are in many cases the most accurate parametric machine learning models for a given task, like image classification. This is largely due to the way they represent data. Deep neural networks have many layers (hence the “deep”), which induces the model to learn layered representations of input data. This layered representation is a form of compositionality, meaning that a complex piece of data is represented as the combination of more elementary components, and those components can be further broken down into even simpler components, and so on, until you get to atomic units.
Human language is compositional (figure 1.3). For example, a book is composed of chapters, chapters are composed of paragraphs, paragraphs are composed of sentences, and so on, until you get to individual words, which are the smallest units of meaning. Yet each individual level conveys meaning—an entire book is meant to convey meaning, and its individual paragraphs are meant to convey smaller points. Deep neural networks can likewise learn a compositional representation of data—for example, they can represent an image as the composition of primitive contours and textures, which are composed into elementary shapes, and so on, until you get the complete, complex image. This ability to handle complexity with compositional representations is largely what makes deep learning so powerful.
Figure 1.3. A sentence like “John hit the ball” can be decomposed into simpler and simpler parts until we get the individual words. In this case, we can decompose the sentence (denoted S) into a subject noun (N) and a verb phrase (VP). The VP can be further decomposed into a verb, “hit,” and a noun phrase (NP). The NP can then be decomposed into the individual words “the” and “ball.”
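One way to make the decomposition in figure 1.3 concrete is to write the sentence as a nested data structure. The sketch below is only an illustration: the tuple encoding, the helper names leaves and depth, and the label "V" for the verb are assumptions for this example and do not come from the book.

```python
# Parse tree for "John hit the ball" as nested tuples: (label, child, ...).
# Leaves are plain strings (the words themselves).
tree = (
    "S",
    ("N", "John"),
    ("VP",
     ("V", "hit"),             # "V" is an assumed label for the verb
     ("NP", "the", "ball")),
)


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:     # node[0] is the label
        words.extend(leaves(child))
    return words


def depth(node):
    """Number of labeled levels above the deepest word under this node."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1:])


sentence = " ".join(leaves(tree))
print(sentence)
print(depth(tree))
```

Reading the words back from the leaves recovers the sentence, while each intermediate tuple (VP, NP) is a meaningful unit in its own right, which is the point the text makes about compositional representations.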

1.2. Reinforcement learning

It is important to distinguish between problems and their solutions, or in other words, between the tasks we wish to solve and the algorithms we design to solve them. Deep learning algorithms can be applied to many problem types and tasks. Image classification and prediction tasks are common applications of deep learning because automated image processing before deep learning was very limited, given the complexity of images. But there are many other kinds of tasks we might wish to automate, such as driving a car or balancing a portfolio of stocks and other assets. Driving a car includes some amount of image processing, but more importantly the algorithm needs to learn how to act, not merely to classify or predict. These kinds of problems, where decisions must be made or some behavior must be enacted, are collectively called control tasks.
Reinforcement learning is a generic framework for representing and solving control tasks, but within this framework we are free to choose which algorithms we want to apply to a particular control task (figure 1.4). Deep learning algorithms are a natural choice as they are able to process complex data efficiently, and this is why we’ll focus on deep reinforcement learning, but much of what you’ll learn in this book is the general reinforcement learning framework for control tasks (see figure 1.5). Then we’ll look at how you can design an appropriate deep learning model to fit the framework and solve a task. This means you will learn a lot about reinforcement learning, and you’ll probably learn some things about deep learning that you didn’t know as well.
Figure 1.4. As opposed to an image classifier, a reinforcement learning algorithm dynamically interacts with data. It continually consumes data and decides what actions to take—actions that will change the subsequent data presented to it. A video game screen might be input data for an RL algorithm, which then decides which action to take using the game controller, and this causes the game to update (e.g. the player moves or fires a weapon).
Figure 1.5. Deep learning is a subfield of machine learning. Deep learning algorithms can be used to power RL approaches to solving control tasks.
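The interaction loop described for figure 1.4 can be sketched in a few lines. Everything here is hypothetical: ToyGame, its screen and update methods, and the random choose_action rule are made up for illustration and are not the OpenAI Gym API or the book's code. The random rule simply stands in for whatever model would make the decision.

```python
import random


class ToyGame:
    """Hypothetical one-dimensional game: a player on a track of cells."""

    def __init__(self, width=5):
        self.width = width
        self.position = 0

    def screen(self):
        # The "game screen" the algorithm observes: the track with the
        # player's cell marked as P.
        return "".join("P" if i == self.position else "."
                       for i in range(self.width))

    def update(self, action):
        # Applying an action changes the game, and therefore changes the
        # next screen the algorithm will see.
        if action == "right":
            self.position = min(self.position + 1, self.width - 1)
        elif action == "left":
            self.position = max(self.position - 1, 0)
        return self.screen()


def choose_action(screen, rng):
    """Placeholder decision rule: picks at random and ignores the screen."""
    return rng.choice(["left", "right"])


rng = random.Random(0)
game = ToyGame()
screen = game.screen()
history = [screen]
for _ in range(10):
    action = choose_action(screen, rng)   # decide based on current data
    screen = game.update(action)          # the action changes later data
    history.append(screen)

print("\n".join(history))
```

Unlike the image classifier, the data here is not a fixed set: each screen in history exists only because of the actions taken before it.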
One added complexity of moving from image processing to the domain of control tasks is the additional element of time. With image processing, we usually train a deep learning algorithm on a fixed data set of images. After a sufficient amount of training, we typically get a high-performance algorithm that we can deploy to some new, unseen images. We can think of the data set as a “space” of data, where similar images are closer together in this abstract space ...
