Mastering Reinforcement Learning with Python
Build next-generation, self-learning models using reinforcement learning techniques and best practices
Enes Bilgin
- 532 pages
- English
About This Book
Get hands-on experience in creating state-of-the-art reinforcement learning agents using TensorFlow and RLlib to solve complex real-world business and industry problems with the help of expert tips and best practices
Key Features
- Understand how large-scale state-of-the-art RL algorithms and approaches work
- Apply RL to solve complex problems in marketing, robotics, supply chain, finance, cybersecurity, and more
- Explore tips and best practices from experts that will enable you to overcome real-world RL challenges
Book Description
Reinforcement learning (RL) is a field of artificial intelligence (AI) used for creating self-learning autonomous agents. Building on a strong theoretical foundation, this book takes a practical approach and uses examples inspired by real-world industry problems to teach you about state-of-the-art RL.
Starting with bandit problems, Markov decision processes, and dynamic programming, the book provides an in-depth review of the classical RL techniques, such as Monte Carlo methods and temporal-difference learning. After that, you will learn about deep Q-learning, policy gradient algorithms, actor-critic methods, model-based methods, and multi-agent reinforcement learning. Then, you'll be introduced to some of the key approaches behind the most successful RL implementations, such as domain randomization and curiosity-driven learning.
As you advance, you'll explore many novel algorithms with advanced implementations using modern Python libraries such as TensorFlow and Ray's RLlib package. You'll also find out how to implement RL in areas such as robotics, supply chain management, marketing, finance, smart cities, and cybersecurity while assessing the trade-offs between different approaches and avoiding common pitfalls.
By the end of this book, you'll have mastered how to train and deploy your own agents for solving RL problems.
What you will learn
- Model and solve complex sequential decision-making problems using RL
- Develop a solid understanding of how state-of-the-art RL methods work
- Use Python and TensorFlow to code RL algorithms from scratch
- Parallelize and scale up your RL implementations using Ray's RLlib package
- Get in-depth knowledge of a wide variety of RL topics
- Understand the trade-offs between different RL approaches
- Discover and address the challenges of implementing RL in the real world
Who this book is for
This book is for expert machine learning practitioners and researchers looking to focus on hands-on reinforcement learning with Python by implementing advanced deep reinforcement learning concepts in real-world projects. Reinforcement learning experts who want to advance their knowledge to tackle large-scale and complex sequential decision-making problems will also find this book useful. Working knowledge of Python programming and deep learning along with prior experience in reinforcement learning is required.
Section 1: Reinforcement Learning Foundations
- Chapter 1, Introduction to Reinforcement Learning
- Chapter 2, Multi-Armed Bandits
- Chapter 3, Contextual Bandits
- Chapter 4, Makings of a Markov Decision Process
- Chapter 5, Solving the Reinforcement Learning Problem
Chapter 1: Introduction to Reinforcement Learning
- Why reinforcement learning?
- The three paradigms of ML
- RL application areas and success stories
- Elements of a RL problem
- Setting up your RL environment
Why reinforcement learning?
The three paradigms of ML
Supervised learning
- An image recognition model that classifies the objects seen by the camera of a self-driving car as a pedestrian, stop sign, or truck.
- A forecasting model that predicts the customer demand for a product for a particular holiday season using past sales data.
- During training, models learn from ground truth labels or outputs provided by a supervisor (which could be a human expert or a process).
- During inference, models make predictions about what the output might be given the input.
- Models use function approximators to represent the dynamics of the processes that generate the outputs.
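As a rough illustration of the points above, the following sketch fits a very simple function approximator (a straight line) to labeled data and then uses it for inference. It is not taken from the book: the function names `fit_linear` and `predict` and the sales figures are hypothetical, and a real demand forecast would use far richer features and models.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Training: least-squares fit of y = slope * x + intercept to labeled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Inference: estimate the output for an input the model has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: regular-season sales (input) and the holiday demand
# that was actually observed (ground truth labels supplied by the "supervisor").
regular_sales = [100.0, 120.0, 140.0, 160.0]
holiday_demand = [210.0, 250.0, 290.0, 330.0]

model = fit_linear(regular_sales, holiday_demand)
forecast = predict(model, 180.0)
print(f"Forecast holiday demand: {forecast:.1f}")
```

The labels are what make this supervised: without `holiday_demand`, there would be nothing to fit the line against.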
Unsupervised learning
- Identifying homogeneous segments in an image provided by the camera of a self-driving car. The model is likely to separate the sky, road, and buildings based on the textures in the image.
- Clustering weekly sales data into three groups based on sales volume. The output is likely to be weeks with low, medium, and high sales volume.
- UL models don't know what the ground truth is, and there is no label to map the input to. They just identify the different patterns in the data. Even after doing so, for example, the model would not be aware that it separated sky from road, or a holiday week from a regular week.
- During inference, the model would cluster the input into one of the groups it had identified, again, without knowing what that group represents.
- Function approximators, such as neural networks, are used in some UL algorithms, bu...
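The weekly sales example above can be sketched with a small one-dimensional k-means routine. This is only an illustrative sketch, not the book's code: the helper names `kmeans_1d` and `assign`, the deterministic initialization, and the sales numbers are all assumptions made for the example.

```python
def kmeans_1d(values, k=3, iterations=20):
    """Cluster scalar values into k groups (k >= 2) and return the group centers."""
    ordered = sorted(values)
    # Assumed deterministic initialization: spread the starting centers
    # evenly across the sorted data instead of picking them at random.
    centers = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Move each center to the mean of its members; keep it if the group is empty.
        centers = [
            sum(members) / len(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers


def assign(centers, value):
    """Inference: return the index of the group whose center is closest."""
    return min(range(len(centers)), key=lambda c: abs(value - centers[c]))


# Hypothetical weekly sales volumes; note that no labels are provided.
weekly_sales = [12, 15, 11, 48, 52, 50, 95, 102, 99]

centers = kmeans_1d(weekly_sales, k=3)
groups = [assign(centers, v) for v in weekly_sales]
print(groups)
```

The routine only returns group indices such as 0, 1, and 2. Interpreting them as low, medium, and high sales weeks is left to a human, which is the point made in the bullets above.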