Reinforcement Learning for Cyber-Physical Systems

with Cybersecurity Case Studies

Chong Li, Meikang Qiu

eBook - ePub

  1. 238 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About this book

Reinforcement Learning for Cyber-Physical Systems: with Cybersecurity Case Studies was inspired by recent developments in the fields of reinforcement learning (RL) and cyber-physical systems (CPSs). Rooted in behavioral psychology, RL is one of the primary strands of machine learning. Unlike other machine learning approaches, such as supervised and unsupervised learning, the key feature of RL is its unique learning paradigm: trial and error. Combined with deep neural networks, deep RL has become so powerful that many complicated systems can be automatically managed by AI agents at a superhuman level. Meanwhile, CPSs are envisioned to revolutionize our society in the near future; examples include emerging smart buildings, intelligent transportation, and electric grids.

However, conventional hand-programmed controllers in CPSs can neither handle the increasing complexity of these systems nor automatically adapt to situations they have never encountered before. How to apply existing deep RL algorithms, or develop new RL algorithms, to enable real-time adaptive CPSs remains an open problem. This book aims to bridge the two domains by systematically introducing RL foundations and algorithms, each supported by one or a few state-of-the-art CPS examples that help readers understand the intuition and usefulness of RL techniques.

Features

  • Introduces reinforcement learning, including advanced topics in RL
  • Applies reinforcement learning to cyber-physical systems and cybersecurity
  • Contains state-of-the-art examples and exercises in each chapter
  • Provides two cybersecurity case studies

Reinforcement Learning for Cyber-Physical Systems with Cybersecurity Case Studies is an ideal text for graduate students and junior/senior undergraduates in science, engineering, computer science, and applied mathematics. It will also prove useful to researchers and engineers interested in cybersecurity, RL, and CPSs. The only background required to appreciate the book is a basic knowledge of calculus and probability theory.


II

Reinforcement Learning for Cyber-Physical Systems

CHAPTER 3

Reinforcement Learning Problems

CONTENTS

3.1 Multi-Armed Bandit Problem
3.1.1 ϵ-Greedy
3.1.2 Softmax Algorithm
3.1.3 UCB Algorithm
3.2 Contextual Bandit Problem
3.2.1 LinUCB Algorithm
3.3 Reinforcement Learning Problem
3.3.1 Elements of RL
3.3.2 Introduction to the Markov Decision Process
3.3.3 Value Functions
3.4 Remarks
3.5 Exercises
In this chapter we formally introduce the reinforcement learning (RL) problem and its framework. We begin with two important simplified versions of the RL problem: Bandits and Contextual Bandits. The transition from Bandits to Contextual Bandits, and then from Contextual Bandits to Reinforcement Learning, should seem natural and straightforward.

3.1 Multi-Armed Bandit Problem

The multi-armed bandit (MAB) problem, also referred to as the k-armed bandit problem, is one where an agent has to pick among a discrete set of actions (arms) to maximize the expected return. The problem takes its name from a toy example in which an agent enters a casino and seeks to maximize earnings at a row of slot machines. Upon entering the casino, however, the agent does not know which of the machines has the highest expected payout. The agent must therefore devise a strategy that simultaneously learns the payout distribution of each slot machine and exploits his existing knowledge of which machine is most lucrative. For now, we assume the payout distributions are stationary, that is, they do not change over time. The agent is constrained to play according to the following repetitive process: choose a slot machine to play, pull the selected machine's lever, and observe the machine's payout. As we build up to the full reinforcement learning problem, we want to point out some common elements and themes that emerge from the MAB problem and that will also be critical to the RL problem.
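To make this setup concrete, the following is a minimal Python sketch of a stationary k-armed bandit environment. The class name SlotMachines and the Gaussian payout model are illustrative assumptions for this sketch, not details taken from the book.

    import random

    class SlotMachines:
        """A stationary k-armed bandit. Each arm pays out from a fixed
        Gaussian distribution whose mean is unknown to the agent."""

        def __init__(self, k, seed=None):
            self.rng = random.Random(seed)
            # True expected payouts: hidden from the agent, these constitute
            # the "environment" in the sense used above.
            self.means = [self.rng.gauss(0.0, 1.0) for _ in range(k)]

        def pull(self, arm):
            # One play: pull the selected lever, observe a stochastic payout.
            return self.rng.gauss(self.means[arm], 1.0)

Keeping the payout means fixed mirrors the stationarity assumption made above; a non-stationary variant would let self.means drift over time.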
First, we identify two entities in the problem: the agent and the row of slot machines. Our goal is to develop strategies to maximize the agent’s return. Also, the slot machines and their true payout distributions are referred to as the environment, which is unknown to the agent.
Next, we introduce the concepts of reward and return. In the MAB problem, the reward is the money earned or lost after each arm pull, whereas the return is the cumulative profit or loss across all arm pulls since the agent entered the casino. The agent seeks an algorithm that uses the history of arm choices and the associated rewards to maximize the return. Formally, the return G is simply the sum of the rewards r_t across time,
G = \sum_{t=1}^{T} r_t.    (3.1)
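As a quick illustration of Eq. (3.1), and continuing the SlotMachines sketch above, the following loop accumulates the return G over T pulls; the uniformly random arm choice is a deliberately naive placeholder strategy.

    env = SlotMachines(k=10, seed=0)
    T, G = 1000, 0.0
    for t in range(T):
        arm = random.randrange(10)   # naive strategy: choose an arm uniformly
        r = env.pull(arm)            # observe the reward r_t for this pull
        G += r                       # G = r_1 + r_2 + ... + r_T, per Eq. (3.1)
    print("return after", T, "pulls:", G)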
Upon entering the casino, the agent knows nothing about his environment (the payout distribution of each machine). To maximize the return, the agent must somehow reliably identify the machine with the highest expected reward and then play that machine repeatedly. With no prior machine knowledge, the agent must first explore his environment in order to learn how to most effectively exploit it. Since slot machine rewards are assumed to be stochastic or non-deterministic, the agent must play each machine multiple times to learn its reward distribution. As these distributions become mor...
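Section 3.1.1 treats this exploration-exploitation trade-off formally; as a preview, and continuing the sketch above, here is a minimal ϵ-greedy strategy that usually exploits the empirically best machine but explores a random machine with probability epsilon. The incremental sample-mean update is a standard device; the function and variable names are ours.

    def epsilon_greedy(env, k, T, epsilon=0.1):
        counts = [0] * k        # number of pulls of each arm so far
        estimates = [0.0] * k   # sample-mean payout estimate per arm
        G = 0.0
        for t in range(T):
            if random.random() < epsilon:
                arm = random.randrange(k)                        # explore
            else:
                arm = max(range(k), key=lambda a: estimates[a])  # exploit
            r = env.pull(arm)
            counts[arm] += 1
            # Incremental update of the sample mean for the chosen arm.
            estimates[arm] += (r - estimates[arm]) / counts[arm]
            G += r
        return G, estimates

Playing each machine repeatedly drives each estimate toward its true mean, which is exactly the "learn the reward distribution" step described above.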

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Table of Contents
  6. Dedication Page
  7. Preface
  8. Author Bios
  9. I Introduction
  10. II Reinforcement Learning for Cyber-Physical Systems
  11. III Case Studies
  12. Bibliography
  13. Index