11 Facts About Reinforcement Learning

Reinforcement learning is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward.

FactSnippet No. 487,240

Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

FactSnippet No. 487,241

The main difference between classical dynamic programming methods and reinforcement learning algorithms is that the latter do not assume knowledge of an exact mathematical model of the Markov decision process (MDP), and they target large MDPs where exact methods become infeasible.

FactSnippet No. 487,242
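
The model-free idea above can be illustrated with a minimal sketch of tabular Q-learning, one common reinforcement learning algorithm (the source does not name a specific one). The five-state chain environment, the `step` function, and all hyperparameters are assumptions made for this example. The learner only observes sampled transitions and never reads the environment's transition model.

```python
import random

# Hypothetical 5-state chain: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 gives reward 1 and ends the episode.
N_STATES = 5
ACTIONS = (0, 1)

def step(state, action):
    """Environment dynamics. The learner calls this but never inspects it."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values from sampled transitions only."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # exploratory action
            else:
                best = max(q[state])  # greedy action, ties broken at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target built from the sampled transition.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this toy setting `greedy_policy` should choose "right" in every non-terminal state. A dynamic programming method such as value iteration would instead need the transition and reward model encoded in `step` to be given explicitly.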

The problems of interest in reinforcement learning have been studied in the theory of optimal control, which is concerned mostly with the existence and characterization of optimal solutions, and algorithms for their exact computation, and less with learning or approximation, particularly in the absence of a mathematical model of the environment.

FactSnippet No. 487,243

The purpose of reinforcement learning is for the agent to learn an optimal, or nearly optimal, policy that maximizes the "reward function" or other user-provided reinforcement signal that accumulates from the immediate rewards.

FactSnippet No. 487,244

The goal of a reinforcement learning agent is to learn a policy that maximizes the expected cumulative reward.

FactSnippet No. 487,245
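
As a rough illustration of "expected cumulative reward", the sketch below represents a policy as a table of action probabilities and estimates its expected return by averaging the summed rewards of many sampled episodes. The single-state task, the action names `work` and `idle`, and the three-step horizon are all hypothetical.

```python
import random

# Hypothetical policies: state -> {action: probability of choosing it}.
ALWAYS_WORK = {"s": {"work": 1.0, "idle": 0.0}}
UNIFORM = {"s": {"work": 0.5, "idle": 0.5}}

def run_episode(policy, rng, horizon=3):
    """Sample one episode and return its list of immediate rewards."""
    rewards = []
    state = "s"  # toy task with a single state
    for _ in range(horizon):
        actions = list(policy[state])
        weights = [policy[state][a] for a in actions]
        action = rng.choices(actions, weights=weights)[0]
        rewards.append(1.0 if action == "work" else 0.0)
    return rewards

def estimate_expected_return(policy, episodes=2000, seed=0):
    """Monte Carlo estimate: average cumulative reward over sampled episodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        total += sum(run_episode(policy, rng))
    return total / episodes

always_work_value = estimate_expected_return(ALWAYS_WORK)
uniform_value = estimate_expected_return(UNIFORM)
```

Here `always_work_value` is exactly 3.0, while `uniform_value` should land near 1.5, so a learner maximizing expected cumulative reward would prefer the first policy.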

Reinforcement learning is particularly well-suited to problems that include a long-term versus short-term reward trade-off.

FactSnippet No. 487,246
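
One way to make this trade-off concrete is a discounted return, where a discount factor weights later rewards less than earlier ones. The discount factor `gamma` and the two reward sequences below are assumptions for illustration and do not come from the source.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Hypothetical choices: a small reward now versus a larger reward three steps later.
IMPATIENT = [1.0, 0.0, 0.0, 0.0]
PATIENT = [0.0, 0.0, 0.0, 5.0]

# With gamma = 0.9 the delayed reward is worth more than the immediate one.
far_sighted = (discounted_return(IMPATIENT, 0.9), discounted_return(PATIENT, 0.9))

# With gamma = 0.5 the agent is short-sighted and the immediate reward wins.
short_sighted = (discounted_return(IMPATIENT, 0.5), discounted_return(PATIENT, 0.5))
```

Under these assumptions the preferred behaviour flips with `gamma`: a far-sighted agent accepts three steps without reward to collect the larger payoff, while a short-sighted agent takes the immediate reward.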

Thanks to these two key components, reinforcement learning can be used in large environments.

FactSnippet No. 487,247

Reinforcement learning requires clever exploration mechanisms; randomly selecting actions, without reference to an estimated probability distribution, shows poor performance.

FactSnippet No. 487,248
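
The gap between blind random selection and exploration guided by estimates can be sketched on a small multi-armed bandit. Epsilon-greedy is used here only as a simple, well-known exploration rule; the source does not single it out, and the arm probabilities and step counts are hypothetical.

```python
import random

ARM_MEANS = (0.2, 0.5, 0.8)  # hypothetical Bernoulli reward probabilities

def pull(arm, rng):
    """Return reward 1.0 with the arm's success probability, else 0.0."""
    return 1.0 if rng.random() < ARM_MEANS[arm] else 0.0

def run_uniform_random(steps, rng):
    """Pick arms uniformly at random, ignoring everything observed so far."""
    total = 0.0
    for _ in range(steps):
        total += pull(rng.randrange(len(ARM_MEANS)), rng)
    return total / steps

def run_epsilon_greedy(steps, rng, epsilon=0.1):
    """Mostly pick the arm with the best estimated value, sometimes explore."""
    counts = [0] * len(ARM_MEANS)
    estimates = [0.0] * len(ARM_MEANS)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(ARM_MEANS))  # explore
        else:
            arm = max(range(len(ARM_MEANS)), key=lambda a: estimates[a])  # exploit
        reward = pull(arm, rng)
        counts[arm] += 1
        # Incremental update of the sample-mean estimate for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / steps

random_avg = run_uniform_random(5000, random.Random(0))
greedy_avg = run_epsilon_greedy(5000, random.Random(0))
```

In this setup `random_avg` should stay near the mean of the arm probabilities (about 0.5), while `greedy_avg` should be noticeably higher because most pulls go to the arm with the best estimate.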

Adversarial deep reinforcement learning is an active area of research in reinforcement learning focusing on vulnerabilities of learned policies.

FactSnippet No. 487,249

Partially supervised approaches can alleviate the need for extensive training data in supervised reinforcement learning while reducing the need for costly exhaustive random exploration in pure reinforcement learning (RL).

FactSnippet No. 487,250