UC Berkeley – Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
Many experimental works have observed that generalization in deep RL appears to be difficult: although RL agents can learn to perform very complex tasks, their generalization across diverse task distributions falls short of what the excellent generalization of supervised deep nets might lead us to expect. In this blog post, we will aim…