O’Reilly Media – The Next Generation of AI

Programs like AlphaZero and GPT-3 are massive accomplishments: they represent years of sustained work solving a difficult problem. But these problems are squarely within the domain of traditional AI. Playing Chess and Go or building ever-better language models have been AI projects for decades. The following projects have a different flavor: In February, PLOS Genetics…

UC Berkeley – Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets

Fig. 1: The BRIDGE dataset contains 7200 demonstrations of kitchen-themed manipulation tasks across 71 tasks in 10 domains. Note that any GIF compression artifacts in this animation are not present in the dataset itself. When we apply robot learning methods to real-world systems, we must usually collect new datasets for every task, every robot, and…

UC Berkeley – Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Many experimental works have observed that generalization in deep RL appears to be difficult: although RL agents can learn to perform very complex tasks, they don’t seem to generalize over diverse task distributions as well as the excellent generalization of supervised deep nets might lead us to expect. In this blog post, we will aim…

UC Berkeley – RECON: Learning to Explore the Real World with a Ground Robot

An example of our method deployed on a Clearpath Jackal ground robot (left) exploring a suburban environment to find a visual target (inset). (Right) Egocentric observations of the robot. Imagine you’re in an unfamiliar neighborhood with no house numbers and I give you a photo that I took a few days ago of my house,…

UC Berkeley – A First-Principles Theory of Neural Network Generalization

Fig 1. Measures of generalization performance for neural networks trained on four different boolean functions (colors) with varying training set size. For both MSE (left) and learnability (right), theoretical predictions (curves) closely match true performance (dots). Deep learning has proven a stunning success for countless problems of interest, but this success belies the fact that,…

UC Berkeley – Making RL Tractable by Learning More Informative Reward Functions: Example-Based Control, Meta-Learning, and Normalized Maximum Likelihood

Diagram of MURAL, our method for learning uncertainty-aware rewards for RL. After the user provides a few examples of desired outcomes, MURAL automatically infers a reward function that takes into account these examples and the agent’s uncertainty for each state. Although reinforcement learning has shown success in domains such as robotics, chip placement and playing…

UC Berkeley – Sequence Modeling Solutions for Reinforcement Learning Problems

Long-horizon predictions of (top) the Trajectory Transformer compared to those of (bottom) a single-step dynamics model. Modern machine learning success stories often have one thing in common: they use methods that scale gracefully with ever-increasing amounts of data. This is particularly clear from recent advances in sequence modeling,…

UC Berkeley – Which Mutual Information Representation Learning Objectives are Sufficient for Control?

Processing raw sensory inputs is crucial for applying deep RL algorithms to real-world problems. For example, autonomous vehicles must make decisions about how to drive safely given information flowing from cameras, radar, and microphones about the conditions of the road, traffic signals, and other cars and pedestrians. However, direct “end-to-end” RL that maps sensor data…