UC Berkeley – Should I Use Offline RL or Imitation Learning?

Figure 1: Summary of our recommendations for when a practitioner should use BC and various imitation-learning-style methods, and when they should use offline RL approaches.

Offline reinforcement learning allows learning policies from previously collected data, which has profound implications for applying RL in domains where running trial-and-error learning is impractical or dangerous, such as…

Latest from MIT – An easier way to teach robots new skills

With e-commerce orders pouring in, a warehouse robot picks mugs off a shelf and places them into boxes for shipping. Everything is humming along, until the warehouse processes a change and the robot must now grasp taller, narrower mugs that are stored upside down. Reprogramming that robot involves hand-labeling thousands of images that show it…

Latest from Google AI – Pix2Seq: A New Language Interface for Object Detection

Posted by Ting Chen and David Fleet, Research Scientists, Google Research, Brain Team

Object detection is a long-standing computer vision task that attempts to recognize and localize all objects of interest in an image. The complexity arises when trying to identify or localize all object instances while also avoiding duplication. Existing approaches, like Faster R-CNN…

Latest from MIT Tech Review – The Download: Language-preserving AI, and hackers showed it’s frighteningly easy to breach critical infrastructure

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

A new vision of artificial intelligence for the people

In the back room of an old building in New Zealand, one of the most advanced computers for artificial intelligence is helping to redefine…

Latest from MIT Tech Review – A new vision of artificial intelligence for the people

In the back room of an old and graying building in the northernmost region of New Zealand, one of the most advanced computers for artificial intelligence is helping to redefine the technology’s future. Te Hiku Media, a nonprofit Māori radio station run by life partners Peter-Lucas Jones and Keoni Mahelona, bought the machine at a…

Latest from Google AI – Hidden Interfaces for Ambient Computing

Posted by Alex Olwal, Research Scientist, Google Augmented Reality, and Artem Dementyev, Hardware Engineer, Google Research

As consumer electronics and internet-connected appliances are becoming more common, homes are beginning to embrace various types of connected devices that offer functionality like music control, voice assistance, and home automation. A graceful integration of devices requires adaptation to…

Latest from MIT – A new state of the art for unsupervised vision

Labeling data can be a chore. It’s the main source of sustenance for computer-vision models; without it, they’d have a lot of difficulty identifying objects, people, and other important image characteristics. Yet producing just an hour of tagged and labeled data can take a whopping 800 hours of human time. Our high-fidelity understanding of the…

Latest from MIT Tech Review – The gig workers fighting back against the algorithms

In the Bendungan Hilir neighborhood, just a stone’s throw from Jakarta’s glitzy central business district, a long row of makeshift wooden stalls crammed onto the sidewalk serves noodle soup, fried rice, and cigarettes to locals. One place stands out in particular, buzzing with motorcycle drivers clad in green. It’s an informal “base camp,” or meeting…

UC Berkeley – Offline RL Made Easier: No TD Learning, Advantage Reweighting, or Transformers

Figure: A demonstration of the RvS policy we learn with just supervised learning and a depth-two MLP. It uses no TD learning, advantage reweighting, or Transformers!

Offline reinforcement learning (RL) is conventionally approached using value-based methods based on temporal difference (TD) learning. However, many recent algorithms reframe RL as a supervised learning problem. These algorithms learn…

Latest from MIT – Anticipating others’ behavior on the road

Humans may be one of the biggest roadblocks keeping fully autonomous vehicles off city streets. If a robot is going to navigate a vehicle safely through downtown Boston, it must be able to predict what nearby drivers, cyclists, and pedestrians are going to do next. Behavior prediction is a tough problem, however, and current artificial…