Latest from MIT Tech Review – Meta’s new learning algorithm can teach AI to multi-task

If you can recognize a dog by sight, then you can probably recognize a dog when it is described to you in words. Not so for today’s artificial intelligence. Deep neural networks have become very good at identifying objects in photos and conversing in natural language, but not at the same time: there are AI models that excel at one or the other, but not both.

Part of the problem is that these models learn different skills using different techniques. This is a major obstacle to the development of more general-purpose AI: machines that can multi-task and adapt. It also means that advances in deep learning for one skill often do not transfer to others.

A team at Meta AI (previously Facebook AI Research) wants to change that. The researchers have developed a single algorithm that can be used to train a neural network to recognize images, text, or speech. The algorithm, called Data2vec, not only unifies the learning process but performs at least as well as existing techniques in all three skills. “We hope it will change the way people think about doing this type of work,” says Michael Auli, a researcher at Meta AI.

The research builds on an approach known as self-supervised learning, in which neural networks learn to spot patterns in data sets by themselves, without being guided by labeled examples. This is how large language models like GPT-3 learn from vast bodies of unlabeled text scraped from the internet, and it has driven many of the recent advances in deep learning.

Auli and his colleagues at Meta AI had been working on self-supervised learning for speech recognition. But when they looked at what other researchers were doing with self-supervised learning for images and text, they realized that they were all using different techniques to chase the same goals.

Data2vec uses two neural networks, a student and a teacher. First, the teacher network is trained on images, text, or speech in the usual way, learning an internal representation of this data that allows it to predict what it is seeing when shown new examples. When it is shown a photo of a dog, it recognizes it as a dog.

The twist is that the student network is then trained to predict the internal representations of the teacher. In other words, it is trained not to guess that it is looking at a photo of a dog when shown a dog, but to guess what the teacher sees when shown that image.

Because the student does not try to guess the actual image or sentence but, rather, the teacher’s representation of that image or sentence, the algorithm does not need to be tailored to a particular type of input.
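The student-teacher idea described above can be illustrated with a toy sketch. It is not Meta's Data2vec code. It assumes a frozen teacher that is just a fixed random projection (a stand-in for an already-trained network), a student of the same shape, and made-up sizes and learning rate. The point it shows is that the student's training target is the teacher's internal representation, so the loss never refers to what kind of data the input vectors came from.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_samples, n_features, n_hidden = 256, 16, 4

# Any modality works once it is a feature vector
# (pixel patches, token embeddings, audio frames, ...).
inputs = rng.normal(size=(n_samples, n_features))

# Teacher: frozen weights standing in for an already-trained network.
teacher_weights = rng.normal(scale=0.25, size=(n_features, n_hidden))


def teacher_representation(x):
    return np.tanh(x @ teacher_weights)


# Student: same shape, starts from zero, learns to match the teacher.
student_weights = np.zeros((n_features, n_hidden))


def student_representation(x):
    return np.tanh(x @ student_weights)


def representation_loss(x):
    # Mean squared error between student and teacher representations.
    diff = student_representation(x) - teacher_representation(x)
    return float(np.mean(diff ** 2))


initial_loss = representation_loss(inputs)

learning_rate = 0.2
targets = teacher_representation(inputs)
for _ in range(500):
    pred = student_representation(inputs)
    # Gradient of the mean squared error, back through the tanh.
    grad_pre = 2.0 * (pred - targets) * (1.0 - pred ** 2) / pred.size
    student_weights -= learning_rate * (inputs.T @ grad_pre)

final_loss = representation_loss(inputs)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Nothing in the loss mentions dogs, words, or sounds. Swapping in a different kind of input only changes how `inputs` is built, not how the student is trained.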

Data2vec is part of a big trend in AI toward models that can learn to understand the world in more than one way. “It’s a clever idea,” says Ani Kembhavi at the Allen Institute for AI in Seattle, who works on vision and language. “It’s a promising advance when it comes to generalized systems for learning.”

An important caveat is that although the same learning algorithm can be used for different skills, it can only learn one skill at a time. Once it has learned to recognize images, it must start from scratch to learn to recognize speech. Giving an AI multiple skills at once is hard, but that’s something the Meta AI team wants to look at next.

The researchers were surprised to find that their approach actually performed better than existing techniques at recognizing images and speech, and performed as well as leading language models on text understanding.

Mark Zuckerberg is already dreaming up potential metaverse applications. “This will all eventually get built into AR glasses with an AI assistant,” he posted to Facebook today. “It could help you cook dinner, noticing if you miss an ingredient, prompting you to turn down the heat, or more complex tasks.”

For Auli, the main takeaway is that researchers should step out of their silos. “Hey, you don’t need to focus on one thing,” he says. “If you have a good idea, it might actually help across the board.”
