Latest from MIT Tech Review – Google DeepMind’s new generative model makes Super Mario-like games from scratch

OpenAI’s recent reveal of its stunning generative model Sora pushed the envelope of what’s possible with text-to-video. Now Google DeepMind brings us text-to-video games.

The new model, called Genie, can take a short description, a hand-drawn sketch or a photo and turn it into a playable video game in the style of classic 2D platformers like Super Mario Bros. But don’t expect anything fast-paced. The games run at one frame per second, compared to the typical 30-60 frames per second of most modern games.

“It’s cool work,” says Matthew Guzdial, an AI researcher at the University of Alberta, who developed a similar game generator a few years ago.

Genie was trained on 30,000 hours of video of hundreds of 2D platform games taken from the internet. Others have taken that approach before, says Guzdial. His own game generator learned from videos to create abstract platformers. Nvidia used video data to train a model called GameGAN, which could produce clones of games like Pac-Man.

But all of these examples trained the model on input actions (button presses on a game controller) as well as video footage: a video frame showing Mario jumping was paired with the “jump” action, and so on. Tagging video footage with input actions takes a lot of work, however, and this has limited the amount of training data available.

In contrast, Genie was trained on video footage alone. It then learned which of eight possible actions would cause the game character in a video to change its position. This turned countless hours of existing online video into potential training data.
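To make the idea of labeling actions from video alone more concrete, here is a minimal sketch. It is not Genie's method, which the article does not describe in detail. It assumes the character's position in each frame is already known and simply gives each distinct movement its own label, up to the eight actions mentioned above. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple

MAX_ACTIONS = 8  # the article says Genie chose among eight possible actions

Position = Tuple[int, int]  # (x, y) of the character in one frame (assumed known)


def infer_latent_actions(
    positions: List[Position],
) -> Tuple[List[int], Dict[Tuple[int, int], int]]:
    """Assign an action label to every frame-to-frame transition.

    No controller data is used: a label is just an index for a distinct
    movement observed between two consecutive frames.
    """
    codebook: Dict[Tuple[int, int], int] = {}
    labels: List[int] = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        delta = (x1 - x0, y1 - y0)
        if delta not in codebook:
            if len(codebook) >= MAX_ACTIONS:
                raise ValueError("more distinct movements than action labels")
            codebook[delta] = len(codebook)  # next unused label
        labels.append(codebook[delta])
    return labels, codebook


# Character positions read off five consecutive frames of a video.
track = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1)]
labels, codebook = infer_latent_actions(track)
print(labels)
```

The labels carry no names such as “jump” or “left”. They only group transitions that look alike, which is why unlabeled video is enough for this kind of training signal.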

Genie can generate simple games from hand-drawn sketches. (Image: Google DeepMind)

Genie generates each new frame of the game on the fly depending on the action the player takes. Press jump and Genie updates the current image to show the game character jumping; press left and the image changes to show the character moved to the left. The game ticks along action by action, each new frame generated from scratch as the player plays.
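The action-by-action loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions: `toy_next_frame` is a hand-written stand-in for the learned model, the frame is a single row of text, and the action names are hypothetical.

```python
from typing import List

Frame = List[str]  # a frame is a list of text rows; "@" marks the character

# Hypothetical action names mapped to horizontal movement.
ACTIONS = {"left": -1, "right": 1, "stay": 0}


def toy_next_frame(frame: Frame, action: str) -> Frame:
    """Stand-in for a learned model: build the next frame from the
    current frame and the player's action."""
    row = frame[0]
    pos = row.index("@")
    # Move the character, keeping it inside the frame.
    new_pos = max(0, min(len(row) - 1, pos + ACTIONS[action]))
    cells = ["."] * len(row)
    cells[new_pos] = "@"
    return ["".join(cells)]


def play(start: Frame, actions: List[str]) -> List[Frame]:
    """Advance the game one action at a time, generating each new frame
    from the previous one."""
    frames = [start]
    for action in actions:
        frames.append(toy_next_frame(frames[-1], action))
    return frames


for frame in play(["@...."], ["right", "right", "left"]):
    print(frame[0])
```

Each call produces a complete new frame and nothing is precomputed, so the speed of the game is bounded by how quickly the model can generate a single frame.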

Future versions of Genie could run faster. “There is no fundamental limitation that prevents us from reaching 30 frames per second,” says Tim Rocktäschel, a research scientist at Google DeepMind who leads the team behind the work. “Genie uses many of the same technologies as contemporary large language models, where there has been significant progress in improving inference speed.”

Genie learned some common visual quirks found in platformers. Many games of this type use parallax, where the foreground moves sideways faster than the background. Genie often adds this effect to the games it generates.
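For readers unfamiliar with parallax, the effect is usually produced in hand-built games by scrolling each layer at a fraction of the camera's movement. The sketch below shows that conventional technique, not how Genie produces it. The layer names and scroll factors are made-up values.

```python
from typing import Dict


def layer_offsets(camera_x: float, factors: Dict[str, float]) -> Dict[str, float]:
    """Return how far each layer has scrolled for a given camera position.

    A factor of 1.0 moves with the camera; smaller factors move more
    slowly, which makes a layer look farther away.
    """
    return {name: camera_x * factor for name, factor in factors.items()}


# Hypothetical layers: the background scrolls at a quarter of the foreground's speed.
factors = {"background": 0.25, "foreground": 1.0}
print(layer_offsets(100, factors))
```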

While Genie is an in-house research project and won’t be released, Guzdial notes that the Google DeepMind team says it could one day be turned into a game-making tool, something he’s working on too. “I’m definitely interested to see what they build,” he says.

Virtual playgrounds

But the Google DeepMind researchers are interested in more than just game generation. The team behind Genie works on open-ended learning, where AI-controlled bots are dropped into a virtual environment and left to learn how to solve various tasks by trial and error (a technique known as reinforcement learning).

In 2021, the team developed a virtual playground called XLand, in which bots learned how to cooperate to solve simple tasks such as moving obstacles. Virtual environments like XLand will be crucial for training future bots on a range of different challenges before pitting them against real-world scenarios. The video game example shows that Genie can produce these virtual sandboxes for bots to play in.

Others have developed similar world-building tools. For example, David Ha at Google Brain and Jürgen Schmidhuber at the AI lab IDSIA in Switzerland developed a tool in 2018 that trained bots in game-based virtual environments called world models. But, again, unlike Genie, these required the training data to include input actions.

The team demonstrated that the ability to learn actions from video alone is useful in robotics too. By showing Genie videos of real robot arms manipulating a variety of household objects, the model learned what actions the arm could perform and how to control it. Future robots could learn new tasks by watching video tutorials.

“It is hard to predict what use cases will be enabled,” says Rocktäschel. “We hope projects like Genie will eventually provide people with new tools to express their creativity.”
