Latest from MIT Tech Review – AI that makes images: 10 Breakthrough Technologies 2023

OpenAI introduced a world of weird and wonderful mash-ups when its text-to-image model DALL-E was released in 2021. Type in a short description of pretty much anything, and the program spat out a picture of what you asked for in seconds. DALL-E 2, unveiled in April 2022, was a massive leap forward. Google also launched its own image-making AI, called Imagen.

Yet the biggest game-changer was Stable Diffusion, an open-source text-to-image model released for free by UK-based startup Stability AI in August 2022. Not only could Stable Diffusion produce some of the most stunning images yet, but it was designed to run on a (good) home computer.

By making text-to-image models accessible to all, Stability AI poured fuel on what was already an inferno of creativity and innovation. Millions of people have created tens of millions of images in just a few months. But there are problems, too. Artists are caught in the middle of one of the biggest upheavals in a decade. And, just like language models, text-to-image generators can amplify the biased and toxic associations buried in training data scraped from the internet.

The tech is now being built into commercial software, such as Photoshop. Visual-effects artists and video-game studios are exploring how it can fast-track development pipelines. And text-to-image technology has already advanced to text-to-video. The AI-generated video clips demoed by Google, Meta, and others in the last few months are only seconds long, but that will change. One day movies could be made just by feeding a script into a computer.

Nothing else in AI grabbed people’s attention more last year—for the best and worst reasons. Now we wait to see what lasting impact these tools will have on creative industries—and the entire field of AI.

No one knows where the rise of generative AI will leave us.
