Latest from MIT Tech Review – AI just beat a human test for creativity. What does that even mean?

AI is getting better at passing tests designed to measure human creativity. In a study published today in the Nature Portfolio journal Scientific Reports, AI chatbots achieved higher average scores than humans in the Alternate Uses Task, a test commonly used to assess this ability.

This study will add fuel to an ongoing debate among AI researchers about what it even means for a computer to pass tests devised for humans. The findings do not necessarily indicate that AIs are developing an ability to do something uniquely human. It could just be that AIs can pass creativity tests without being creative in the way we understand the term. However, research like this might give us a better understanding of how humans and machines approach creative tasks.

Researchers started by asking three AI chatbots—OpenAI’s ChatGPT and GPT-4 as well as Copy.Ai, which is built on GPT-3—to come up with as many uses for a rope, a box, a pencil, and a candle as possible within just 30 seconds.

Their prompts instructed the large language models to come up with original and creative uses for each of the items, explaining that the quality of the ideas was more important than the quantity. Each chatbot was tested 11 times for each of the four objects. The researchers also gave 256 human participants the same instructions.

The researchers used two methods to assess both AI and human responses. The first was an algorithm that rated how close each suggested use was to the object’s original purpose. The second involved asking six human assessors (who were unaware that some of the answers had been generated by AI systems) to rate each response for creativity and originality on a scale of 1 to 5, with 1 being not at all and 5 being very. Average scores for both humans and AIs were then calculated.

Although the chatbots’ responses were rated as better than the humans’ on average, the best-scoring human responses were rated higher than the best chatbot responses.
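To see how a group can win on average yet lose on its best response, here is a minimal sketch of the rating aggregation described above. The `ratings` values are hypothetical toy numbers, not data from the study, and the `summarize` helper is an illustrative assumption about how six assessor scores per response might be combined.

```python
from statistics import mean

# Hypothetical toy ratings, NOT data from the study: each response
# gets six assessor scores on the 1-to-5 scale described above.
ratings = {
    "human": [[2, 3, 2, 3, 2, 3], [5, 5, 4, 5, 5, 5], [1, 2, 1, 2, 1, 2]],
    "chatbot": [[3, 4, 3, 4, 3, 4], [4, 4, 3, 4, 4, 4], [3, 3, 4, 3, 3, 4]],
}

def summarize(responses):
    """Return (group mean, best response score).

    Each response's score is the mean of its assessor ratings.
    """
    per_response = [mean(scores) for scores in responses]
    return mean(per_response), max(per_response)

summary = {group: summarize(rs) for group, rs in ratings.items()}

for group, (avg, best) in summary.items():
    print(f"{group}: mean={avg:.2f}, best={best:.2f}")
```

In this toy data the chatbot group has the higher mean while the single best response belongs to the human group, which is the pattern the study reports. A mean summarizes the typical response and says nothing about the top of the distribution.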

While the purpose of the study was not to prove that AI systems are capable of replacing humans in creative roles, it raises philosophical questions about the characteristics that are unique to humans, says Simone Grassini, an associate professor of psychology at the University of Bergen, Norway, who co-led the research.

“We’ve shown that in the past few years, technology has taken a very big leap forward when we talk about imitating human behavior,” he says. “These models are continuously evolving.”

Proving that machines can perform well in tasks designed for measuring creativity in humans doesn’t demonstrate that they’re capable of anything approaching original thought, says Ryan Burnell, a senior research associate at the Alan Turing Institute, who was not involved with the research.

The chatbots that were tested are “black boxes,” meaning that we don’t know exactly what data they were trained on, or how they generate their responses, he says. “What’s very plausibly happening here is that a model wasn’t coming up with new creative ideas—it was just drawing on things it’s seen in its training data, which could include this exact Alternate Uses Task,” he explains. “In that case, we’re not measuring creativity. We’re measuring the model’s past knowledge of this kind of task.”

That doesn’t mean that it’s not still useful to compare how machines and humans approach certain problems, says Anna Ivanova, an MIT postdoctoral researcher studying language models, who did not work on the project.

However, we should bear in mind that although chatbots are very good at completing specific requests, slight tweaks like rephrasing a prompt can be enough to stop them from performing as well, she says. Ivanova believes that these kinds of studies should prompt us to examine the link between the task we’re asking AI models to complete and the cognitive capacity we’re trying to measure. “We shouldn’t assume that people and models solve problems in the same way,” she says.
