Latest from MIT Tech Review – OpenAI released its advanced voice mode to more people. Here’s how to get it.

OpenAI is broadening access to Advanced Voice Mode, a feature of ChatGPT that allows you to speak more naturally with the AI model. It allows you to interrupt its responses mid-sentence, and can also sense and interpret your emotions based on your tone of voice and adjust its responses accordingly.

These features were teased back in May when OpenAI unveiled GPT-4o but they were not released until July—and then just to an invite-only group. (At least initially, there seem to have been some safety issues with the model; OpenAI gave several WIRED reporters access to the voice mode back in May, but the magazine reported the company “pulled it the next morning, citing safety concerns.”) Users who’ve been able to try it have largely described the model as an impressively fast, dynamic, and realistic voice assistant—which has made its limited access particularly frustrating to some other OpenAI users.

Today is the first time OpenAI has promised to bring the new voice mode to a wide portion of users—here’s what you need to know.

What can it do?

Though ChatGPT currently offers a standard voice mode to paid users, its interactions can be clunky. In the mobile app, for example, you can’t interrupt the model’s often long-winded responses with your voice, only with a tap on the screen. The new version fixes that, and also promises to modify its responses based on the emotion it senses in your voice. As with other versions of ChatGPT, users can personalize the voice mode by asking the model to remember facts about themselves. The new mode has also improved its pronunciation of words in non-English languages.

AI investor Allie Miller posted a demo of the tool in August, which highlighted many of the same strengths as OpenAI’s own release videos: the model is fast and adept at changing its accent, tone, and content to match your needs.

I’m testing the new @OpenAI Advanced Voice Mode and I just snorted with laughter.

In a good way.

Watch the whole thing pic.twitter.com/vSOMzXdwZo

— Allie K. Miller (@alliekmiller) August 2, 2024

The update also adds new voices. Shortly after the launch of GPT-4o, OpenAI was criticized for the similarity between the female voice in its demo videos, named Sky, and that of Scarlett Johansson, who played an AI love interest in the movie Her. OpenAI then removed the voice. Now, it has launched five new voices, named Arbor, Maple, Sol, Spruce, and Vale, which will be available in both the standard and advanced voice modes. MIT Technology Review has not heard them yet, but OpenAI says they were made using professional voice actors from around the world. “We interviewed dozens of actors to find those with the qualities of voices we feel people will enjoy talking to for hours—warm, approachable, inquisitive, with some rich texture and tone,” a company spokesperson says.

Who can access it and when?

For now, OpenAI is rolling out access to Advanced Voice Mode to Plus users, who pay $20 per month for a premium version, and Team users, who pay $30 per month and have higher message limits. The next group to receive access will be those in Enterprise and Edu tiers. The exact timing, though, is vague; an OpenAI spokesperson says the company will “gradually roll out access to all Plus and Team users and will roll out to Enterprise and Edu tiers starting next week.” The company hasn’t committed to a firm deadline for when all users in these categories will have access. A message in the ChatGPT app indicates that all Plus users will have access by “the end of fall.”

There are geographic limitations. The new feature is not yet available in the EU, the UK, Switzerland, Iceland, Norway, or Liechtenstein.

There is no immediate plan to release Advanced Voice Mode to free users. (The standard mode remains available to all paid users.)

What steps have been taken to make sure it’s safe?

As the company noted upon the initial release in July and again emphasized this week, Advanced Voice Mode has been safety-tested by external experts “who collectively speak a total of 45 different languages, and represent 29 different geographies.” The GPT-4o system card details how the underlying model handles issues like generating violent or erotic speech, imitating people’s voices without their consent, or generating copyrighted content.

Still, OpenAI’s models are not open-source. Compared with open-source models, which are more transparent about their training data and the “model weights” that govern how the AI produces responses, OpenAI’s closed-source models are harder for independent researchers to evaluate from the perspective of safety, bias, and harm.

