Latest from MIT Tech Review – Meta has created a way to watermark AI-generated speech

Meta has created a system that can embed hidden signals, known as watermarks, in AI-generated audio clips, which could help in detecting AI-generated content online.

The tool, called AudioSeal, is the first that can pinpoint which bits of audio in, for example, a full hourlong podcast might have been generated by AI. It could help to tackle the growing problem of misinformation and scams using voice cloning tools, says Hady Elsahar, a research scientist at Meta. Malicious actors have used generative AI to create audio deepfakes of President Joe Biden, and scammers have used deepfakes to blackmail their victims. Watermarks could in theory help social media companies detect and remove unwanted content.

However, there are some big caveats. Meta says it has no plans yet to apply the watermarks to AI-generated audio created using its tools. Audio watermarks are not yet adopted widely, and there is no single agreed industry standard for them. And watermarks for AI-generated content tend to be easy to tamper with—for example, by removing or forging them.

Fast detection, and the ability to pinpoint which elements of an audio file are AI-generated, will be critical to making the system useful, says Elsahar. He says the team achieved between 90% and 100% accuracy in detecting the watermarks, much better results than in previous attempts at watermarking audio.

AudioSeal is available on GitHub for free. Anyone can download it and use it to add watermarks to AI-generated audio clips. It could eventually be overlaid on top of AI audio generation models, so that it is automatically applied to any speech generated using them. The researchers who created it will present their work at the International Conference on Machine Learning in Vienna, Austria, in July.

AudioSeal is built from two neural networks. One generates watermarking signals that can be embedded into audio tracks. These signals are imperceptible to the human ear but can be detected quickly by the other neural network. With existing methods, spotting AI-generated audio in a longer clip means combing through the entire thing in second-long chunks to see if any of them contain a watermark. This is a slow and laborious process, and not practical on social media platforms with millions of minutes of speech.

AudioSeal works differently: by embedding a watermark throughout each section of the entire audio track. This allows the watermark to be “localized,” which means it can still be detected even if the audio is cropped or edited.
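To make the idea of a "localized" watermark concrete, here is a minimal sketch in Python. It does not use AudioSeal's code or API. It assumes, purely for illustration, that some detector has already produced one score between 0 and 1 for each audio sample; the function name `flagged_spans`, the threshold of 0.5, and the toy scores are all hypothetical. The sketch only shows how per-sample scores could be turned into time ranges that mark which parts of a clip appear to be watermarked.

```python
from typing import List, Tuple


def flagged_spans(
    scores: List[float],
    sample_rate: int,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans where scores exceed the threshold.

    `scores` is a hypothetical list of per-sample detector outputs in [0, 1].
    """
    spans = []
    start = None  # index where the current flagged run began, if any
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a flagged run begins here
        elif score <= threshold and start is not None:
            # the run ended just before sample i
            spans.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:
        # the clip ended while a run was still open
        spans.append((start / sample_rate, len(scores) / sample_rate))
    return spans


# Toy input: 10 "samples" at a made-up rate of 2 samples per second.
toy_scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.05, 0.7, 0.9, 0.9]
print(flagged_spans(toy_scores, sample_rate=2))
# prints [(1.0, 2.5), (3.5, 5.0)]
```

Because each sample carries its own score in this sketch, cropping the clip just shortens the list of scores; the remaining samples can still be flagged on their own.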

Ben Zhao, a computer science professor at the University of Chicago, says this ability and the near-perfect detection accuracy make AudioSeal better than any previous audio watermarking system he’s come across.

“It’s meaningful to explore research improving the state of the art in watermarking, especially across mediums like speech that are often harder to mark and detect than visual content,” says Claire Leibowicz, head of AI and media integrity at the nonprofit Partnership on AI.

But there are some major flaws that need to be overcome before these sorts of audio watermarks can be adopted en masse. Meta’s researchers tested different attacks to remove the watermarks and found that the more information is disclosed about the watermarking algorithm, the more vulnerable it is. The system also requires people to voluntarily add the watermark to their audio files.

This places some fundamental limitations on the tool, says Zhao. “Where the attacker has some access to the [watermark] detector, it’s pretty fragile,” he says. And because the detector would therefore have to be kept out of attackers’ hands, only Meta will be able to verify whether audio content is AI-generated or not.

Leibowicz says she remains unconvinced that watermarks will actually further public trust in the information they’re seeing or hearing, despite their popularity as a solution in the tech sector. That’s partly because they are themselves so open to abuse.

“I’m skeptical that any watermark will be robust to adversarial stripping and forgery,” she adds.
