Researchers from Apple are probing whether it’s possible to use artificial intelligence to detect when a user is speaking to a device like an iPhone, thereby eliminating the technical need for a trigger phrase like “Siri,” according to a paper published on Friday.

In a study uploaded to arXiv, which has not been peer-reviewed, researchers trained a large language model on both speech captured by smartphones and acoustic data from background noise, looking for patterns that could indicate when users want help from the device. The model was built in part with a version of OpenAI’s GPT-2, “since it is relatively lightweight and can potentially run on devices such as smartphones,” the researchers wrote. The paper describes over 129 hours of data and additional text data used to train the model, but does not specify the source of the recordings in the training set. Six of the seven authors list their affiliation as Apple, and three of them work on the company’s Siri team, according to their LinkedIn profiles. (The seventh author did work related to the paper during an Apple internship.)
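To make the approach concrete, here is a minimal, hypothetical sketch of a device-directed-speech classifier that feeds projected audio features and text tokens into a GPT-2 backbone, broadly in the spirit of what the paper describes. The fusion scheme, feature dimensions, and training setup are assumptions for illustration, not Apple’s actual implementation.

```python
# Hypothetical sketch (not Apple's code): a classifier that combines audio features
# with text tokens in a GPT-2 backbone to score whether speech is directed at the device.
# Feature sizes, the prefix-fusion scheme, and the classification head are assumptions.
import torch
import torch.nn as nn
from transformers import GPT2Model, GPT2Tokenizer

class DirectedSpeechClassifier(nn.Module):
    def __init__(self, audio_feat_dim=80):
        super().__init__()
        self.gpt2 = GPT2Model.from_pretrained("gpt2")        # lightweight text backbone
        hidden = self.gpt2.config.n_embd                      # 768 for base GPT-2
        self.audio_proj = nn.Linear(audio_feat_dim, hidden)   # map audio frames into GPT-2's embedding space
        self.head = nn.Linear(hidden, 1)                      # binary: directed at the device or not

    def forward(self, audio_feats, input_ids, attention_mask):
        # audio_feats: (batch, frames, audio_feat_dim), e.g. log-mel features
        audio_emb = self.audio_proj(audio_feats)
        text_emb = self.gpt2.wte(input_ids)                   # GPT-2 token embeddings
        fused = torch.cat([audio_emb, text_emb], dim=1)       # prepend audio as a prefix
        audio_mask = torch.ones(audio_feats.shape[:2], dtype=attention_mask.dtype)
        mask = torch.cat([audio_mask, attention_mask], dim=1)
        out = self.gpt2(inputs_embeds=fused, attention_mask=mask).last_hidden_state
        return self.head(out[:, -1])                          # score taken from the final position

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
enc = tokenizer(["set a timer for ten minutes"], return_tensors="pt", padding=True)
model = DirectedSpeechClassifier()
logit = model(torch.randn(1, 50, 80), enc["input_ids"], enc["attention_mask"])
print(torch.sigmoid(logit))  # probability the utterance was meant for the device
```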

The results were promising, according to the paper. The model made more accurate predictions than audio-only or text-only models, and its performance improved further as the models grew larger. Beyond exploring the research question, it’s unclear whether Apple plans to eliminate the “Hey Siri” trigger phrase.

Neither Apple nor the paper’s researchers immediately returned requests for comment.

Currently, Siri functions by holding small amounts of audio and does not begin recording or preparing to answer user prompts until it hears the trigger phrase. Eliminating that “Hey Siri” prompt could increase concerns about our devices “always listening,” said Jen King, a privacy and data policy fellow at the Stanford Institute for Human-Centered Artificial Intelligence.
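For readers curious about the mechanism, here is an illustrative sketch, not Apple’s implementation, of the kind of rolling buffer a wake-word system keeps: a few seconds of audio are held and continuously discarded until a trigger is detected, and only then does a listening session begin. The frame values, buffer length, and detector below are placeholders.

```python
# Illustrative sketch only (not Apple's implementation): a rolling audio buffer that
# keeps a couple of seconds of recent frames and starts a session only when a
# placeholder wake-word detector fires. All names and sizes here are assumptions.
from collections import deque

FRAME_SECONDS = 0.5
BUFFER_SECONDS = 2.0  # only a short window of audio is ever held

ring_buffer = deque(maxlen=int(BUFFER_SECONDS / FRAME_SECONDS))

def detects_trigger(frames):
    """Placeholder for an on-device wake-word model scoring the recent audio."""
    return any(f == "hey-siri-frame" for f in frames)

def start_listening_session(context_audio):
    print("Trigger heard; begin handling the user's request.")

def process_frame(frame):
    ring_buffer.append(frame)          # older frames fall out of the buffer automatically
    if detects_trigger(ring_buffer):   # only then does the assistant start a session
        start_listening_session(list(ring_buffer))

# Feed a few dummy half-second frames through the buffer.
for f in ["noise-1", "noise-2", "noise-3", "hey-siri-frame"]:
    process_frame(f)
```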

The way Apple handles audio data has previously come under scrutiny from privacy advocates. In 2019, reporting from The Guardian revealed that Apple’s quality control contractors regularly heard private audio collected from iPhones while they worked with Siri data, including sensitive conversations between doctors and patients. Two years later, Apple responded with policy changes, including storing more data on devices and allowing users to opt out of having their recordings used to improve Siri. A class action suit brought against the company in California in 2021 alleged that Siri was being turned on even when users had not activated it.

The “Hey Siri” prompt can serve an important purpose for users, according to King. The phrase provides a way to know when the device is listening, and getting rid of it might mean more convenience but less transparency from the device, King told MIT Technology Review. The research does not detail whether the trigger phrase would be replaced by any other signal that the AI assistant is engaged.

“I’m skeptical that a company should mandate that form of interaction,” King says.

The paper is one of a number of recent signals that Apple, which is perceived to be lagging behind other tech giants like Amazon, Google, and Facebook in the artificial intelligence race, is planning to incorporate more AI into its products. According to news first reported by VentureBeat, Apple is building a generative AI model called MM1 that can work with both text and images, which would be the company’s answer to OpenAI’s ChatGPT and a host of other chatbots from leading tech giants. Meanwhile, Bloomberg reported that Apple is in talks with Google about using that company’s AI model Gemini in iPhones, and on Friday the Wall Street Journal reported that Apple had also engaged in talks with Baidu about using that company’s AI products.
