When Greg unboxed a new Roomba robot vacuum cleaner in December 2019, he thought he knew what he was getting into.
He would allow the preproduction test version of iRobot’s Roomba J series device to roam around his house, let it collect all sorts of data to help improve its artificial intelligence, and provide feedback to iRobot about his user experience.
He had done this all before. Outside of his day job as an engineer at a software company, Greg had been beta-testing products for the past decade. He estimates that he’s tested over 50 products in that time—everything from sneakers to smart home cameras.
“I really enjoy it,” he says. “The whole idea is that you get to learn about something new, and hopefully be involved in shaping the product, whether it’s making a better-quality release or actually defining features and functionality.”
But what Greg didn’t know—and does not believe he consented to—was that iRobot would share test users’ data in a sprawling, global data supply chain, where everything (and every person) captured by the devices’ front-facing cameras could be seen, and perhaps annotated, by low-paid contractors outside the United States who could screenshot and share images at their will.
Greg, who asked that we identify him only by his first name because he signed a nondisclosure agreement with iRobot, is not the only test user who feels dismayed and betrayed.
Nearly a dozen people who participated in iRobot’s data collection efforts between 2019 and 2022 have come forward in the weeks since MIT Technology Review published an investigation into how the company uses images captured from inside real homes to train its artificial intelligence. The participants have shared similar concerns about how iRobot handled their data—and whether those practices conform with the company’s own data protection promises. After all, the agreements go both ways, and whether or not the company legally violated its promises, the participants feel misled.
“There is a real concern about whether the company is being deceptive if people are signing up for this sort of highly invasive type of surveillance and never fully understand … what they’re agreeing to,” says Albert Fox Cahn, the executive director of the Surveillance Technology Oversight Project.
The company’s failure to adequately protect test user data feels like “a clear breach of the agreement on their side,” Greg says. It’s “a failure … [and] also a violation of trust.”
Now, he wonders, “where is the accountability?”
The blurry line between testers and consumers
Last month MIT Technology Review revealed how iRobot collects photos and videos from the homes of test users and employees and shares them with data annotation companies, including San Francisco–based Scale AI, which hire far-flung contractors to label the data that trains the company’s artificial-intelligence algorithms.
We found that in one 2020 project, gig workers in Venezuela were asked to label objects in a series of images of home interiors, some of which included individuals—their faces visible to the data annotators. These workers then shared at least 15 images—including shots of a minor and of a woman sitting on the toilet—to social media groups where they gathered to talk shop. We know about these particular images because the screenshots were subsequently shared with us, but our interviews with data annotators and researchers who study data annotation suggest they are unlikely to be the only ones that made their way online; it’s not uncommon for sensitive images, videos, and audio to be shared with labelers.
Shortly after MIT Technology Review contacted iRobot for comment on the photos last fall, the company terminated its contract with Scale AI.
Nevertheless, in a LinkedIn post in response to our story, iRobot CEO Colin Angle said the mere fact that these images, and the faces of test users, were visible to human gig workers was not a reason for concern. Rather, he wrote, making such images available was actually necessary to train iRobot’s object recognition algorithms: “How do our robots get so smart? It starts during the development process, and as part of that, through the collection of data to train machine learning algorithms.” Besides, he pointed out, the images came not from customers but from “paid data collectors and employees” who had signed consent agreements.
In the LinkedIn post and in statements to MIT Technology Review, Angle and iRobot have repeatedly emphasized that no customer data was shared and that “participants are informed and acknowledge how the data will be collected.”
This attempt to clearly delineate between customers and beta testers—and how those people’s data will be treated—has been confounding to many testers, who say they consider themselves part of iRobot’s broader community and feel that the company’s comments are dismissive. Greg and the other testers who reached out also strongly dispute any implication that by volunteering to test a product, they have signed away all their privacy.
What’s more, the line between tester and consumer is not so clear cut. At least one of the testers we spoke with enjoyed his test Roomba so much that he later purchased the device.
This is not an anomaly; rather, converting beta testers to customers and evangelists for the product is something Centercode, the company that recruited the participants on behalf of iRobot, actively tries to promote: “It’s hard to find better potential brand ambassadors than in your beta tester community. They’re a great pool of free, authentic voices that can talk about your launched product to the world, and their (likely techie) friends,” it wrote in a marketing blog post.
To Greg, iRobot has “failed spectacularly” in its treatment of the testing community, particularly in its silence over the privacy breach. iRobot says it has notified individuals whose photos appeared in the set of 15 images, but it did not respond to a question about whether it would notify other individuals who had taken part in its data collection. The participants who reached out to us said they have not received any kind of notice from the company.
“If your credit card information … was stolen at Target, Target doesn’t notify the one person who has the breach,” he adds. “They send out a notification that there was a breach, this is what happened, [and] this is how they’re handling it.”
Inside the world of beta testing
The journey of iRobot’s AI-powering data points starts on testing platforms like Betabound, which is run by Centercode. The technology company, based in Laguna Hills, California, recruits volunteers to test out products and services for its clients—primarily consumer tech companies. (iRobot spokesperson James Baussmann confirmed that the company has used Betabound but said that “not all of the paid data collectors were recruited via Betabound.” Centercode did not respond to multiple requests for comment.)
As early adopters, beta testers are often more tech savvy than the average consumer. They are enthusiastic about gadgets and, like Greg, sometimes work in the technology sector themselves—so they are often well aware of the standards around data protection.
A review of all 6,200 test opportunities listed on Betabound’s website as of late December shows that iRobot has been testing on the platform since at least 2017. The latest project, which is specifically recruiting German testers, started just last month.
iRobot’s vacuums are far from the only devices in their category. There are over 300 tests listed for other “smart” devices powered by AI, including “a smart microwave with Alexa support,” as well as multiple other robot vacuums.
The first step for potential testers is to fill out a profile on the Betabound website. They can then apply for specific opportunities as they’re announced. If accepted by the company running the test, testers sign numerous agreements before they are sent the devices.
Betabound testers are not paid, as the platform’s FAQ for testers notes: “Companies cannot expect your feedback to be honest and reliable if you’re being paid to give it.” Rather, testers might receive gift cards, a chance to keep their test devices free of charge, or complimentary production versions delivered after the device they tested goes to market.
iRobot, however, did not allow testers to keep their devices, nor did testers receive final products. Instead, the beta testers told us that they received gift cards in amounts ranging from $30 to $120 for running the robot vacuums multiple times a week over multiple weeks. (Baussmann says that “with respect to the amount paid to participants, it varies depending upon the work involved.”)
For some testers, this compensation was disappointing—“even before considering … my naked ass could now be on the Internet,” as B, a tester we’re identifying only by his first initial, wrote in an email. He called iRobot “cheap bastards” for the $30 gift card that he received for his data, collected daily over three months.
What users are really agreeing to
When MIT Technology Review reached out to iRobot for comment on the set of 15 images last fall, the company emphasized that each image had a corresponding consent agreement. It would not, however, share the agreements with us, citing “legal reasons.” Instead, the company said the agreement required an “acknowledgment that video and images are being captured during cleaning jobs” and that “the agreement encourages paid data collectors to remove anything they deem sensitive from any space the robot operates in, including children.”
Test users have since shared with MIT Technology Review copies of their agreement with iRobot. These include several different forms—including a general Betabound agreement and a “global test agreement for development robots,” as well as agreements on nondisclosure, test participation, and product loan. There are also agreements for some of the specific tests being run.
The text of iRobot’s global test agreement from 2019, copied into a new document to protect the identity of test users.
The forms do contain the language iRobot previously laid out, while also spelling out the company’s own commitments on data protection toward test users. But they provide little clarity on what exactly that means, especially how the company will handle user data after it’s collected and whom the data will be shared with.
The “global test agreement for development robots,” similar versions of which were independently shared by a half-dozen individuals who signed them between 2019 and 2022, contains the bulk of the information on privacy and consent.
In the short document of roughly 1,300 words, iRobot notes that it is the controller of information, which comes with legal responsibilities under the EU’s GDPR to ensure that data is collected for legitimate purposes and securely stored and processed. Additionally, it states, “iRobot agrees that third-party vendors and service providers selected to process [personal information] will be vetted for privacy and data security, will be bound by strict confidentiality, and will be governed by the terms of a Data Processing Agreement,” and that users “may be entitled to additional rights under applicable privacy laws where [they] reside.”
It’s this section of the agreement that Greg believes iRobot breached. “Where in that statement is the accountability that iRobot is proposing to the testers?” he asks. “I completely disagree with how offhandedly this is being responded to.”
What’s more, all test participants had to agree that their data could be used for machine learning and object detection training. Specifically, the global test agreement’s section on “use of research information” required an acknowledgment that “text, video, images, or audio … may be used by iRobot to analyze statistics and usage data, diagnose technology problems, enhance product performance, product and feature innovation, market research, trade presentations, and internal training, including machine learning and object detection.”
What isn’t spelled out here is that iRobot carries out the machine-learning training through human data labelers who teach the algorithms, click by click, to recognize the individual elements captured in the raw data. In other words, the agreements shared with us never explicitly mention that personal images will be seen and analyzed by other humans.
Baussmann, iRobot’s spokesperson, said that the language we highlighted “covers a variety of testing scenarios” and is not specific to images sent for data annotation. “For example, sometimes testers are asked to take photos or videos of a robot’s behavior, such as when it gets stuck on a certain object or won’t completely dock itself, and send those photos or videos to iRobot,” he wrote, adding that “for tests in which images will be captured for annotation purposes, there are specific terms that are outlined in the agreement pertaining to that test.”
He also wrote that “we cannot be sure the people you have spoken with were part of the development work that related to your article,” though he notably did not dispute the veracity of the global test agreement, which ultimately allows all test users’ data to be collected and used for machine learning.
What users really understand
When we asked privacy lawyers and scholars to review the consent agreements and shared with them the test users’ concerns, they saw the documents and the privacy violations that ensued as emblematic of a broken consent framework that affects us all—whether we are beta testers or regular consumers.
Experts say companies are well aware that people rarely read privacy policies closely, if they read them at all. But what iRobot’s global test agreement attests to, says Ben Winters, a lawyer with the Electronic Privacy Information Center who focuses on AI and human rights, is that “even if you do read it, you still don’t get clarity.”
Rather, “a lot of this language seems to be designed to exempt the company from applicable privacy laws, but none of it reflects the reality of how the product operates,” says Cahn, pointing to the robot vacuums’ mobility and the impossibility of controlling where potentially sensitive people or objects—in particular children—are at all times in their own home.
Ultimately, that “place[s] much of the responsibility … on the end user,” notes Jessica Vitak, an information scientist at the University of Maryland’s College of Information Studies who studies best practices in research and consent policies. Yet it doesn’t give them a true accounting of “how things might go wrong,” she says—“which would be very valuable information when deciding whether to participate.”
Not only does it put the onus on the user; it also leaves it to that single person to “unilaterally affirm the consent of every person within the home,” explains Cahn, even though “everyone who lives in a house that uses one of these devices will potentially be put at risk.”
All of this lets the company shirk its true responsibility as a data controller, adds Deirdre Mulligan, a professor in the School of Information at UC Berkeley. “A device manufacturer that is a data controller” can’t simply “offload all responsibility for the privacy implications of the device’s presence in the home to an employee” or other volunteer data collectors.
Some participants did admit that they hadn’t read the consent agreement closely. “I skimmed the [terms and conditions] but didn’t notice the part about sharing *video and images* with a third party—that would’ve given me pause,” one tester, who used the vacuum for three months last year, wrote in an email.
Before testing his Roomba, B said, he had “perused” the consent agreement and “figured it was a standard boilerplate: ‘We can do whatever the hell we want with what we collect, and if you don’t like that, don’t participate [or] use our product.’” He added, “Admittedly, I just wanted a free product.”
Still, B expected that iRobot would offer some level of data protection—not that the “company that made us swear up and down with NDAs that we wouldn’t share any information” about the tests would “basically subcontract their most intimate work to the lowest bidder.”
Notably, many of the test users who reached out—even those who say they did read the full global test agreement, as well as myriad other agreements, including ones applicable to all consumers—still say they lacked a clear understanding of what collecting their data actually meant or how exactly that data would be processed and used.
What they did understand often depended more on their own awareness of how artificial intelligence is trained than on anything communicated by iRobot.
One tester, Igor, who asked to be identified only by his first name, works in IT for a bank; he considers himself to have “above average training in cybersecurity” and has built his own internet infrastructure at home, allowing him to self-host sensitive information on his own servers and monitor network traffic. He said he did understand that videos would be taken from inside his home and that they would be tagged. “I felt that the company handled the disclosure of the data collection responsibly,” he wrote in an email, pointing to both the consent agreement and the device’s prominently placed sticker reading “video recording in process.” But, he emphasized, “I’m not an average internet user.”
For many testers, the greatest shock from our story was how the data would be handled after collection—including just how much humans would be involved. “I assumed it [the video recording] was only for internal validation if there was an issue as is common practice (I thought),” another tester who asked to be anonymous wrote in an email. And as B put it, “It definitely crossed my mind that these photos would probably be viewed for tagging within a company, but the idea that they were leaked online is disconcerting.”
“Human review didn’t surprise me,” Greg adds, but “the level of human review did … the idea, generally, is that AI should be able to improve the system 80% of the way … and the remainder of it, I think, is just on the exception … that [humans] have to look at it.”
Even the participants who were comfortable with having their images viewed and annotated, like Igor, said they were uncomfortable with how iRobot processed the data after the fact. The consent agreement, Igor wrote, “doesn’t excuse the poor data handling” and “the overall storage and control that allowed a contractor to export the data.”
Multiple US-based participants, meanwhile, expressed concerns about their data being transferred out of the country. The global agreement, they noted, had language for participants “based outside of the US” saying that “iRobot may process Research Data on servers not in my home country … including those whose laws may not offer the same level of data protection as my home country”—but the agreement did not have any corresponding information for US-based participants on how their data would be processed.
“I had no idea that the data was going overseas,” one US-based participant wrote to MIT Technology Review—a sentiment repeated by many.
Once data is collected, whether from test users or from customers, people ultimately have little to no control over what the company does with it next—including, for US users, sharing their data overseas.
US users, in fact, have few privacy protections even in their home country, notes Cahn, which is why the EU has laws to protect data from being transferred outside the EU—and to the US specifically. “Member states have to take such extensive steps to protect data being stored in that country. Whereas in the US, it’s largely the Wild West,” he says. “Americans have no equivalent protection against their data being stored in other countries.”
Many testers themselves are aware of the broader issues around data protection in the US, which is why they chose to speak out.
“Outside of regulated industries like banking and health care, the best thing we can probably do is create significant liability for data protection failure, as only hard economic incentives will make companies focus on this,” wrote Igor, the tester who works in IT at a bank. “Sadly the political climate doesn’t seem like anything could pass here in the US. The best we have is the public shaming … but that is often only reactionary and catches just a small percentage of what’s out there.”
In the meantime, in the absence of change and accountability—whether from iRobot itself or pushed by regulators—Greg has a message for potential Roomba buyers. “I just wouldn’t buy one, flat out,” he says, because he feels “iRobot is not handling their data security model well.”
And on top of that, he warns, they’re “really dismissing their responsibility as vendors to … notify [or] protect customers—which in this case include the testers of these products.”
Lam Thuy Vo contributed research.