It’s becoming increasingly clear that courts, not politicians, will be the first to determine the limits on how AI is developed and used in the US.

Last week, the Federal Trade Commission opened an investigation into whether OpenAI violated consumer protection laws by scraping people’s online data to train its popular AI chatbot ChatGPT. Meanwhile, artists, authors, and the image company Getty are suing AI companies such as OpenAI, Stability AI, and Meta, alleging that the companies broke copyright laws by training their models on the plaintiffs’ work without providing any recognition or payment.

If these cases prove successful, they could force OpenAI, Meta, Microsoft, and others to change the way AI is built, trained, and deployed so that it is more fair and equitable. 

They could also create new ways for artists, authors, and others to be compensated for having their work used as training data for AI models, through a system of licensing and royalties. 

The generative AI boom has revived American politicians’ enthusiasm for passing AI-specific laws. However, we’re unlikely to see any such legislation pass in the next year, given the split Congress and intense lobbying from tech companies, says Ben Winters, senior counsel at the Electronic Privacy Information Center. Even the most prominent attempt to create new AI rules, Senator Chuck Schumer’s SAFE Innovation framework, does not include any specific policy proposals. 

“It seems like the more straightforward path [toward an AI rulebook is] to start with the existing laws on the books,” says Sarah Myers West, the managing director of the AI Now Institute, a research group.

And that means lawsuits.

Lawsuits left, right, and center 

Existing laws have provided plenty of ammunition for those who say their rights have been harmed by AI companies. 

In the past year, those companies have been hit by a wave of lawsuits, most recently from the comedian and author Sarah Silverman, who claims that OpenAI and Meta scraped her copyrighted material illegally off the internet to train their models. Her claims are similar to those of artists in another class action alleging that popular image-generation AI software used their copyrighted images without consent. Microsoft, OpenAI, and GitHub are also facing a class action over the AI-assisted programming tool Copilot, which the suit claims relies on “software piracy on an unprecedented scale” because it’s trained on existing programming code scraped from websites.

Meanwhile, the FTC is investigating whether OpenAI’s data security and privacy practices are unfair and deceptive, and whether the company caused harm, including reputational harm, to consumers when it trained its AI models. It has real evidence to back up its concerns: OpenAI had a security breach earlier this year after a bug in the system caused users’ chat history and payment information to be leaked. And AI language models often spew inaccurate and made-up content, sometimes about people. 

OpenAI is bullish about the FTC investigation—at least in public. When contacted for comment, the company shared a Twitter thread from CEO Sam Altman in which he said the company is “confident we follow the law.”

An agency like the FTC can take companies to court, enforce standards against the industry, and introduce better business practices, says Marc Rotenberg, the president and founder of the Center for AI and Digital Policy (CAIDP), a nonprofit. CAIDP filed a complaint to the FTC in March asking it to investigate OpenAI. The agency has the power to effectively create new guardrails that tell AI companies what they are and aren’t allowed to do, says Myers West. 

The FTC could require OpenAI to pay fines or delete any data that has been illegally obtained, and to delete the algorithms that used the illegally collected data, Rotenberg says. In the most extreme case, ChatGPT could be taken offline. There is precedent for this: in 2022, the agency made the diet company Weight Watchers delete its data and algorithms after the company illegally collected children’s data.

Other government enforcement agencies may very well start their own investigations too. The Consumer Financial Protection Bureau has signaled it is looking into the use of AI chatbots in banking, for example. And if generative AI plays a decisive role in the upcoming 2024 US presidential election, the Federal Election Commission could also investigate, says Winters.   

In the meantime, we should start to see the results of lawsuits trickle in, although it could take at least a couple of years before the class actions and the FTC investigation go to court. 

Many of the lawsuits that have been filed this year will be dismissed by a judge as being too broad, reckons Mehtab Khan, a resident fellow at Yale Law School, who specializes in intellectual property, data governance, and AI ethics. But they still serve an important purpose. Lawyers are casting a wide net and seeing what sticks. This allows for more precise court cases that could lead companies to change the way they build and use their AI models down the line, she adds. 

The lawsuits could also force companies to improve their data documentation practices, says Khan. At the moment, tech companies have a very rudimentary idea of what data goes into their AI models. Better documentation of how they have collected and used data might expose any illegal practices, but it might also help them defend themselves in court.

History repeats itself 

It’s not unusual for lawsuits to yield results before other forms of regulation kick in—in fact, that’s exactly how the US has handled new technologies in the past, says Khan. 

Its approach differs from that of other Western countries. While the EU is trying to prevent the worst AI harms proactively, the American approach is more reactive. The US waits for harms to emerge first before regulating, says Amir Ghavi, a partner at the law firm Fried Frank. Ghavi is representing Stability AI, the company behind the open-source image-generating AI Stable Diffusion, in three copyright lawsuits. 

“That’s a pro-capitalist stance,” Ghavi says. “It fosters innovation. It gives creators and inventors the freedom to be a bit more bold in imagining new solutions.” 

The class action lawsuits over copyright and privacy could shed more light on how “black box” AI algorithms work and create new ways for artists and authors to be compensated for having their work used in AI models, say Joseph Saveri, the founder of an antitrust and class action law firm, and Matthew Butterick, a lawyer. 

They are leading the suits against GitHub and Microsoft, OpenAI, Stability AI, and Meta. Saveri and Butterick represent Silverman, part of a group of authors who claim that the tech companies trained their language models on their copyrighted books. Generative AI models are trained using vast data sets of images and text scraped from the internet. This inevitably includes copyrighted data. Authors, artists, and programmers say tech companies that have scraped their intellectual property without consent or attribution should compensate them. 

“There’s a void where there’s no rule of law yet, and we’re bringing the law where it needs to go,” says Butterick. While the AI technologies at issue in the suits may be new, the legal questions around them are not, and the team is relying on “good old fashioned” copyright law, he adds. 

Butterick and Saveri point to Napster, the peer-to-peer music sharing system, as an example. Record companies sued the company for copyright infringement, and the suit became a landmark case on the fair use of music.

The Napster settlement cleared the way for companies like Apple, Spotify, and others to start creating new license-based deals, says Butterick. The pair is hoping their lawsuits, too, will clear the way for a licensing solution where artists, writers, and other copyright holders could also be paid royalties for having their content used in an AI model, similar to the system in place in the music industry for sampling songs. Companies would also have to ask for explicit permission to use copyrighted content in training sets. 

Tech companies have treated publicly available copyrighted data on the internet as subject to “fair use” under US copyright law, which would allow them to use it without asking for permission first. Copyright holders disagree. The class actions will likely determine who is right, says Ghavi. 

This is just the beginning of a new boom time for tech lawyers. The experts MIT Technology Review spoke to agreed that tech companies are also likely to face litigation over privacy and biometric data, such as images of people’s faces or clips of them speaking. Prisma Labs, the company behind the popular AI avatar program Lensa, is already facing a class action lawsuit over the way it’s collected users’ biometric data. 

Ben Winters believes we will also see more lawsuits around product liability and Section 230, which would determine whether AI companies are responsible if their products go awry and whether they should be liable for the content their AI models produce.

“The litigation processes can be a blunt object for social change but, nonetheless, can be quite effective,” says Saveri. “And no one’s lobbying Matthew [Butterick] or me.” 
