Popular AI image-generating systems notoriously tend to amplify harmful biases and stereotypes. But just how big a problem is it? You can now see for yourself using new interactive online tools. (Spoiler alert: it’s big.)
The tools, built by researchers at AI startup Hugging Face and Leipzig University and detailed in a non-peer-reviewed paper, allow people to examine biases in three popular AI image-generating models: DALL-E 2 and the two recent versions of Stable Diffusion.
To create the tools, the researchers first used the three AI image models to generate 96,000 images of people of different ethnicities, genders, and professions. The team asked the models to generate one set of images based on social attributes, such as “a woman” or “a Latinx man,” and then another set of images relating to professions and adjectives, such as “an ambitious plumber” or “a compassionate CEO.”
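To give a sense of how a grid of prompts like these could be assembled, here is a minimal Python sketch. The word lists are small subsets taken from examples quoted in this article, and the helper names `with_article` and `build_prompts` are hypothetical; the researchers' actual templates and full lists are not described here.

```python
from itertools import product

# Illustrative subsets drawn from examples quoted in the article,
# not the researchers' full lists.
ADJECTIVES = ["ambitious", "compassionate", "stubborn"]
PROFESSIONS = ["plumber", "CEO", "teacher"]


def with_article(phrase: str) -> str:
    """Prefix 'a' or 'an' using a simple first-letter vowel check."""
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    return f"{article} {phrase}"


def build_prompts(adjectives, professions):
    """Return one prompt for every (adjective, profession) pair."""
    return [
        with_article(f"{adjective} {profession}")
        for adjective, profession in product(adjectives, professions)
    ]


prompts = build_prompts(ADJECTIVES, PROFESSIONS)
```

Each prompt would then be sent to an image model many times, so that the resulting pictures can be compared across adjectives and professions.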
The researchers wanted to examine how the two sets of images varied. They did this by applying a machine-learning technique called clustering to the pictures. This technique tries to find patterns in the images without assigning categories, such as gender or ethnicity, to them. This allowed the researchers to analyze the similarities between different images to see what subjects the model groups together, such as people in positions of power. They then built interactive tools that allow anyone to explore the images these AI models produce and any biases reflected in that output. These tools are freely available on Hugging Face’s website.
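The idea of grouping images without labels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which clustering algorithm the team used, so k-means is swapped in here as a common choice, and the six two-number "feature vectors" are made-up stand-ins for the numerical features that would be extracted from real images.

```python
import numpy as np


def farthest_first_centers(points, k):
    """Pick k starting centers: the first point, then the point farthest from those chosen."""
    centers = [points[0]]
    for _ in range(1, k):
        dist_to_nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[dist_to_nearest.argmax()])
    return np.array(centers, dtype=float)


def kmeans(points, k, iterations=10):
    """Group points into k clusters; no category labels are ever supplied."""
    points = np.asarray(points, dtype=float)
    centers = farthest_first_centers(points, k)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers


# Toy stand-ins for image features (made up for illustration).
features = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centers = kmeans(features, k=2)
```

The algorithm only sees distances between vectors. Any link between a cluster and a concept such as gender or ethnicity has to be read off afterward by looking at which images landed together.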
After analyzing the images generated by DALL-E 2 and Stable Diffusion, the researchers found that the models tended to produce images of people who look white and male, especially when asked to depict people in positions of authority. That was particularly true for DALL-E 2, which generated white men 97% of the time when given prompts like “CEO” or “director.” That’s because these models are trained on enormous amounts of data and images scraped from the internet, a process that not only reflects but further amplifies stereotypes around race and gender.
But these tools mean people don’t have to just believe what Hugging Face says: they can see the biases at work for themselves. For example, one tool allows you to explore the AI-generated images of different groups, such as Black women, to see how closely they statistically match Black women’s representation in different professions. Another can be used to analyze AI-generated faces of people in a particular profession and combine them into an average representation of images for that job.
The average face of a teacher generated by Stable Diffusion and DALL-E 2.
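One simple way to combine many images into an average is to take the mean of each pixel across the set. The Python sketch below shows that idea only; it assumes all images are already the same size, and the function name `average_image` is hypothetical. The article does not describe how the Hugging Face tool computes its averages, and a real face-averaging pipeline would likely align the faces first.

```python
import numpy as np


def average_image(images):
    """Return the per-pixel mean of same-sized images as an 8-bit array."""
    # np.stack raises ValueError if the images differ in shape.
    stack = np.stack([np.asarray(image, dtype=np.float64) for image in images])
    return stack.mean(axis=0).round().astype(np.uint8)


# Two tiny made-up 2x2 RGB "images": one black, one mid-gray.
dark = np.zeros((2, 2, 3), dtype=np.uint8)
gray = np.full((2, 2, 3), 100, dtype=np.uint8)
blended = average_image([dark, gray])
```

Features shared by most images in the set survive the averaging, while features that vary from picture to picture blur out, which is why an average face can expose what a model treats as typical for a profession.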
Still another tool lets people see how attaching different adjectives to a prompt changes the images the AI model spits out. Here the models’ output overwhelmingly reflected stereotypical gender biases. Adding adjectives such as “compassionate,” “emotional,” or “sensitive” to a prompt describing a profession will more often make the AI model generate a woman instead of a man. In contrast, specifying the adjectives “stubborn,” “intellectual,” or “unreasonable” will in most cases lead to images of men.
There’s also a tool that lets people see how the AI models represent different ethnicities and genders. For example, when given the prompt “Native American,” both DALL-E 2 and Stable Diffusion generate images of people wearing traditional headdresses.
“In almost all of the representations of Native Americans, they were wearing traditional headdresses, which obviously isn’t the case in real life,” says Sasha Luccioni, the AI researcher at Hugging Face who led the work.
Surprisingly, the tools revealed that image-making AI systems tend to depict white nonbinary people as almost identical to each other but produce more variations in the way they depict nonbinary people of other ethnicities, says Yacine Jernite, an AI researcher at Hugging Face who worked on the project.
One theory as to why that might be is that nonbinary brown people may have had more visibility in the press recently, meaning their images end up in the data sets the AI models use for training, says Jernite.
OpenAI, which built DALL-E 2, and Stability.AI, the company that built Stable Diffusion, say that they have introduced fixes to mitigate the biases ingrained in their systems, such as blocking certain prompts that seem likely to generate offensive images. However, these new tools from Hugging Face show how limited these fixes are.
A spokesperson for Stability.AI told us that the company trains its models on “data sets specific to different countries and cultures,” adding that this should “serve to mitigate biases caused by overrepresentation in general data sets.”
A spokesperson for OpenAI did not comment on the tools specifically, but pointed us to a blog post explaining how the company has added various techniques to DALL-E 2 to filter out bias as well as sexual and violent images.
Bias is becoming a more urgent problem as these AI models become more widely adopted and produce ever more realistic images. They are already being rolled out in a slew of products, such as stock photos. Luccioni says she is worried that the models risk reinforcing harmful biases on a large scale. She hopes the tools she and her team have created will bring more transparency to image-generating AI systems and underscore the importance of making them less biased.
Part of the problem is that these models are trained on predominantly US-centric data, which means they mostly reflect American associations, biases, values, and culture, says Aylin Caliskan, an associate professor at the University of Washington who studies bias in AI systems and was not involved in this research.
“What ends up happening is the thumbprint of this online American culture … that’s perpetuated across the world,” Caliskan says.
Caliskan says Hugging Face’s tools will help AI developers better understand and reduce biases in their AI models. “When people see these examples directly, I believe they’ll be able to understand the significance of these biases better,” she says.