Artificial Intelligence

Latest from MIT Tech Review – AI image generator Midjourney blocks porn by banning words about the human reproductive system

The popular AI image generator Midjourney bans a wide range of words about the human reproductive system from being used as prompts, MIT Technology Review has discovered.

If someone types “placenta,” “fallopian tubes,” “mammary glands,” “sperm,” “uterine,” “urethra,” “cervix,” “hymen,” or “vulva” into Midjourney, the system flags the word as a banned prompt and doesn’t let it be used. Sometimes, users who try one of these prompts are blocked for a limited time for attempting to generate banned content. Other words relating to human biology, such as “liver” and “kidney,” are allowed.

Midjourney’s founder, David Holz, says the company is banning these words as a stopgap measure to prevent people from generating shocking or gory content while it “improves things on the AI side.” Holz says moderators watch how words are being used and what kinds of images are being generated, and adjust the bans periodically. The firm has a community guidelines page that lists the types of content it blocks in this way, including sexual imagery, gore, and even the peach emoji, which is often used as a symbol for the buttocks.

AI models such as Midjourney, DALL-E 2, and Stable Diffusion are trained on billions of images that have been scraped from the internet. Research by a team at the University of Washington has found that such models learn biases that sexually objectify women, which are then reflected in the images they produce. The massive size of the data set makes it almost impossible to remove unwanted images, such as those of a sexual or violent nature, or those that could produce biased outcomes. The more often something appears in the data set, the stronger the connection the AI model makes, which means it is more likely to appear in images the model generates.

Midjourney’s word bans are a piecemeal attempt to address this problem. Some terms relating to the male reproductive system, such as “sperm” and “testicles,” are blocked too, but the list of banned words seems to skew predominantly female.

The prompt ban was first spotted by Julia Rockwell, a clinical data analyst at Datafy Clinical, and her friend Madeline Keenen, a cell biologist at the University of North Carolina at Chapel Hill. Rockwell used Midjourney to try to generate a fun image of the placenta for Keenen, who studies placentas. To her surprise, Rockwell found that using “placenta” as a prompt was banned. She then started experimenting with other words related to the human reproductive system, and found that they were banned too.

However, the pair also showed how it’s possible to work around these bans to create sexualized images by using different spellings of words, or other euphemisms for sexual or gory content.

In findings they shared with MIT Technology Review, they found that the prompt “gynaecological exam”—using the British spelling—generated some deeply creepy images: one of two naked women in a doctor’s office, and another of a bald three-limbed person cutting up their own stomach.

An image generated in Midjourney using the prompt “gynaecology exam.” Credit: Julia Rockwell
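To see why a word ban is easy to sidestep, consider a minimal sketch of an exact-match blocklist. Midjourney has not published how its filter works, so everything here is an assumption made for illustration: the `BANNED_WORDS` set holds only a few of the terms reported above, `is_blocked` is a hypothetical helper, and the misspelling “plasenta” is an invented example of a variant spelling.

```python
import re

# Hypothetical blocklist: a few of the terms reported as banned in this article.
# This is NOT Midjourney's actual list or implementation.
BANNED_WORDS = {"placenta", "cervix", "uterine", "sperm"}


def is_blocked(prompt: str) -> bool:
    """Return True if any whole word in the prompt exactly matches the blocklist."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(word in BANNED_WORDS for word in words)


# An exact match is caught.
print(is_blocked("a fun drawing of a placenta"))
# A variant spelling (invented for this example) is not on the list, so it passes.
print(is_blocked("a fun drawing of a plasenta"))
# Words that were never listed are unaffected.
print(is_blocked("a diagram of a kidney"))
```

Because the check compares whole words against a fixed list, any spelling or euphemism that is not on the list passes through, while every legitimate use of a listed word is refused.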

Midjourney’s crude banning of prompts relating to reproductive biology highlights how tricky it is to moderate content around generative AI systems. It also demonstrates how the tendency for AI systems to sexualize women extends all the way to their internal organs, says Rockwell.

It doesn’t have to be like this. OpenAI and Stability AI have managed to filter out unwanted outputs and prompts, so when you type the same words into their image-making systems (DALL-E 2 and Stable Diffusion, respectively), they produce very different images. The prompt “gynecology exam” yielded images of a person holding an invented medical device on DALL-E 2, and two distorted masked women with rubber gloves and lab coats on Stable Diffusion. Both systems also allowed the prompt “placenta,” and produced biologically inaccurate images of fleshy organs in response.

A spokesperson for Stability AI said the company’s latest model has a filter that blocks unsafe and inappropriate content from reaching users, and a tool that detects nudity and other inappropriate images and returns a blurred image instead. The company uses a combination of keywords, image recognition, and other techniques to moderate the images its AI system generates. OpenAI did not respond to a request for comment.

An image generated with DALL-E 2 using the prompt “gynecology exam.”

An image generated by Stable Diffusion with the prompt “gynecology exam.”

But tools to filter out unwanted AI-generated images are still deeply imperfect. Because AI developers and researchers don’t yet know how to systematically audit and improve their models, they “hotfix” them with blanket bans like the ones Midjourney has introduced, says Marzyeh Ghassemi, an assistant professor at MIT who studies applying machine learning to health.

It’s unclear why references to gynecological exams or the placenta, an organ that develops during pregnancy and provides oxygen and nutrients to a baby, would generate gory or sexually explicit content. But it likely has something to do with the associations the model has made between images in its data set, according to Irene Chen, a researcher at Microsoft Research, who studies machine learning for equitable health care.

“Much more work needs to be done to understand what harmful associations models might be learning, because if we work with human data, we are going to learn biases,” says Ghassemi.

There are many approaches tech companies could take to address this issue besides banning words altogether. For example, Ghassemi says, certain prompts—such as ones relating to human biology—could be allowed in particular contexts but banned in others.

“Placenta” could be allowed if the string of words in the prompt signaled that the user was trying to generate an image of the organ for educational or research purposes. But if the prompt was used in a context where someone tried to generate sexual content or gore, it could be banned.
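A context-sensitive rule of this kind might look something like the sketch below. It is an illustration only, not a description of any company’s system: the three word lists, the `moderate` function, and the “review” fallback for prompts with no clear context are all assumptions, and a real moderation system would likely rely on trained classifiers rather than keyword sets.

```python
import re

# Hypothetical word lists, chosen only to illustrate the idea.
SENSITIVE_TERMS = {"placenta", "cervix", "uterine"}
EDUCATIONAL_CUES = {"diagram", "anatomy", "textbook", "medical", "illustration"}
EXPLICIT_CUES = {"gore", "gory", "nude", "sexual"}


def moderate(prompt: str) -> str:
    """Decide how to treat a prompt based on the words surrounding a sensitive term."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if not words & SENSITIVE_TERMS:
        return "allow"   # this sketch only handles prompts with a sensitive term
    if words & EXPLICIT_CUES:
        return "block"   # sensitive term used in a sexual or gory context
    if words & EDUCATIONAL_CUES:
        return "allow"   # sensitive term used in an educational context
    return "review"      # assumption: ambiguous prompts go to a human moderator


print(moderate("medical diagram of a placenta"))
print(moderate("gory placenta"))
print(moderate("placenta"))
```

Checking the explicit cues before the educational ones means a prompt that mixes both is blocked, which errs on the side of caution.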

However crude, though, Midjourney’s censoring has been done with the right intentions.

“These guardrails are there to protect women and minorities from having disturbing content generated about them and used against them,” says Ghassemi.
