Artificial Intelligence

Latest from MIT Tech Review – AI image generator Midjourney blocks porn by banning words about the human reproductive system

The popular AI image generator Midjourney bans a wide range of words about the human reproductive system from being used as prompts, MIT Technology Review has discovered.

If someone types “placenta,” “fallopian tubes,” “mammary glands,” “sperm,” “uterine,” “urethra,” “cervix,” “hymen,” or “vulva” into Midjourney, the system flags the word as a banned prompt and doesn’t let it be used. Sometimes, users who try one of these prompts are blocked for a limited time for attempting to generate banned content. Other words relating to human biology, such as “liver” and “kidney,” are allowed.

Midjourney’s founder, David Holz, says it’s banning these words as a stopgap measure to prevent people from generating shocking or gory content while the company “improves things on the AI side.” Holz says moderators watch how words are being used and what kinds of images are being generated, and adjust the bans periodically. The firm has a community guidelines page that lists the type of content it blocks in this way, including sexual imagery, gore and even the peach emoji (🍑), which is often used as a symbol for the buttocks.

AI models such as Midjourney, DALL-E 2, and Stable Diffusion are trained on billions of images that have been scraped from the internet. Research by a team at the University of Washington has found that such models learn biases that sexually objectify women, which are then reflected in the images they produce. The massive size of the data set makes it almost impossible to remove unwanted images, such as those of a sexual or violent nature, or those that could produce biased outcomes. The more often something appears in the data set, the stronger the connection the AI model makes, which means it is more likely to appear in images the model generates.

Midjourney’s word bans are a piecemeal attempt to address this problem. Some terms relating to the male reproductive system, such as “sperm” and “testicles,” are blocked too, but the list of banned words seems to skew predominantly female.

The prompt ban was first spotted by Julia Rockwell, a clinical data analyst at Datafy Clinical, and her friend Madeline Keenen, a cell biologist at the University of North Carolina at Chapel Hill. Rockwell used Midjourney to try to generate a fun image of the placenta for Keenen, who studies placentas. To her surprise, Rockwell found that using “placenta” as a prompt was banned. She then started experimenting with other words related to the human reproductive system, and found that they were banned too.

However, the pair also showed how it’s possible to work around these bans to create sexualized images by using different spellings of words, or other euphemisms for sexual or gory content.

In findings they shared with MIT Technology Review, they found that the prompt “gynaecological exam”—using the British spelling—generated some deeply creepy images: one of two naked women in a doctor’s office, and another of a bald three-limbed person cutting up their own stomach.

An image generated in Midjourney using the prompt “gynaecology exam.”

JULIA ROCKWELL

Midjourney’s crude banning of prompts relating to reproductive biology highlights how tricky it is to moderate content around generative AI systems. It also demonstrates how the tendency for AI systems to sexualize women extends all the way to their internal organs, says Rockwell.

It doesn’t have to be like this. OpenAI and Stability.AI have managed to filter out unwanted outputs and prompts, so when you type the same words into their image-making systems—DALL-E 2 and Stable Diffusion, respectively—they produce very different images. The prompt “gynecology exam” yielded images of a person holding an invented medical device for DALL-E 2, and two distorted masked women with rubber gloves and lab coats on Stable Diffusion. Both systems also allowed the prompt “placenta,” and produced biologically inaccurate images of fleshy organs in response.

A spokesperson for Stability.AI said their latest model has a filter that blocks unsafe and inappropriate content from users, and has a tool that detects nudity and other inappropriate images and returns a blurred image. The company uses a combination of keywords, image recognition and other techniques to moderate the images its AI system generates. OpenAI did not respond to a request for comment.

An image generated with DALL-E 2 using the prompt “gynecology exam.”

An image generated by Stable Diffusion with the prompt “gynecology exam.”

But tools to filter out unwanted AI-generated images are still deeply imperfect. Because AI developers and researchers don’t know how to systematically audit and improve their models yet, they “hotfix” them with blanket bans like the ones Midjourney has introduced, says Marzyeh Ghassemi, an assistant professor at MIT who studies applying machine learning to health.

It’s unclear why references to gynecological exams or the placenta, an organ that develops during pregnancy and provides oxygen and nutrients to a baby, would generate gory or sexually explicit content. But it likely has something to do with the associations the model has made between images in its data set, according to Irene Chen, a researcher at Microsoft Research, who studies machine learning for equitable health care.

“Much more work needs to be done to understand what harmful associations models might be learning, because if we work with human data, we are going to learn biases,” says Ghassemi.

There are many approaches tech companies could take to address this issue besides banning words altogether. For example, Ghassemi says, certain prompts—such as ones relating to human biology—could be allowed in particular contexts but banned in others.

“Placenta” could be allowed if the string of words in the prompt signaled that the user was trying to generate an image of the organ for educational or research purposes. But if the prompt was used in a context where someone tried to generate sexual content or gore, it could be banned.
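To make the contrast concrete, here is a minimal sketch in Python of a blanket word ban next to the kind of context-dependent rule Ghassemi describes. It is an illustration only and does not represent Midjourney’s actual system: the word lists, the function names, and the default of blocking a sensitive term that appears with no context are all assumptions made for this example.

```python
import re

# Hypothetical word lists, for illustration only. They are not
# Midjourney's actual lists.
SENSITIVE_TERMS = {"placenta", "cervix", "uterine"}
ALLOWED_CONTEXT = {"diagram", "anatomy", "textbook", "educational", "medical"}
BLOCKED_CONTEXT = {"gore", "gory", "explicit", "nsfw"}


def tokenize(prompt: str) -> set[str]:
    """Lowercase the prompt and return the set of words it contains."""
    return set(re.findall(r"[a-z]+", prompt.lower()))


def blanket_ban_allows(prompt: str) -> bool:
    """Blanket ban: reject any prompt containing a sensitive term."""
    return not (tokenize(prompt) & SENSITIVE_TERMS)


def context_rule_allows(prompt: str) -> bool:
    """Context rule: a sensitive term is allowed only alongside
    educational or research context, and never alongside blocked context.
    Prompts without a sensitive term are outside this rule's scope."""
    words = tokenize(prompt)
    if not (words & SENSITIVE_TERMS):
        return True
    if words & BLOCKED_CONTEXT:
        return False
    # Assumed default: block a sensitive term that has no context at all.
    return bool(words & ALLOWED_CONTEXT)
```

Under these assumptions, the prompt “textbook diagram of a placenta” is rejected by the blanket ban but accepted by the context rule, while “gory placenta” is rejected by both. Keyword matching of this kind remains open to the spelling and euphemism workarounds Rockwell and Keenen found, so it is at best one layer of a moderation system.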

However crude, though, Midjourney’s censoring has been done with the right intentions.

“These guardrails are there to protect women and minorities from having disturbing content generated about them and used against them,” says Ghassemi.
