Reaching a consensus in a democracy is difficult because people hold such different ideological, political, and social views. 

Perhaps an AI tool could help. Researchers from Google DeepMind trained a system of large language models (LLMs) to operate as a “caucus mediator,” generating summaries that outline a group’s areas of agreement on complex but important social or political issues.

The researchers say the tool—named the Habermas machine (HM), after the German philosopher Jürgen Habermas—highlights the potential of AI to help groups of people find common ground when discussing such subjects.

“The large language model was trained to identify and present areas of overlap between the ideas held among group members,” says Michael Henry Tessler, a research scientist at Google DeepMind. “It was not trained to be persuasive but to act as a mediator.” The study is being published today in the journal Science.

Google DeepMind recruited 5,734 participants, some through a crowdsourcing research platform and others through the Sortition Foundation, a nonprofit that organizes citizens’ assemblies. The Sortition groups formed a demographically representative sample of the UK population.

The HM consists of two different LLMs fine-tuned for this task. The first is a generative model, and it suggests statements that reflect the varied views of the group. The second is a personalized reward model, which scores the proposed statements by how much it thinks each participant will agree with them.
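To make the division of labor concrete, here is a minimal sketch of the selection step: one model proposes candidate statements, and the other predicts how strongly each participant would agree with each candidate. The article does not say how the per-participant scores are combined, so the rule below (highest mean predicted agreement, with ties broken in favor of the least-satisfied participant) is an assumption made for illustration. The function name, the statements, and the scores are all hypothetical.

```python
from statistics import mean


def pick_group_statement(predicted_scores):
    """Pick the candidate statement with the best predicted group agreement.

    predicted_scores maps each candidate statement to a list of predicted
    agreement scores, one per participant (hypothetical 0-to-1 scale).
    Assumed aggregation rule: highest mean score, with ties broken by the
    score of the least-satisfied participant.
    """
    return max(
        predicted_scores,
        key=lambda s: (mean(predicted_scores[s]), min(predicted_scores[s])),
    )


# Hypothetical reward-model output for a five-person group.
predicted = {
    "Statement A": [0.9, 0.8, 0.2, 0.3, 0.7],  # popular with some, rejected by others
    "Statement B": [0.6, 0.7, 0.6, 0.5, 0.7],  # moderate agreement across the group
    "Statement C": [0.4, 0.4, 0.5, 0.4, 0.3],  # weak agreement overall
}

best = pick_group_statement(predicted)
print(best)
```

In this toy example, the statement with moderate agreement across the whole group beats the one that a few participants like strongly and others reject, which is the kind of common-ground outcome the article describes.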

The researchers split the participants into groups and tested the HM in two steps: first by seeing if it could accurately summarize collective opinions and then by checking if it could also mediate between different groups and help them find common ground. 

To start, they posed questions such as “Should we lower the voting age to 16?” or “Should the National Health Service be privatized?” The participants submitted responses to the HM before discussing their views within groups of around five people. 

The HM summarized the group’s opinions; then these summaries were sent to individuals to critique. At the end the HM produced a final set of statements, and participants ranked them. 

The researchers then set out to test whether the HM could act as a useful AI mediation tool. 

Participants were divided into six-person groups, with one participant in each group randomly assigned to write statements on the group’s behalf. This person was designated the “mediator.” In each round of deliberation, participants were shown one statement from the human mediator and one AI-generated statement from the HM and asked which they preferred.

More than half (56%) of the time, the participants chose the AI statement. They found these statements to be of higher quality than those produced by the human mediator and tended to endorse them more strongly. After deliberating with the help of the AI mediator, the small groups of participants were less divided in their positions on the issues. 

Although the research demonstrates that AI systems are good at generating summaries that reflect group opinions, their usefulness has limits, says Joongi Shin, a researcher at Aalto University who studies generative AI.

“Unless the situation or the context is very clearly open, so they can see the information that was inputted into the system and not just the summaries it produces, I think these kinds of systems could cause ethical issues,” he says. 

Google DeepMind did not explicitly tell participants in the human mediator experiment that an AI system would be generating group opinion statements, although it indicated on the consent form that algorithms would be involved. 

“It’s also important to acknowledge that the model, in its current form, is limited in its capacity to handle certain aspects of real-world deliberation,” Tessler says. “For example, it doesn’t have the mediation-relevant capacities of fact-checking, staying on topic, or moderating the discourse.”

Working out where and how this kind of technology could be used will require further research to ensure responsible and safe deployment. The company says it has no plans to launch the model publicly.
