In the past couple of years, regulators have been caught off guard again and again as tech companies compete to launch ever more advanced AI models. It’s only a matter of time before labs release another round of models that pose new regulatory challenges. We’re likely just weeks away, for example, from OpenAI’s release of GPT-5, which promises to push AI capabilities further than ever before. As it stands, it seems there’s little anyone can do to delay or prevent the release of a model that poses excessive risks.

Testing AI models before they’re released is a common approach to mitigating certain risks, and it may help regulators weigh up the costs and benefits—and potentially block models from being released if they’re deemed too dangerous. But the accuracy and comprehensiveness of these tests leave a lot to be desired. AI models may “sandbag” an evaluation—hiding some of their capabilities to avoid raising safety concerns. The evaluations also suffer from limited scope: current tests are unlikely to reliably uncover the full set of risks posed by any one model, let alone all the risks that warrant further investigation. There’s also the question of who conducts the evaluations and how evaluators’ biases may influence testing efforts. For those reasons, evaluations need to be used alongside other governance tools. 

One such tool could be internal reporting mechanisms within the labs. Ideally, employees should feel empowered to regularly and fully share their AI safety concerns with their colleagues, and they should feel those colleagues can then be counted on to act on the concerns. However, there’s growing evidence that, far from being promoted, open criticism is becoming rarer in AI labs. Just three months ago, 13 former and current workers from OpenAI and other labs penned an open letter expressing fear of retaliation if they attempt to disclose questionable corporate behaviors that fall short of breaking the law. 

How to sound the alarm

In theory, external whistleblower protections could play a valuable role in the detection of AI risks. These could protect employees who are fired for disclosing risky corporate actions, and they could help make up for inadequate internal reporting mechanisms. Nearly every state has a public policy exception to at-will employment termination—in other words, terminated employees can seek recourse against their employers if they were retaliated against for calling out unsafe or illegal corporate practices. In practice, however, this exception offers employees few assurances. Judges tend to favor employers in whistleblower cases. The likelihood of AI labs’ surviving such suits seems particularly high given that society has yet to reach any sort of consensus as to what qualifies as unsafe AI development and deployment. 


These and other shortcomings explain why the aforementioned 13 AI workers, including ex-OpenAI employee William Saunders, called for a novel “right to warn.” Companies would have to offer employees an anonymous process for disclosing risk-related concerns to the lab’s board, a regulatory authority, and an independent third body made up of subject-matter experts. The ins and outs of this process have yet to be figured out, but it would presumably be a formal, bureaucratic mechanism. The board, regulator, and third party would all need to make a record of the disclosure. It’s likely that each body would then initiate some sort of investigation. Subsequent meetings and hearings also seem like a necessary part of the process. Yet if Saunders is to be taken at his word, what AI workers really want is something different. 

When Saunders went on the Big Technology Podcast to outline his ideal process for sharing safety concerns, his focus was not on formal avenues for reporting established risks. Instead, he indicated a desire for some intermediate, informal step. He wants a chance to receive neutral, expert feedback on whether a safety concern is substantial enough to go through a “high stakes” process such as a right-to-warn system. Current government regulators, as Saunders says, could not serve that role. 

For one thing, they likely lack the expertise to help an AI worker think through safety concerns. What’s more, few workers will pick up the phone if they know it’s a government official on the other end—that sort of call may be “very intimidating,” as Saunders himself said on the podcast. Instead, he envisages being able to call an expert to discuss his concerns. In an ideal scenario, he’d be told that the risk in question does not seem that severe or likely to materialize, freeing him up to return to whatever he was doing with more peace of mind. 


Lowering the stakes

What Saunders is asking for in this podcast isn’t a right to warn, then, as that suggests the employee is already convinced there’s unsafe or illegal activity afoot. What he’s really calling for is a gut check—an opportunity to verify whether a suspicion of unsafe or illegal behavior seems warranted. The stakes would be much lower, so the regulatory response could be lighter. The third party responsible for weighing up these gut checks could be a much more informal one. For example, AI PhD students, retired AI industry workers, and other individuals with AI expertise could volunteer for an AI safety hotline. They could be tasked with quickly and expertly discussing safety matters with employees via a confidential and anonymous phone conversation. Hotline volunteers would have familiarity with leading safety practices, as well as extensive knowledge of what options, such as right-to-warn mechanisms, may be available to the employee. 

As Saunders indicated, few employees will likely want to go from 0 to 100 with their safety concerns—straight from colleagues to the board or even a government body. They are much more likely to raise their issues if an intermediary, informal step is available.

Studying examples elsewhere

The details of how precisely an AI safety hotline would work deserve more debate among AI community members, regulators, and civil society. For the hotline to realize its full potential, for instance, it may need some way to escalate the most urgent, verified reports to the appropriate authorities. How to ensure the confidentiality of hotline conversations is another matter that needs thorough investigation, as is how to recruit and retain volunteers. Given leading experts’ broad concern about AI risk, some may be willing to participate simply out of a desire to lend a hand. Should too few people step forward, other incentives may be necessary. The essential first step, though, is acknowledging this missing piece in the puzzle of AI safety regulation. The next step is looking for models to emulate in building out the first AI hotline. 


One place to start is with ombudspersons. Other industries have recognized the value of identifying these neutral, independent individuals as resources for evaluating the seriousness of employee concerns. Ombudspersons exist in academia, nonprofits, and the private sector. The distinguishing attribute of these individuals and their staffers is neutrality—they have no incentive to favor one side or the other, and thus they’re more likely to be trusted by all. A glance at the use of ombudspersons in the federal government shows that when they are available, issues may be raised and resolved sooner than they would be otherwise.

In the federal government, this concept is relatively new. The US Department of Commerce established the first federal ombudsman in 1971. The office was tasked with helping citizens resolve disputes with the agency and investigating agency actions. Other agencies, including the Social Security Administration and the Internal Revenue Service, soon followed suit. A retrospective review of these early efforts concluded that effective ombudspersons can meaningfully improve citizen-government relations. On the whole, ombudspersons were associated with an uptick in voluntary compliance with regulations and cooperation with the government. 

An AI ombudsperson or safety hotline would surely have different tasks and staff from an ombudsperson in a federal agency. Nevertheless, the general concept is worthy of study by those advocating safeguards in the AI industry. 

A right to warn may play a role in getting AI safety concerns aired, but we need to set up more intermediate, informal steps as well. An AI safety hotline is low-hanging regulatory fruit. A pilot made up of volunteers could be organized in relatively short order and provide an immediate outlet for those, like Saunders, who merely want a sounding board.

Kevin Frazier is an assistant professor at St. Thomas University College of Law and senior research fellow in the Constitutional Studies Program at the University of Texas at Austin.
