Disclaimer: Based on the announcement of the EO, without having seen the full text.

While I am heartened to hear that the Executive Order on AI uses the Defense Production Act to compel disclosure of various data from the development of large AI models, these disclosures do not go far enough. The EO seems to be requiring only data on the procedures and results of “Red Teaming” (i.e. adversarial testing to determine a model’s flaws and weak points), and not a wider range of information that would help to address many of the other concerns outlined in the EO. These include:

- What data sources the model is trained on. Availability of this information would assist in many of the other goals outlined in the EO, including addressing algorithmic discrimination and increasing competition in the AI market, as well as other important issues that the EO does not address, such as copyright. The recent discovery (documented by an exposé in The Atlantic) that OpenAI, Meta, and others used databases of pirated books, for example, highlights the need for transparency in training data. Given the importance of intellectual property to the modern economy, copyright ought to be an important part of this executive order. Transparency on this issue will not only allow for debate and discussion of the intellectual property issues raised by AI, it will also increase competition among developers of AI models to license high-quality data sources and to differentiate their models based on that quality. To take one example, would we be better off with medical or legal advice from an AI that was trained only on the hodgepodge of knowledge to be found on the internet, or one trained on the full body of professional information on the topic?

- Operational metrics. Like other internet-available services, AI models are not static artifacts, but dynamic systems that interact with their users. AI companies deploying these models manage and control them by measuring and responding to various factors, such as: permitted, restricted, and forbidden uses; restricted and forbidden users; methods by which those policies are enforced; detection of machine-generated content, prompt injection, and other cybersecurity risks; usage by geography, and, if measured, by demographics and psychographics; new risks and vulnerabilities identified during operation that go beyond those detected in the training phase; and much more. These should not be a random grab-bag of measures thought up by outside regulators or advocates, but disclosures of the actual measurements and methods that the companies use to manage their AI systems.

- Policy on use of user data for further training. AI companies typically treat input from their users as additional data available for training. This has both privacy and intellectual property implications.

- Procedures by which the AI provider will respond to user feedback and complaints. This should include its proposed redress mechanisms.

- Methods by which the AI provider manages and mitigates risks identified via Red Teaming, including their effectiveness. This reporting should not be "once and done," but an ongoing process that allows researchers, regulators, and the public to understand whether the models are improving or declining in their ability to manage newly identified risks.

- Energy usage and other environmental impacts.

