Join Danielle Belgrave and Ben Lorica for a discussion of AI in healthcare. Danielle is VP of AI and machine learning at GSK (formerly GlaxoSmithKline). She and Ben discuss using AI and machine learning to get better diagnoses that reflect the differences between patients. Listen in to learn about the challenges of working with health data—a field where there’s both too much data and too little, and where hallucinations have serious consequences. And if you’re excited about healthcare, you’ll also find out how AI developers can get into the field.

Check out other episodes of this podcast on the O’Reilly learning platform.

About the Generative AI in the Real World podcast: In 2023, ChatGPT put AI on everyone’s agenda. In 2025, the challenge will be turning those agendas into reality. In Generative AI in the Real World, Ben Lorica interviews leaders who are building with AI. Learn from their experience to help put AI to work in your enterprise.

Points of Interest

0:00: Introduction to Danielle Belgrave, VP of AI and machine learning at GSK. Danielle is our first guest representing Big Pharma. It will be interesting to see how people in pharma are using AI technologies.

0:49: My interest in machine learning for healthcare began 15 years ago. My PhD was on understanding patient heterogeneity in asthma-related disease. This was before electronic healthcare records. By leveraging different kinds of data (genomics data and biomarkers from children) and following how those children developed asthma and allergic diseases, I developed causal modeling frameworks and graphical models to see if we could identify who would respond to which treatments. This was quite novel at the time. We identified five different types of asthma. And if understanding heterogeneity in asthma is hard, a bigger challenge still is understanding heterogeneity in mental health: the idea there was to understand heterogeneity over time in patients with anxiety.
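
The phenotype-discovery idea lends itself to a short sketch. The toy below fits a Gaussian mixture model to synthetic biomarker data and picks the number of latent subgroups by BIC. Belgrave’s actual work used causal graphical models over longitudinal data, so treat this purely as an illustration of modeling patient heterogeneity, with all data and parameters invented.

```python
# Toy sketch: recovering latent patient subgroups ("phenotypes") from
# biomarker data with a mixture model. Synthetic data only; the real
# work described above used causal/graphical models.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Three hidden subgroups generate the "biomarker" measurements.
true_means = np.array([[0.0, 0.0], [3.0, 3.0], [0.0, 4.0]])
X = np.vstack([rng.normal(m, 1.0, size=(100, 2)) for m in true_means])

# Fit mixtures with 1..6 components; the BIC-optimal component count
# is the number of inferred phenotypes.
models = [GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in range(1, 7)]
best = min(models, key=lambda g: g.bic(X))
print("inferred subgroups:", best.n_components)
labels = best.predict(X)  # subgroup assignment per patient
```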

4:12: When I went to DeepMind, I worked on the healthcare portfolio. I became very curious about datasets like MIMIC, which contains electronic health records, and about image data. The idea was to leverage tools like active learning to minimize the amount of data you take from patients. We also published work on improving the diversity of datasets.
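
A minimal picture of that idea: pool-based active learning with uncertainty sampling, where the model requests labels only for the examples it is least sure about, so less data needs to be collected. This is a generic sketch with synthetic data and a logistic model standing in for real patient records, not DeepMind’s actual setup.

```python
# Sketch: pool-based active learning with uncertainty sampling.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Seed with a few labeled examples from each class.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):  # 20 labeling rounds
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    uncertainty = 1.0 - proba.max(axis=1)   # least-confident sampling
    query = pool[int(np.argmax(uncertainty))]
    labeled.append(query)                   # "collect" only this label
    pool.remove(query)

print("labels used:", len(labeled), "accuracy:", model.score(X, y))
```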

5:19: When I came to GSK, it was an exciting opportunity to do both tech and health. Health is one of the most challenging landscapes we can work on. Human biology is very complicated. There is so much random variation. To understand biology, genomics, disease progression, and have an impact on how drugs are given to patients is amazing.

6:15: My role is leading AI/ML for clinical development. How can we understand heterogeneity in patients to optimize clinical trial recruitment and make sure the right patients have the right treatment?

6:56: Where does AI create the most value across GSK today? That can be both traditional AI and generative AI.

7:23: I use everything interchangeably, though there are distinctions. The real important thing is focusing on the problem we are trying to solve, and focusing on the data. How do we generate data that’s meaningful? How do we think about deployment?

8:07: And all the QA and red teaming.

8:20: It’s hard to put my finger on the single most impactful use case. When I think of the problems I care about, I think of oncology, pulmonary disease, hepatitis—these are all very impactful problems, and they’re problems we actively work on. If I were to highlight one thing, it’s the interplay between whole genome sequencing data, molecular data, and computational pathology, and how we translate between them. By looking at those data types together and understanding heterogeneity at that level, we get a deeper biological representation of different subgroups and can understand mechanisms of action for drug response.

9:35: It’s not scalable to do that for every individual, so I’m interested in how we translate across different types or modalities of data. Take a biopsy: that’s where we’re entering the territory of artificial intelligence. How do we translate between genomics and what we see in a tissue sample?

10:25: If we think of impact across the clinical pipeline, the second example would be using generative AI for drug discovery and target identification. Those are often in silico experiments. We have perturbation models. Can we perturb the cells? Can we create embeddings that will give us representations of patient response?
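
To make the embedding idea concrete, here is a toy autoencoder that compresses high-dimensional perturbation-response readouts into a small latent space, where similar responses land near each other and can be clustered or ranked. Everything here (shapes, data, architecture) is invented for illustration; GSK’s perturbation models aren’t described in enough detail in the episode to reproduce.

```python
# Toy sketch: embedding perturbation-response profiles with a small
# autoencoder. Purely illustrative; random data stands in for assays.
import torch
import torch.nn as nn

x = torch.randn(200, 1000)             # 200 samples x 1000 readouts
enc = nn.Sequential(nn.Linear(1000, 64), nn.ReLU(), nn.Linear(64, 8))
dec = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1000))
opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)

for _ in range(200):
    z = enc(x)                         # 8-dim response embedding
    loss = ((dec(z) - x) ** 2).mean()  # reconstruction objective
    opt.zero_grad()
    loss.backward()
    opt.step()

embeddings = enc(x).detach()  # use downstream: cluster, rank targets
```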

11:13: We’re generating data at scale. We want to identify targets more quickly for experimentation by ranking probability of success.

11:36: You’ve mentioned multimodality a lot. This includes computer vision, images. What other modalities? 

11:53: Text data, health records, responses over time, blood biomarkers, RNA-Seq data. The amount of data that has been generated is quite incredible. These are all different data modalities with different structures, different ways of correcting for noise, batch effects, and understanding human systems.

12:51: When you run into your former colleagues at DeepMind, what kinds of requests do you give them?  

13:14: Forget about the chatbots. A lot of the work happening around large language models treats LLMs as productivity tools that can help. But there has also been a lot of exploration around building larger frameworks where we can do inference. The challenge is around data: Health data is very sparse. How do we fine-tune models for specific solutions, specific disease areas, or specific modalities of data? There’s been a lot of work on foundation models for computational pathology or for single-cell structure. If I had one wish, it would be work on small data: How do you get robust patient representations when you have small datasets? We’re generating large amounts of data on small numbers of patients. This is a big methodological challenge. That’s the North Star.
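
One common (and here deliberately simplified) answer to the large-models, small-cohorts problem is transfer learning: freeze a pretrained encoder and train only a small task head on the few patients available. The PyTorch sketch below shows the pattern; the backbone, dimensions, and data are stand-ins, not any real foundation model.

```python
# Sketch: small-data adaptation by freezing a pretrained backbone and
# training a lightweight head. All tensors here are synthetic.
import torch
import torch.nn as nn

backbone = nn.Sequential(               # stand-in for a pretrained encoder
    nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128)
)
for p in backbone.parameters():
    p.requires_grad = False             # freeze: too few patients to tune

head = nn.Linear(128, 2)                # task-specific classifier
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(40, 512)                # 40 patients, 512-dim features
y = torch.randint(0, 2, (40,))          # binary response labels

for _ in range(100):
    with torch.no_grad():
        z = backbone(x)                 # shared, robust representation
    loss = loss_fn(head(z), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```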

15:12: When you describe using these foundation models to generate synthetic data, what guardrails do you put in place to prevent hallucination?

15:30: We’ve had a responsible AI team since 2019. It’s important to think about those guardrails, especially in health, where the rewards are high but so are the stakes. One of the things the team has implemented is AI principles, but we also use model cards. We have policymakers understanding the consequences of the work; we also have engineering teams. There’s a team that looks precisely at understanding hallucinations with the language model we’ve built internally, called Jules.¹ There’s been a lot of work looking at metrics of hallucination and accuracy for those models. We also collaborate on things like interpretability and building reusable pipelines for responsible AI. How can we identify the blind spots in our analysis?
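
Hallucination metrics come in many flavors. As a purely hypothetical illustration (not GSK’s or Jules’s actual method), one of the simplest is a groundedness score: the fraction of a generated answer’s sentences that have lexical support in the retrieved sources, with low-scoring answers flagged for review.

```python
# Hypothetical groundedness check: flag answer sentences with little
# lexical overlap against any source document.
def supported(sentence: str, sources: list[str], threshold: float = 0.5) -> bool:
    words = set(sentence.lower().split())
    best = max(
        len(words & set(src.lower().split())) / max(len(words), 1)
        for src in sources
    )
    return best >= threshold

def groundedness(answer: list[str], sources: list[str]) -> float:
    return sum(supported(s, sources) for s in answer) / max(len(answer), 1)

sources = ["Drug X reduced viral load by 40% in the phase 2 trial."]
answer = [
    "Drug X reduced viral load by 40% in the phase 2 trial.",
    "Drug X also cures hepatitis C.",   # unsupported claim
]
print(groundedness(answer, sources))    # 0.5 -> flag for human review
```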

17:42: Last year, a lot of people started doing fine-tuning, RAG, and GraphRAG; I assume you do all of these?

18:05: RAG happens a lot in the responsible AI team. We have built a knowledge graph; that was one of the earliest knowledge graphs, from before I joined. It’s maintained by another team at the moment. We have a platforms team that deals with all the scaling and deployment across the company. Tools like the knowledge graph aren’t just AI/ML tools. The same goes for Jules; it’s maintained outside AI/ML. It’s exciting when you see these solutions scale.
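
For readers newer to the pattern, a minimal RAG loop looks something like the sketch below: retrieve the most relevant internal documents, then assemble them into the prompt so the model answers from evidence rather than memory. TF-IDF stands in for a production vector store, and call_llm is a hypothetical placeholder, not a real GSK or vendor API.

```python
# Minimal RAG sketch: TF-IDF retrieval plus prompt assembly.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Trial 42: responders to drug A shared a specific gene variant.",
    "Pathology slides for cohort B show two distinct tissue patterns.",
    "Manufacturing SOP for cold-chain storage of biologics.",
]
vec = TfidfVectorizer().fit(docs)

def retrieve(query: str, k: int = 2) -> list[str]:
    sims = cosine_similarity(vec.transform([query]), vec.transform(docs))[0]
    return [docs[i] for i in sims.argsort()[::-1][:k]]

query = "Which patients responded to drug A?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# answer = call_llm(prompt)  # hypothetical model call, not a real API
print(prompt)
```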

20:02: The buzzy term this year is agents and even multi-agents. What is the state of agentic AI within GSK?

20:18: We’ve been working on this for quite a while, especially within the context of large language models. It allows us to leverage a lot of the data that we have internally, like clinical data. Agents are built around those data types and the different modalities of questions that we have. We’ve built agents for genetic data or lab experimental data. An orchestrator agent in Jules can combine those different agents in order to draw inferences. That landscape of agents is really important and relevant. It gives us refined models for individual questions and data modalities.
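
The orchestrator pattern she describes can be sketched as follows: specialist agents each own one data modality, and an orchestrator routes a question to the relevant specialists and combines their answers. The agent names and keyword routing below are invented for illustration; a production system would use an LLM to route and synthesize.

```python
# Toy orchestrator: route a question to modality-specific agents.
class GeneticsAgent:
    keywords = {"gene", "variant", "genomic"}
    def answer(self, q: str) -> str:
        return "genetics: querying variant database..."

class LabAgent:
    keywords = {"assay", "lab", "experiment"}
    def answer(self, q: str) -> str:
        return "lab: summarizing experimental results..."

class Orchestrator:
    def __init__(self, agents):
        self.agents = agents
    def answer(self, q: str) -> str:
        words = set(q.lower().split())
        picked = [a for a in self.agents if a.keywords & words] or self.agents
        # A real system would synthesize these with an LLM, not join them.
        return " | ".join(a.answer(q) for a in picked)

orch = Orchestrator([GeneticsAgent(), LabAgent()])
print(orch.answer("Which gene variant predicts assay response?"))
```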

21:28: You alluded to personalized medicine. We’ve been talking about that for a long time. Can you give us an update? How will AI accelerate that?

21:54: This is a field I’m really optimistic about. We have had a lot of impact; sometimes when you have your nose pressed to the glass, you don’t see it. But we’ve come a long way. First, through data: We have exponentially more data than we had 15 years ago. Second, compute power: When I started my PhD, the fact that I had a GPU was amazing. The scale of computation has accelerated. And there has been a lot of influence from science as well: There has been a Nobel Prize for protein folding, and we’ve pushed the needle on understanding human biology. A lot of the Nobel Prizes were about understanding biological mechanisms, understanding basic science. We’re still putting the building blocks in place; it took years to get from understanding the ribosome to understanding the mechanism for HIV.

23:55: In AI for healthcare, we’ve seen more immediate impacts. Just understanding that something is heterogeneous matters: If we both get a diagnosis of asthma, it will have different manifestations and different triggers for each of us. The same understanding of heterogeneity applies to things like mental health: We are different; things need to be treated differently. We also have the ecosystem where we can have an impact. We can impact clinical trials; we are in the pipeline for drugs.

25:39: One of the pieces of work we’ve published has been around understanding differences in response to the drug for hepatitis B.

26:01: You’re in the UK; you have the NHS. In the US, we still have the data silo problem: You go to your primary care doctor and then a specialist, and they have to communicate using records and fax. How can I be optimistic when systems don’t even talk to each other?

26:36: That’s an area where AI can help. It’s not a problem I work on, but how can we optimize workflow? It’s a systems problem.

26:59: We all associate data privacy with healthcare. When people talk about data privacy, they get sci-fi: homomorphic encryption, federated learning. What’s reality? What’s in your daily toolbox?

27:34: These tools are not necessarily in my daily toolbox. Pharma is heavily regulated; there’s a lot of transparency around the data we collect and the models we build. There are platforms and systems and ways of ingesting data. If you have a collaboration, you often work within a trusted research environment. The data doesn’t necessarily leave: We do the analysis within the collaborator’s trusted research environment, and we make sure everything is privacy preserving and respects the guardrails.

29:11: Our listeners are mainly software developers. They may wonder how they enter this field without any background in science. Can they just use LLMs to speed up learning? If you were trying to sell an ML developer on joining your team, what kind of background do they need?

29:51: You need a passion for the problems that you’re solving. That’s one of the things I like about GSK. We don’t know everything about biology, but we have very good collaborators. 

30:20: Do our listeners need to take biochemistry? Organic chemistry?

30:24: No, you just need to talk to scientists. Get to know the scientists, hear their problems. We don’t work in silos as AI researchers. We work with the scientists. A lot of our collaborators are doctors who joined GSK because they want to have a bigger impact.

Footnotes

¹ Not to be confused with Google’s recent agentic coding announcement.
