Delphi-2M Predicts Which Disease Will Strike You In the Next 20 Years

Here’s how a GPT-style model, Delphi-2M, can predict your future disease risks, based on your current medical record.

Oct 02, 2025

I want to introduce you to Ayushi Bamania as the guest author for this week's newsletter.

Ayushi is the writer of The Science Spectrum where she posts insightful articles about the use of technology that improves the health and lives of people.

Medical professionals are undoubtedly heroes. They perform extraordinary work, saving lives and healing patients every day. However, even they cannot foresee which disease a patient might develop over the next few years.

I mean, who can predict what is going to happen in the coming 20 years?

But what if someone could? What if you could know today which serious disease is likely to appear in your body over the next 20 years?

An AI model called Delphi-2M aims to make this possible.

Scientists have recently developed a GPT-based model trained on health data, which can predict the likelihood of developing more than 1,000 different diseases based on an individual’s past health history.

It not only tells you that you might get an “xyz” disease, but it also provides the estimated timing of when the disease may occur, for example, 1 year, 5 years, or 10 years.

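Delphi-2M showed an AUC score of 0.76 (out of 1.0) across all diseases, which is pretty impressive. AUC measures how well a model separates people who go on to develop a disease from those who do not: 0.5 is no better than random guessing, and 1.0 is perfect.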

Let’s now learn how scientists developed this model, which may revolutionize the concept of preventive medicine.

Development of Delphi-2M

The data used to train the model primarily came from the UK Biobank, a huge health database of about 500k people from across the UK.

For Delphi-2M, the researchers used a specific database called the “First occurrence data.” It records the very first time someone got diagnosed with each disease.

Furthermore, to ensure that Delphi-2M worked beyond just UK data, the scientists also used data from Denmark’s incredible health record system.

Denmark tracks and records every citizen’s health throughout their lifetime using a unique ID number. Their data covers about 796 different types of diagnoses from approximately 1.9 million people.

Delphi-2M is built on the same foundation as the technology behind ChatGPT. The difference is that it is designed to predict health problems instead of generating text.

Just as GPT understands and learns how language works by reading millions of texts, Delphi-2M learns how diseases develop by studying health records.

To train the model, every medical event in the data was treated like a token in language modelling. During training, the model learned to predict both the next event and the waiting time until that event.

For example: “High blood pressure — 2 years later” or “Cardiac arrest — 5 years later”

Each of these lines from the data became a bivariate pair (an event plus a waiting time), which helps the model predict not only which event will happen but also when it is likely to occur.

The model also uses something called competing exponentials, which helps it naturally capture both the likelihood of each disease and the expected time until it appears.
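To make the idea of event-and-waiting-time pairs concrete, here is a minimal sketch in Python. The patient history, the event labels, and the function name to_training_pairs are all made up for illustration. The real model conditions on the whole history rather than only the most recent event, so treat this as a simplification of the data layout and not the authors' actual preprocessing.

```python
from typing import List, Tuple

# Hypothetical patient history: (age in years at first diagnosis, event label).
history = [
    (45.0, "high blood pressure"),
    (47.0, "type 2 diabetes"),
    (52.0, "cardiac arrest"),
]

def to_training_pairs(events: List[Tuple[float, str]]) -> List[Tuple[str, str, float]]:
    """Turn a history into (current event, next event, waiting time in years) examples."""
    ordered = sorted(events)  # sort by age so waiting times are never negative
    pairs = []
    for (age_now, event_now), (age_next, event_next) in zip(ordered, ordered[1:]):
        pairs.append((event_now, event_next, age_next - age_now))
    return pairs

pairs = to_training_pairs(history)
for current, nxt, gap in pairs:
    print(f"after '{current}': next is '{nxt}', {gap:.1f} years later")
```

Each printed line corresponds to one thing the model would be asked to learn: what comes next, and how long the wait is.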

Think of the exponentials as many stopwatches or timers, one for each possible disease a person might develop. For instance, one timer for heart disease, one for cancer, and one for kidney disease.

When the researchers enter the person’s medical history, the model assigns a “tick speed” to each disease.

Each of these timers has its own speed, which depends on a person’s medical history and the associated risk factors. For example, if someone has high cholesterol levels, the heart disease timer may tick faster, while for rare diseases, the timer ticks more slowly.

Next, all the timers start at the same moment, and they compete with each other to see which one rings first. The model predicts that the event whose timer rings first will be the next one to happen.

For example, if the “cancer timer” rings before the “heart disease timer,” the model will predict that cancer will appear next, and also tell you when it expects that to happen.

Thus, using the competing exponentials, the model captures the competition between diseases, as a person may be at risk for multiple conditions simultaneously in real life.

Additionally, diseases don’t occur at regular intervals in real life. Sometimes, nothing happens for years, and then a diagnosis may appear suddenly. The exponential timers handle these irregular gaps as well.
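The timer race described above can be sketched in a few lines of Python. The tick speeds below are invented numbers, not values from the paper, and the function name race is just for illustration. The sketch also checks two standard properties of exponential timers: a disease wins the race with probability equal to its rate divided by the sum of all rates, and the average wait until the first ring is one divided by that sum.

```python
import random

# Hypothetical "tick speeds" (expected events per year) for one person.
rates = {"heart disease": 0.08, "cancer": 0.05, "kidney disease": 0.02}

def race(rates, rng):
    """Start one exponential timer per disease and return the first one to ring."""
    ring_times = {disease: rng.expovariate(rate) for disease, rate in rates.items()}
    winner = min(ring_times, key=ring_times.get)
    return winner, ring_times[winner]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
wins = {disease: 0 for disease in rates}
total_wait = 0.0
for _ in range(n):
    disease, wait = race(rates, rng)
    wins[disease] += 1
    total_wait += wait

total_rate = sum(rates.values())
mean_wait = total_wait / n
for disease, rate in rates.items():
    print(f"{disease}: won {wins[disease] / n:.3f} of races, theory {rate / total_rate:.3f}")
print(f"mean wait {mean_wait:.2f} years, theory {1 / total_rate:.2f} years")
```

A faster timer wins more often, and raising any single rate also shortens the expected wait until the next event of any kind. This is why one set of rates can describe both which disease comes next and when.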

The interesting part about Delphi is that it looks at general health problems in the population, and then it adjusts those predictions for each person by adding their own health history. For example, for conditions like asthma or joint pain, the predictions are almost the same as the average for people of the same age and gender.

Here, personal history doesn’t matter much. But for predicting the risk of severe infection, the risk can differ from person to person.

A person who has weak immunity or has diabetes may have a higher chance of developing a severe infection than a person who is generally healthy. Thus, Delphi’s predictions vary between individuals, showing that it can spot important differences in individual health risks.

Let’s now dive into the impressive results shown by the Delphi-2M model.

How Well Does Delphi-2M Actually Work?

To check how accurate Delphi-2M's predictions are, the scientists used the AUC score mentioned earlier.

The model scores a 0.76 across all the diseases, which is quite impressive.

For 97% of all the diseases, it scores better than random guessing, which suggests that nearly every disease follows some predictable pattern.

Surprisingly, the model's most accurate prediction is death. With an AUC score of 0.97, Delphi shows that it is extremely good at predicting death.
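For readers who want to see what an AUC score actually counts, here is a minimal sketch. It uses the general textbook definition (the share of case and non-case pairs that the model ranks in the right order), with invented risk scores. It is not the exact evaluation procedure used in the paper.

```python
def auc(scores_cases, scores_controls):
    """Share of (case, control) pairs where the case has the higher score; ties count half."""
    wins = 0.0
    for case_score in scores_cases:
        for control_score in scores_controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))

# Hypothetical predicted risks for people who later got the disease (cases)
# and for people who did not (controls).
cases = [0.9, 0.6, 0.4]
controls = [0.5, 0.3, 0.2, 0.1]

print(f"AUC = {auc(cases, controls):.3f}")  # 11 of the 12 pairs are ranked correctly
```

A model that gives everyone the same score lands at 0.5, which is the "random guessing" baseline, while a model that always ranks future patients above everyone else reaches 1.0.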

The image shows examples of various diseases, as well as death. Each dot here is a prediction. Darker dots represent the predictions made right before the disease was diagnosed. Purple and turquoise coloured lines show the actual disease rates in the population, and the black lines show how Delphi’s risk prediction changes for that person as they get older, until the disease is finally diagnosed. (Source: https://www.nature.com/articles/s41586-025-09529-3/figures/2)

Another huge advantage of Delphi-2M is that it can predict more than 1,000 different diseases simultaneously at any point in someone’s life.

Most of the tools and computer programs available today can predict specific diseases, such as heart problems or cancer, but very few can predict the full range of human diseases. Delphi-2M outperforms or matches many current single-disease prediction tools while offering the unique advantage of being able to predict a person’s risk for almost every human disease.

The image compares Delphi-2M with other models for predicting death, dementia, and heart disease. Delphi-2M usually performs better than or equal to other models. (Source: https://www.nature.com/articles/s41586-025-09529-3/figures/2)

Here is another interesting feature of Delphi: it can generate complete, realistic health histories for imaginary people.

When the researchers tested a version of Delphi trained only on this “fake” data, it still reached a score of 0.74.

This could be a massive advantage for researchers, as they could train powerful health prediction AI systems using entirely artificial data. This would potentially protect real patients’ sensitive information and still give nearly the same results.
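Generating an artificial health history amounts to running the timer race again and again, each time feeding the new event back in as part of the history. The sketch below shows that loop with invented rates and a made-up rule (high blood pressure triples the heart attack rate). In the real system, a neural network produces the rates from the full history, so the names BASE_RATES, rates_given, and simulate are purely illustrative.

```python
import random

# Hypothetical yearly rates; in Delphi these would come from the trained model.
BASE_RATES = {"high blood pressure": 0.05, "diabetes": 0.03,
              "heart attack": 0.01, "death": 0.01}

def rates_given(history):
    """Toy stand-in for the model: rates depend on what has already happened."""
    rates = {d: r for d, r in BASE_RATES.items() if d not in history}  # first occurrences only
    if "high blood pressure" in history and "heart attack" in rates:
        rates["heart attack"] *= 3  # made-up risk increase
    return rates

def simulate(start_age, max_age, rng):
    """Sample one synthetic trajectory as a list of (age, event) pairs."""
    age, history, trajectory = start_age, [], []
    while True:
        rates = rates_given(history)
        # The first of several exponential timers rings after an exponential
        # wait whose rate is the sum of all the individual rates.
        age += rng.expovariate(sum(rates.values()))
        if age >= max_age:
            break
        # Each timer wins with probability proportional to its rate.
        event = rng.choices(list(rates), weights=list(rates.values()))[0]
        trajectory.append((round(age, 1), event))
        history.append(event)
        if event == "death":
            break
    return trajectory

for age, event in simulate(60.0, 80.0, random.Random(42)):
    print(f"age {age}: {event}")
```

Running simulate many times with different seeds gives a population of imaginary patients whose disease rates can then be compared against real records.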

Panel A shows the setup where the researchers used real health records of approximately 63,600 people aged 60. After people turn 60, Delphi attempts to predict their future health path, as shown by the orange lines. The simulated health events were then compared with the real health records after age 60, shown by the blue lines. In Panel B, the scatterplot shows a comparison between the disease rate predicted by Delphi-2M and the actual observed disease rate for individuals aged 70–75. The dots falling near the diagonal line indicate that the predictions made by Delphi align very well with reality. The line graph in Panel C shows how many predictions Delphi gets right when simulating after age 60. The orange stays higher, indicating that Delphi's prediction is better. (Source: https://www.nature.com/articles/s41586-025-09529-3/figures/3)

Let’s now understand how scientists tracked the influence of past diseases on future predictions.

To do this, they used SHAP analysis.

Let me break it down in simple terms.

SHAP, or SHapley Additive exPlanations, is simply a way to answer the question, ‘which past event made the AI model predict that?’

Let’s take an example. Imagine the prediction is a team result, and every past event (like a diagnosis, a lab test, or a medication) is a player on that team.

For this team, SHAP asks, if we add each player one by one in all possible orders, how much does that player change the team’s final score on average?
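The "add each player in every possible order" idea can be written out directly for a tiny example. The risk function and its numbers below are invented, and this brute-force version only works for a handful of events. Practical SHAP tools rely on approximations, so this is a sketch of the underlying Shapley value and not the procedure used on Delphi-2M.

```python
from itertools import permutations

def risk(events: frozenset) -> float:
    """Hypothetical toy model: a risk score computed from a set of past events."""
    score = 1.0
    if "smoking" in events:
        score += 2.0
    if "diabetes" in events:
        score += 1.0
    if "smoking" in events and "diabetes" in events:
        score += 1.0  # made-up extra risk when both are present
    if "regular exercise" in events:
        score -= 0.5  # a protective signal pushes the score down
    return score

def shapley(players, value):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        present = set()
        for p in order:
            before = value(frozenset(present))
            present.add(p)
            totals[p] += value(frozenset(present)) - before
    return {p: total / len(orders) for p, total in totals.items()}

players = ["smoking", "diabetes", "regular exercise"]
contributions = shapley(players, risk)
for player, contribution in contributions.items():
    print(f"{player}: {contribution:+.2f}")
```

The contributions add up exactly to the gap between the score with all events present and the score with none. The shared smoking-plus-diabetes bonus is split evenly between the two, and the protective event gets a negative value.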

The results of SHAP analysis indicate that recent events matter the most, long-standing illnesses increase the risk, and protective or healthy signals push this risk down.

The image shows how SHAP explains which past disease or event most influenced the prediction for a particular person. The top chart shows a person just before they were diagnosed with pancreatic cancer at age 68. Their predicted risk of pancreatic cancer was found to be 19 times higher right before the diagnosis was made. The bottom chart shows another patient’s risk of death at the age of 63.5. Their risk of dying greatly increased because they had recently been diagnosed with pancreatic cancer. (Source: https://www.nature.com/articles/s41586-025-09529-3/figures/4)

From helping health professionals catch diseases early to planning healthcare for an ageing population, Delphi could transform medicine in the future.

Instead of looking at each disease separately, we can now view them as interconnected health events that build upon one another over a person’s lifetime. Instead of waiting for symptoms, we might catch a disease years earlier.

While Delphi-2M still needs improvement, it is a big step forward in using AI to understand and predict human health.

References

Research paper titled “Learning the natural history of human disease with generative transformers,” published in Nature (https://www.nature.com/articles/s41586-025-09529-3)
Article titled “Which diseases will you have in 20 years? This AI accurately predicts your risks.”

I’d again like to thank Ayushi Bamania for writing this article for ‘Into AI’.

Don’t forget to subscribe to her newsletter The Science Spectrum where she shares insightful articles about the use of technology in improving the health and lives of people.

