Can AI Models Transform Clinical Research?

September 25, 2025

In a paper published in the prestigious peer-reviewed journal New England Journal of Medicine AI, a team of Providence and Microsoft researchers outlines an AI framework that could accelerate biomedical discovery.

Electronic medical records (EMRs) are a vital tool used by healthcare providers and systems to securely capture patient information and improve care. Taken as a whole, EMRs contain a vast amount of information, much of it unstructured. Providence Genomics Chief Medical Officer Carlo Bifulco, MD, and his colleagues see real potential for using artificial intelligence (AI) to organize this data and use it to generate evidence that advances treatment and care.

In a new paper published in the peer-reviewed journal New England Journal of Medicine AI, Bifulco and his collaborators from Providence Genomics and Microsoft Research propose a novel framework that uses AI techniques to simulate clinical trials and accelerate biomedical discovery. It’s an approach that they believe could eventually transform how researchers design clinical trials and generate evidence.

Read on for highlights, or read the full paper at NEJM AI.

Randomized controlled trials vs. real-world evidence

Randomized controlled trials (RCTs) are the current gold standard for generating evidence that can lead to new treatments or cures. RCTs are research studies designed to answer specific questions about the effectiveness and safety of new products or new ways of preventing or treating diseases in relation to the existing standard of care. While RCTs are a vital driver of clinical innovation, they can also be time-consuming, costly, and difficult to conduct. 

Real-world evidence, on the other hand, is derived from data routinely collected from clinical sources, such as the data saved in a health system’s EMR. The researchers' paper suggests real-world evidence could significantly augment traditional RCTs, expand the scope and utility of EMRs, and make evidence generation faster, less costly and higher in quality.

A framework for generating real-world evidence with AI

In the paper, the researchers propose an innovative framework dubbed TRIALSCOPE to facilitate the generation of real-world evidence from EMR data.

The framework uses AI to organize unstructured and noisy patient data and then uses state-of-the-art causal inference techniques to estimate the effects of treatments. It also tests how changes in criteria for who can participate in clinical trials could impact trial outcomes, giving researchers another tool to shape more effective trials for specific patient populations. 

The framework consists of five main components:

  1. Data Structuring Pipeline: This component uses large language models to convert clinical data into structured representations of patient journeys.
  2. Probabilistic Latent Variable Model: This model helps to organize and clean the patient data.
  3. Patient Triaging Pipeline: This component identifies real-world patients according to target trial eligibility criteria.
  4. Causal Model: This model simulates the target trial and estimates treatment effects.
  5. Validation Tests: These tests help assess the trustworthiness of the simulation results.
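The triaging and causal steps above can be illustrated with a minimal Python sketch. It is not the TRIALSCOPE implementation: the Patient fields, the eligibility rule, the toy cohort and the stratified risk-difference estimator are all assumptions made for illustration, and the stratified adjustment is a deliberately simple stand-in for the causal model described in the paper. The sketch filters patients by hypothetical eligibility criteria, estimates a treatment effect adjusted for one confounder, and then relaxes a criterion to see how the estimate shifts.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # All fields are hypothetical, chosen only for this illustration.
    age: int
    stage: str       # cancer stage, e.g. "IV"
    ecog: int        # performance status, 0 (best) to 4
    treated: bool    # True = experimental arm, False = comparator arm
    responded: bool  # toy binary outcome


def is_eligible(p: Patient, max_ecog: int = 1, min_age: int = 18) -> bool:
    """Assumed eligibility rule: adult, stage IV, ECOG at or below the limit."""
    return p.age >= min_age and p.stage == "IV" and p.ecog <= max_ecog


def stratified_effect(patients: list[Patient]) -> float:
    """Risk difference (treated minus comparator), stratified by ECOG status
    and weighted by stratum size. Strata missing either arm are skipped."""
    strata = defaultdict(lambda: {True: [], False: []})
    for p in patients:
        strata[p.ecog][p.treated].append(p.responded)

    weighted_sum, total = 0.0, 0
    for arms in strata.values():
        if not arms[True] or not arms[False]:
            continue  # no overlap between arms in this stratum
        n = len(arms[True]) + len(arms[False])
        diff = (sum(arms[True]) / len(arms[True])
                - sum(arms[False]) / len(arms[False]))
        weighted_sum += n * diff
        total += n
    if total == 0:
        raise ValueError("no stratum contains both arms")
    return weighted_sum / total


# Made-up cohort; these are not real patient records.
cohort = [
    Patient(64, "IV", 0, True, True),
    Patient(58, "IV", 0, True, True),
    Patient(71, "IV", 0, False, True),
    Patient(66, "IV", 0, False, False),
    Patient(69, "IV", 1, True, True),
    Patient(75, "IV", 1, True, False),
    Patient(62, "IV", 1, False, False),
    Patient(80, "IV", 1, False, False),
    Patient(55, "III", 0, True, True),   # excluded: not stage IV
    Patient(77, "IV", 2, False, False),  # eligible only if ECOG limit is relaxed
    Patient(73, "IV", 2, True, False),   # eligible only if ECOG limit is relaxed
]

strict = [p for p in cohort if is_eligible(p, max_ecog=1)]
relaxed = [p for p in cohort if is_eligible(p, max_ecog=2)]

strict_effect = stratified_effect(strict)
relaxed_effect = stratified_effect(relaxed)

print(f"strict criteria:  n={len(strict)}, effect={strict_effect:.2f}")
print(f"relaxed criteria: n={len(relaxed)}, effect={relaxed_effect:.2f}")
```

In this toy cohort, relaxing the performance-status limit admits more patients and lowers the estimated effect, which is the kind of trade-off the framework is described as letting researchers explore before designing a trial.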

Case study: Lung cancer trials

To validate the model’s effectiveness, the team used the framework to replicate 11 previously conducted RCTs focused on advanced non-small cell lung cancer. This allowed them to evaluate whether the evidence produced by the model was consistent with evidence from previous clinical trials. 

Overall, the simulation results aligned closely with the previously published trials, demonstrating the model’s potential to generate evidence that complements the evidence generated through traditional RCTs. The model also correctly replicated the results of unsuccessful trials, a capability that Providence Genomics Program Director Brian Piening, PhD, says could be very important for the field.

“Given that many clinical trials ultimately fail to reach endpoints, having up-front predictions from real-world data could potentially save time, money and effort in running trials predicted to fail.” - Brian Piening, PhD, program director, Providence Genomics

Additional tests showed that the framework increased the scope, size and quality of patient records, and improved speed and cost.

Conclusion

According to Dr. Bifulco, "this model could transform the way researchers and scientists conduct clinical trials and generate real-world evidence." This work shows that by automating the curation of EMR data and leveraging advanced AI tools, researchers can generate robust, reliable and cost-effective real-world evidence to help identify effective treatments for specific patient populations.

To learn more, read the full paper at NEJM AI.
