Research Statement

Michael Yao

Machine learning and artificial intelligence are changing the practice of medicine. However, current methods are frequently trained and evaluated on synthetic tasks that bear little resemblance to clinical reasoning, which makes it hard to know when ML systems can be trusted. My goal is to safely translate clinical AI tools from sandbox environments into real-world clinical settings. To do this, my research focuses on three directions:

Bridging Virtual Environments to the Real World

I am developing new methods to translate algorithms trained in synthetic environments to real-world settings (ICML 2025, NeurIPS 2024). This allows us to develop tools in silico and deploy them in clinical practice. My current work focuses on using medical knowledge to better adapt ML systems for patient care (Preprint 2025, Nature Communications 2024, NeurIPS 2024).

Evaluating Clinical Algorithms on Realistic Tasks

In addition to developing better algorithms, we can also improve our synthetic environments to better mimic clinical settings. To this end, I am building realistic, scalable test beds for algorithm development that better capture the dynamics of real clinical workflows (Communications Medicine 2025, ML4H 2022).

Understanding Distribution Shifts in the Wild

I am characterizing the magnitude of differences between patient populations, hospital systems, and local communities that affect the performance of ML systems. Understanding when and where distribution shifts occur enables us to better mitigate their effects in clinical applications (Preprint 2025, Radiology 2024).