My research centers on reinforcement learning, with a strong focus on risk and uncertainty. I built the Arcade Learning Environment — the Atari 2600 benchmark that launched deep reinforcement learning. At DeepMind and later Google Brain, I pioneered the distributional approach to RL, designed curiosity-driven agents, and led the first major commercial deployment of RL on Loon’s stratospheric balloons. More recently, I’ve been extending all of these ideas to large language models.
I co-founded Reliant AI in 2023 to make the world’s scientific knowledge useful and accessible. Life-sciences researchers spend enormous amounts of time searching, filtering, and synthesizing evidence; we’re building AI systems to do this work at scale, with the rigour scientific inference demands. I previously led the RL team at Google Brain in Montreal and was part of the original research group at DeepMind.
I care deeply about building AI that’s both scientifically rigorous and genuinely useful — and I’m lucky to get to do this work with people who share that obsession.
Reliant AI develops software for biopharma. Our AI agents help healthcare and life-sciences teams make better decisions from complex scientific and regulatory data. Doing this well requires algorithms that account for the fact that scientific research is constantly evolving.
Distributional RL puts risk and randomness at the center of decision-making: instead of predicting only the expected return, the agent models the full distribution of returns. Since our original 2017 paper, this perspective has led to scientific discoveries in fields ranging from economics to neuroscience.
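To make the idea concrete, here is a minimal sketch (an illustration only, not the algorithm from our 2017 paper). It assumes we already have sampled returns for two hypothetical actions, `safe` and `risky`, and compares them by their mean and by conditional value-at-risk (CVaR), one common risk measure. The helper name `cvar` and the Gaussian return distributions are assumptions made up for this example.

```python
import numpy as np

def cvar(returns, alpha=0.1):
    """Average of the worst alpha-fraction of sampled returns (lower tail)."""
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    # Number of samples in the lower tail; always keep at least one.
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return sorted_returns[:k].mean()

# Hypothetical return samples: same expected return, very different spread.
rng = np.random.default_rng(0)
safe = rng.normal(loc=1.0, scale=0.1, size=10_000)
risky = rng.normal(loc=1.0, scale=2.0, size=10_000)

print("mean:", safe.mean(), risky.mean())   # nearly identical
print("CVaR:", cvar(safe), cvar(risky))     # the risky action has a much worse tail
```

An agent that only tracks the mean cannot tell these two actions apart; one that tracks the return distribution can prefer the safe action when the downside matters.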
At Google Brain, my team collaborated with Loon to use RL to fly superpressure balloons 20 km above the Earth. Through a series of experiments and production deployments, we demonstrated massively improved performance compared to the original hand-coded controller.
The ALE kicked off the field of deep reinforcement learning by turning the whole gamut of Atari 2600 video games into one big RL benchmark. Made popular by our DQN paper (Nature, 2015), it spurred numerous follow-ups, from OpenAI's Universe and Procgen to our own Hanabi Learning Environment.
Dopamine is a lightweight research framework for fast prototyping of deep reinforcement learning algorithms. It provides clean reference implementations for ALE-relevant algorithms and techniques.
rliable is a Python library for statistically rigorous evaluation of reinforcement learning algorithms. It's become a staple of RL experimental analysis, instantly recognizable from its colour palette.
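As a rough sketch of the kind of analysis this involves, the snippet below computes an interquartile mean (IQM) with a percentile bootstrap confidence interval using plain NumPy. It does not use rliable's API: the function names `iqm` and `bootstrap_ci`, the flat array of scores, and the example values are all assumptions for illustration, and the resampling here is a simple non-stratified bootstrap.

```python
import numpy as np

def iqm(scores):
    """Interquartile mean: drop the lowest and highest quarter (rounded down), average the rest."""
    s = np.sort(np.asarray(scores, dtype=float).ravel())
    n = len(s)
    cut = n // 4
    return s[cut:n - cut].mean()

def bootstrap_ci(scores, stat=iqm, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for `stat` over resampled scores."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float).ravel()
    # Each row of `idx` is one resample (with replacement) of the score indices.
    idx = rng.integers(0, len(scores), size=(n_boot, len(scores)))
    stats = np.array([stat(scores[row]) for row in idx])
    tail = (1.0 - level) / 2.0
    low, high = np.quantile(stats, [tail, 1.0 - tail])
    return low, high

# Hypothetical normalized scores from eight runs, one of them an outlier.
runs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 100.0]
print("mean:", np.mean(runs), "IQM:", iqm(runs))
print("95% CI for IQM:", bootstrap_ci(runs))
```

The IQM is far less sensitive to the single outlier run than the mean, which is one reason robust aggregates paired with interval estimates give a more trustworthy picture than a point estimate from a handful of runs.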