Marc G. Bellemare

I build machines that learn by doing.

Chief Scientific Officer: Reliant AI
Canada CIFAR AI Chair: Mila

My research centers on reinforcement learning, with a strong focus on risk and uncertainty. I built the Arcade Learning Environment — the Atari 2600 benchmark that launched deep reinforcement learning. At DeepMind and later Google Brain, I pioneered the distributional approach to RL, designed curiosity-driven agents, and led the first major commercial deployment of RL on Loon’s stratospheric balloons. More recently, I’ve been extending all of these ideas to large language models.

I co-founded Reliant AI in 2023 to make the world’s scientific knowledge useful and accessible. Life-sciences researchers spend enormous amounts of time searching, filtering, and synthesizing evidence; we’re building AI systems to do this work at scale, with the rigour scientific inference demands. I previously led the RL team at Google Brain in Montreal and was part of the original research group at DeepMind.

I care deeply about building AI that’s both scientifically rigorous and genuinely useful — and I’m lucky to get to do this work with people who share that obsession.

Projects

Reliant AI

Reliant AI develops AI software for biopharma. Our AI agents help healthcare and life-sciences teams make better decisions from complex scientific and regulatory data. Doing this well requires algorithms that account for the fact that scientific research is constantly evolving.

Distributional Reinforcement Learning

Distributional RL puts risk and randomness at the center of decision-making. Since our original 2017 paper, its unique perspective has led to critical scientific discoveries from economics to neuroscience.

Our book at MIT Press (2023)

Navigating Balloons in the Stratosphere

At Google Brain, my team collaborated with Loon to use RL to fly superpressure balloons 20 km above the Earth. Through a series of experiments and production deployments, we demonstrated massively improved performance compared to the original hand-coded controller.

Our paper in Nature (2020)
Balloon Learning Environment

The Arcade Learning Environment

The ALE kicked off the field of deep reinforcement learning by turning the whole gamut of Atari 2600 video games into one big RL benchmark. Made popular by our DQN paper (Nature, 2015), it spurred numerous follow-ups, from OpenAI's Universe and Procgen to our own Hanabi Learning Environment.

Our article in JAIR (2013)
ALE at Farama Foundation

Dopamine

Dopamine is a lightweight research framework for fast prototyping of deep reinforcement learning algorithms. It provides clean reference implementations for ALE-relevant algorithms and techniques.

Technical report (2018)

rliable

rliable is a Python library for statistically rigorous evaluation of reinforcement learning algorithms. It's become a staple of RL experimental analysis, instantly recognizable from its colour palette.

Our award-winning paper at NeurIPS (2021)
