Research

Academia

I hold a Canada CIFAR AI Chair at Mila. I supervise a few graduate students as an adjunct at McGill University and Université de Montréal, both of which I’m proud to call academic homes. My students typically, although not always, end up taking industry research positions. I also have the privilege of being an Associate Fellow of CIFAR’s Learning in Machines and Brains program.

In the wild

My research on curiosity-driven, game-playing agents was covered in Brian Christian’s excellent book The Alignment Problem. Brian also put together a cool TED-Ed video that talks about those agents and our Atari video game-playing work in general.

In the second half of 2020, our reinforcement learning agent flew Loon’s stratospheric balloons over Kenya and Peru, delivering internet service to users on the ground. The agent’s distinctive arabesques (shown below) were captured by FlightRadar24.

Flight-radar trace of Loon balloons over Kenya

Books

Distributional Reinforcement Learning
Marc G. Bellemare, Will Dabney, Mark Rowland
MIT Press, 2023
[Preprint]
An Introduction to Deep Reinforcement Learning
Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Foundations and Trends in Machine Learning, 2018
[Preprint]

Papers

2026

Compositional Planning with Jumpy World Models
Jesse Farebrother, Matteo Pirotta, Andrea Tirinzoni, Marc G. Bellemare, Alessandro Lazaric, Ahmed Touati
Manuscript under review, 2026

2025

Convergence Theorems for Entropy-Regularized and Distributional Reinforcement Learning
Yash Jhaveri, Harley Wiltzer, Patrick Shafto, Marc G. Bellemare, David Meger
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv]
Tapered Off-Policy REINFORCE: Stable and Efficient Reinforcement Learning for LLMs
Nicolas Le Roux, Marc G. Bellemare, Jonathan Lebensold, Arnaud Bergeron, Joshua Greaves, Alexandre Fréchette, Carolyne Pelletier, Eric Thibodeau-Laufer, Sándor Tóth, Sam Work
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv][Web]

2024

A Distributional Analogue to the Successor Representation
Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, André Barreto, Will Dabney, Marc G. Bellemare, Mark Rowland
Proceedings of the International Conference on Machine Learning (ICML), 2024
[arXiv][Web]
Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
Harley Wiltzer, Marc G. Bellemare, David Meger, Patrick Shafto, Yash Jhaveri
Advances in Neural Information Processing Systems (NeurIPS), 2024
[arXiv][Web]
An Analysis of Quantile Temporal-Difference Learning
Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney
Journal of Machine Learning Research (JMLR), 2024
[arXiv][Web]
Controlling Large Language Model Agents with Entropic Activation Steering
Nathan Rahn, Pierluca D'Oro, Marc G. Bellemare
ICML Workshop on Mechanistic Interpretability, 2024
[arXiv][Web]

2023

A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces
Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
[arXiv][Web]
Bigger, Better, Faster: Human-Level Atari with Human-Level Efficiency
Max Schwarzer, Johan Samir Obando Ceron, Aaron Courville, Marc G. Bellemare, Rishabh Agarwal, Pablo Samuel Castro
Proceedings of the International Conference on Machine Learning (ICML), 2023
[arXiv][Code][Web]
Bootstrapped Representations in Reinforcement Learning
Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney
Proceedings of the International Conference on Machine Learning (ICML), 2023
[arXiv][Web]
Discovering the Electron Beam Induced Transition Rates for Silicon Dopants in Graphene with Deep Neural Networks in the STEM
Kevin M. Roccapriore, Max Schwarzer, Joshua Greaves, Jesse Farebrother, Rishabh Agarwal, Colton Bishop, Maxim Ziatdinov, Igor Mordatch, Ekin D. Cubuk, Aaron Courville, Pablo Samuel Castro, Marc G. Bellemare, Sergei V. Kalinin
Microscopy and Microanalysis, 2023
Investigating Multi-Task Pretraining and Generalization in Reinforcement Learning
Adrien Ali Taïga, Rishabh Agarwal, Jesse Farebrother, Aaron Courville, Marc G. Bellemare
Proceedings of the International Conference on Learning Representations (ICLR), 2023
[Web]
Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Nathan Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare
Advances in Neural Information Processing Systems (NeurIPS), 2023
[arXiv][Web]
Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Jesse Farebrother, Joshua Greaves, Rishabh Agarwal, Charline Le Lan, Ross Goroshin, Pablo Samuel Castro, Marc G. Bellemare
Proceedings of the International Conference on Learning Representations (ICLR), 2023
[arXiv][Web]
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G. Bellemare, Aaron Courville
Proceedings of the International Conference on Learning Representations (ICLR), 2023
[Web]
Small Batch Deep Reinforcement Learning
Johan Samir Obando Ceron, Marc G. Bellemare, Pablo Samuel Castro
Advances in Neural Information Processing Systems (NeurIPS), 2023
[arXiv][Web]
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
Mark Rowland, Yunhao Tang, Clare Lyle, Rémi Munos, Marc G. Bellemare, Will Dabney
Proceedings of the International Conference on Machine Learning (ICML), 2023
[arXiv][Web]

2022

Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning
Harley Wiltzer, David Meger, Marc G. Bellemare
Proceedings of the International Conference on Machine Learning (ICML), 2022
[arXiv][Web]
On the Generalization of Representations in Reinforcement Learning
Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
[arXiv][Web]
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare
Advances in Neural Information Processing Systems (NeurIPS), 2022
[arXiv][Web]
The Nature of Temporal Difference Errors in Multi-Step Distributional Reinforcement Learning
Yunhao Tang, Rémi Munos, Mark Rowland, Bernardo Avila Pires, Will Dabney, Marc G. Bellemare
Advances in Neural Information Processing Systems (NeurIPS), 2022
[arXiv][Web]

2021

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare
Proceedings of the International Conference on Learning Representations (ICLR), 2021
[arXiv][Web]
Deep Reinforcement Learning at the Edge of the Statistical Precipice (Best paper award)
Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare
Advances in Neural Information Processing Systems (NeurIPS), 2021
[PDF][Code]
Metrics and Continuity in Reinforcement Learning
Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021
[arXiv][Web]
The Importance of Pessimism in Fixed-Dataset Policy Optimization
Jacob Buckman, Carles Gelada, Marc G. Bellemare
Proceedings of the International Conference on Learning Representations (ICLR), 2021
[arXiv]
The Value-Improvement Path: Towards Better Representations for Reinforcement Learning
Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021
[arXiv]

2020

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
[arXiv][Web]
Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction
Vishal Jain, William Fedus, Hugo Larochelle, Doina Precup, Marc G. Bellemare
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
[arXiv]
Autonomous Navigation of Stratospheric Balloons Using Reinforcement Learning
Marc G. Bellemare, Salvatore Candido, Pablo Samuel Castro, Jun Gong, Marlos C. Machado, Subhodeep Moitra, Sameera S. Ponda, Ziyu Wang
Nature, 2020
[Web]
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment (Best paper award at ICML 2019 Exploration in RL Workshop)
Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare
Proceedings of the International Conference on Learning Representations (ICLR), 2020
[arXiv][Web]
Count-Based Exploration with the Successor Representation
Marlos C. Machado, Marc G. Bellemare, Michael Bowling
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
[arXiv]
Representations for Stable Off-Policy Reinforcement Learning
Dibya Ghosh, Marc G. Bellemare
Proceedings of the International Conference on Machine Learning (ICML), 2020
[arXiv]
Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue
Kory W. Mathewson, Pablo Samuel Castro, Colin Cherry, George Foster, Marc G. Bellemare
Proceedings of the International Conference on Computational Creativity (ICCC), 2020
[arXiv]
The Hanabi Challenge: A New Frontier for AI Research
Nolan Bard, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling
Artificial Intelligence, 2020
[arXiv][Code]
Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces
Ahmed Touati, Adrien Ali Taïga, Marc G. Bellemare
arXiv preprint, 2020
[arXiv]

2019

A Comparative Analysis of Expected and Distributional Reinforcement Learning
Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019
[arXiv]
A Geometric Perspective on Optimal Representations for Reinforcement Learning
Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taïga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle
Advances in Neural Information Processing Systems (NeurIPS), 2019
[arXiv]
An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents
Felipe Petroski Such, Vashisht Madhavan, Rosanne Liu, Rui Wang, Pablo Samuel Castro, Yulun Li, Ludwig Schubert, Marc G. Bellemare, Jeff Clune, Joel Lehman
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2019
[arXiv][Web]
DeepMDP: Learning Continuous Latent Space Models for Representation Learning
Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare
Proceedings of the International Conference on Machine Learning (ICML), 2019
[arXiv]
Distributional Reinforcement Learning with Linear Function Approximation
Marc G. Bellemare, Nicolas Le Roux, Pablo Samuel Castro, Subhodeep Moitra
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
[Web]
Hyperbolic Discounting and Learning over Multiple Horizons (Best paper award)
William Fedus, Carles Gelada, Yoshua Bengio, Marc G. Bellemare, Hugo Larochelle
Reinforcement Learning and Decision Making Symposium (RLDM), 2019
[arXiv]
Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
Carles Gelada, Marc G. Bellemare
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019
[arXiv]
Statistics and Samples in Distributional Reinforcement Learning
Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney
Proceedings of the International Conference on Machine Learning (ICML), 2019
[arXiv][Web]
Temporally Extended Metrics for Markov Decision Processes
Philip Amortila, Marc G. Bellemare, Prakash Panangaden, Doina Precup
SafeAI Workshop at AAAI, 2019
The Value Function Polytope in Reinforcement Learning
Robert Dadashi, Adrien Ali Taïga, Nicolas Le Roux, Dale Schuurmans, Marc G. Bellemare
Proceedings of the International Conference on Machine Learning (ICML), 2019
[arXiv]

2018

An Analysis of Categorical Distributional Reinforcement Learning
Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2018
[PDF][arXiv]
Approximate Exploration Through State Abstraction
Adrien Ali Taïga, Aaron Courville, Marc G. Bellemare
ICML Workshop on Exploration in Reinforcement Learning, 2018
[arXiv]
Distributional Reinforcement Learning with Quantile Regression
Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2018
[arXiv]
Dopamine: A Research Framework for Deep Reinforcement Learning
Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare
arXiv preprint, 2018
[arXiv][Code]
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling
Journal of Artificial Intelligence Research, 2018
[arXiv]
The Barbados 2018 List of Open Issues in Continual Learning
Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc G. Bellemare, Doina Precup
NeurIPS Workshop on Continual Learning, 2018
[arXiv]
The Reactor: A Fast and Sample-Efficient Actor-Critic Agent for Reinforcement Learning
Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc G. Bellemare, Rémi Munos
Proceedings of the International Conference on Learning Representations (ICLR), 2018
[PDF][arXiv]

2017

A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare, Will Dabney, Rémi Munos
Proceedings of the International Conference on Machine Learning (ICML), 2017
[PDF][arXiv][Web]
A Laplacian Framework for Option Discovery in Reinforcement Learning
Marlos C. Machado, Marc G. Bellemare, Michael Bowling
Proceedings of the International Conference on Machine Learning (ICML), 2017
[PDF][arXiv]
Automatic Curriculum Learning for Neural Networks
Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu
Proceedings of the International Conference on Machine Learning (ICML), 2017
[PDF][arXiv]
Count-Based Exploration with Neural Density Models
Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Rémi Munos
Proceedings of the International Conference on Machine Learning (ICML), 2017
[PDF][arXiv]
The Cramér Distance as a Solution to Biased Wasserstein Gradients
Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos
arXiv preprint, 2017
[arXiv]

2016

Increasing the Action Gap: New Operators for Reinforcement Learning
Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2016
[PDF]
Q(λ) with Off-Policy Corrections
Anna Harutyunyan, Marc G. Bellemare, Tom Stepleton, Rémi Munos
Proceedings of Algorithmic Learning Theory (ALT), 2016
[PDF][BibTeX]
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare
Advances in Neural Information Processing Systems (NeurIPS), 2016
[PDF][BibTeX]
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, Rémi Munos
Advances in Neural Information Processing Systems (NeurIPS), 2016
[PDF][BibTeX]

2015

Compress and Control
Joel Veness, Marc G. Bellemare, Marcus Hutter, Alvin Chua, Guillaume Desjardins
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2015
[PDF]
Count-Based Frequency Estimation with Bounded Memory
Marc G. Bellemare
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015
[PDF]
Human-Level Control through Deep Reinforcement Learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
Nature, 2015
[PDF][Web]
Online Learning of k-CNF Boolean Functions
Joel Veness, Marcus Hutter, Laurent Orseau, Marc G. Bellemare
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015
[PDF][arXiv]

2014

Skip Context Tree Switching
Marc G. Bellemare, Joel Veness, Erik Talvitie
Proceedings of the International Conference on Machine Learning (ICML), 2014
[PDF]

2013

Bayesian Learning of Recursively Factored Environments
Marc G. Bellemare, Joel Veness, Michael Bowling
Proceedings of the International Conference on Machine Learning (ICML), 2013
[PDF]
Fast, Scalable Algorithms for Reinforcement Learning in High-Dimensional Domains
Marc G. Bellemare
PhD Thesis, University of Alberta, 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling
Journal of Artificial Intelligence Research, 2013
[PDF][arXiv][Web]

2012

Investigating Contingency Awareness Using Atari 2600 Games
Marc G. Bellemare, Joel Veness, Michael Bowling
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2012
[PDF]
Sketch-Based Linear Value Function Approximation
Marc G. Bellemare, Joel Veness, Michael Bowling
Advances in Neural Information Processing Systems (NeurIPS), 2012
[PDF]

2011

A Primer on Reinforcement Learning in the Brain: Psychological, Computational and Neural Perspectives
Elliot A. Ludvig, Marc G. Bellemare, Keir G. Pearson
Computational Neuroscience for Advancing Artificial Intelligence (IGI Global), 2011
[Web]

2007

Constructing Evidence-Based Treatment Strategies Using Methods from Computer Science
Joelle Pineau, Marc G. Bellemare, A. John Rush, Adrian Ghizaru, Susan A. Murphy
Drug and Alcohol Dependence, 2007
[Web]
Context-Driven Predictions
Marc G. Bellemare, Doina Precup
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2007
[PDF]
Learning Prediction and Abstraction in Partially Observable Models
Marc G. Bellemare
Master's Thesis, McGill University, 2007
[PDF]

2006

Cascade Correlation Algorithms for On-Line Reinforcement Learning
Marc G. Bellemare
Proceedings of the North East Student Colloquium on Artificial Intelligence (NESCAI), 2006
[PDF]