Research
Academia
I hold a Canada CIFAR AI Chair at Mila. I supervise a few graduate students as an adjunct at McGill University and Université de Montréal, both of which I’m proud to call academic homes. My students typically, although not always, end up taking industry research positions. I also have the privilege of being an Associate Fellow of CIFAR’s Learning in Machines and Brains program.
In the wild
My research on curiosity-driven, game-playing agents was covered in Brian Christian’s excellent book The Alignment Problem. Brian also put together a cool TED-Ed video that talks about those agents and our Atari video game-playing work in general.
In the second half of 2020, our reinforcement learning agent flew Loon’s stratospheric balloons over Kenya and Peru, delivering internet service to users on the ground. The agent’s distinctive arabesques (shown below) were captured by FlightRadar24.

Books
Papers
2026
- Compositional Planning with Jumpy World Models
Manuscript under review, 2026
2025
- Convergence Theorems for Entropy-Regularized and Distributional Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2025
- Tapered Off-Policy REINFORCE: Stable and Efficient Reinforcement Learning for LLMs
Advances in Neural Information Processing Systems (NeurIPS), 2025
2024
- A Distributional Analogue to the Successor Representation
Proceedings of the International Conference on Machine Learning (ICML), 2024
- Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2024
- An Analysis of Quantile Temporal-Difference Learning
Journal of Machine Learning Research (JMLR), 2024
- Controlling Large Language Model Agents with Entropic Activation Steering
ICML Workshop on Mechanistic Interpretability, 2024
2023
- A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
- Bigger, Better, Faster: Human-level Atari with Human-Level Efficiency
Proceedings of the International Conference on Machine Learning (ICML), 2023
- Bootstrapped Representations in Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2023
- Discovering the Electron Beam Induced Transition Rates for Silicon Dopants in Graphene with Deep Neural Networks in the STEM
Microscopy and Analysis, 2023
- Investigating Multi-Task Pretraining and Generalization in Reinforcement Learning
Proceedings of the International Conference on Learning Representations (ICLR), 2023
- Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control
Advances in Neural Information Processing Systems (NeurIPS), 2023
- Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Proceedings of the International Conference on Learning Representations (ICLR), 2023
- Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Proceedings of the International Conference on Learning Representations (ICLR), 2023
- Small Batch Deep Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2023
- The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
Proceedings of the International Conference on Machine Learning (ICML), 2023
2022
- Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2022
- On the Generalization of Representations in Reinforcement Learning
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
- Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Advances in Neural Information Processing Systems (NeurIPS), 2022
- The Nature of Temporal Difference Errors in Multi-Step Distributional Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2022
2021
- Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
Proceedings of the International Conference on Learning Representations (ICLR), 2021
- Deep Reinforcement Learning at the Edge of the Statistical Precipice (Best paper award)
Advances in Neural Information Processing Systems (NeurIPS), 2021
- Metrics and Continuity in Reinforcement Learning
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021
- The Importance of Pessimism in Fixed-Dataset Policy Optimization
Proceedings of the International Conference on Learning Representations (ICLR), 2021
- The Value-Improvement Path: Towards Better Representations for Reinforcement Learning
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021
2020
- A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
- Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
- Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment (Best paper award at ICML 2019 Exploration in RL Workshop)
Proceedings of the International Conference on Learning Representations (ICLR), 2020
- Count-Based Exploration with the Successor Representation
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
- Representations for Stable Off-Policy Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2020
- Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue
Proceedings of the International Conference on Computational Creativity (ICCC), 2020
2019
- A Comparative Analysis of Expected and Distributional Reinforcement Learning
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019
- A Geometric Perspective on Optimal Representations for Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2019
- An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2019
- DeepMDP: Learning Continuous Latent Space Models for Representation Learning
Proceedings of the International Conference on Machine Learning (ICML), 2019
- Distributional Reinforcement Learning with Linear Function Approximation
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
- Hyperbolic Discounting and Learning over Multiple Horizons (Best paper award)
Reinforcement Learning and Decision Making Symposium (RLDM), 2019
- Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019
- Statistics and Samples in Distributional Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2019
- Temporally Extended Metrics for Markov Decision Processes
SafeAI Workshop at AAAI, 2019
- The Value Function Polytope in Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2019
2018
- An Analysis of Categorical Distributional Reinforcement Learning
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2018
- Approximate Exploration Through State Abstraction
ICML Workshop on Exploration in Reinforcement Learning, 2018
- Distributional Reinforcement Learning with Quantile Regression
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2018
- Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Journal of Artificial Intelligence Research, 2018
- The Barbados 2018 List of Open Issues in Continual Learning
NeurIPS Workshop on Continual Learning, 2018
- The Reactor: A Fast and Sample-Efficient Actor-Critic Agent for Reinforcement Learning
Proceedings of the International Conference on Learning Representations (ICLR), 2018
2017
- A Distributional Perspective on Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2017
- A Laplacian Framework for Option Discovery in Reinforcement Learning
Proceedings of the International Conference on Machine Learning (ICML), 2017
- Automatic Curriculum Learning for Neural Networks
Proceedings of the International Conference on Machine Learning (ICML), 2017
- Count-Based Exploration with Neural Density Models
Proceedings of the International Conference on Machine Learning (ICML), 2017
2016
- Increasing the Action Gap: New Operators for Reinforcement Learning
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2016
- Safe and Efficient Off-Policy Reinforcement Learning
Advances in Neural Information Processing Systems (NeurIPS), 2016
- Unifying Count-Based Exploration and Intrinsic Motivation
Advances in Neural Information Processing Systems (NeurIPS), 2016
2015
- Count-Based Frequency Estimation with Bounded Memory
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015
- Online Learning of k-CNF Boolean Functions
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015
2014
- Skip Context Tree Switching
Proceedings of the International Conference on Machine Learning (ICML), 2014
2013
- Bayesian Learning of Recursively Factored Environments
Proceedings of the International Conference on Machine Learning (ICML), 2013
- Fast, Scalable Algorithms for Reinforcement Learning in High-Dimensional Domains
PhD Thesis, University of Alberta, 2013
- The Arcade Learning Environment: An Evaluation Platform for General Agents
Journal of Artificial Intelligence Research, 2013
2012
- Investigating Contingency Awareness Using Atari 2600 Games
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2012
- Sketch-Based Linear Value Function Approximation
Advances in Neural Information Processing Systems (NeurIPS), 2012
2011
- A Primer on Reinforcement Learning in the Brain: Psychological, Computational and Neural Perspectives
Computational Neuroscience for Advancing Artificial Intelligence (IGI Global), 2011
2007
- Constructing Evidence-Based Treatment Strategies Using Methods from Computer Science
Drug and Alcohol Dependence, 2007
- Context-Driven Predictions
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2007
- Learning Prediction and Abstraction in Partially Observable Models
Master's Thesis, McGill University, 2007
2006
- Cascade Correlation Algorithms for On-Line Reinforcement Learning
Proceedings of the North East Student Colloquium on Artificial Intelligence (NESCAI), 2006