<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Marc G. Bellemare</title><link>https://marcgbellemare.info/en/</link><description>Recent content on Marc G. Bellemare</description><generator>Hugo</generator><language>en</language><copyright>© {year} Marc G. Bellemare</copyright><lastBuildDate>Thu, 01 Jan 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://marcgbellemare.info/en/index.xml" rel="self" type="application/rss+xml"/><item><title>Compositional Planning with Jumpy World Models</title><link>https://marcgbellemare.info/en/publications/farebrother26compositional/</link><pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/farebrother26compositional/</guid><description/></item><item><title>Convergence Theorems for Entropy-Regularized and Distributional Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/jhaveri25convergence/</link><pubDate>Tue, 01 Jul 2025 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/jhaveri25convergence/</guid><description/></item><item><title>Tapered Off-Policy REINFORCE: Stable and Efficient Reinforcement Learning for LLMs</title><link>https://marcgbellemare.info/en/publications/leroux25tapered/</link><pubDate>Tue, 01 Jul 2025 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/leroux25tapered/</guid><description/></item><item><title>A Distributional Analogue to the Successor Representation</title><link>https://marcgbellemare.info/en/publications/wiltzer24successor/</link><pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/wiltzer24successor/</guid><description/></item><item><title>Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/wiltzer24continuous/</link><pubDate>Mon, 01 Jul 2024 00:00:00 
+0000</pubDate><guid>https://marcgbellemare.info/en/publications/wiltzer24continuous/</guid><description/></item><item><title>An Analysis of Quantile Temporal-Difference Learning</title><link>https://marcgbellemare.info/en/publications/rowland24quantile/</link><pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/rowland24quantile/</guid><description/></item><item><title>Controlling Large Language Model Agents with Entropic Activation Steering</title><link>https://marcgbellemare.info/en/publications/rahn24steering/</link><pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/rahn24steering/</guid><description/></item><item><title>A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces</title><link>https://marcgbellemare.info/en/publications/lelan23subspaces/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/lelan23subspaces/</guid><description/></item><item><title>Bigger, Better, Faster: Human-level Atari with Human-Level Efficiency</title><link>https://marcgbellemare.info/en/publications/schwarzer23bigger/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/schwarzer23bigger/</guid><description/></item><item><title>Bootstrapped Representations in Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/lelan23bootstrapped/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/lelan23bootstrapped/</guid><description/></item><item><title>Discovering the Electron Beam Induced Transition Rates for Silicon Dopants in Graphene with Deep Neural Networks in the STEM</title><link>https://marcgbellemare.info/en/publications/roccapriore23electron/</link><pubDate>Sat, 01 Jul 2023 00:00:00 
+0000</pubDate><guid>https://marcgbellemare.info/en/publications/roccapriore23electron/</guid><description/></item><item><title>Investigating Multi-Task Pretraining and Generalization in Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/taiga23investigating/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/taiga23investigating/</guid><description/></item><item><title>Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control</title><link>https://marcgbellemare.info/en/publications/rahn23policy/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/rahn23policy/</guid><description/></item><item><title>Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks</title><link>https://marcgbellemare.info/en/publications/farebrother23proto/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/farebrother23proto/</guid><description/></item><item><title>Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier</title><link>https://marcgbellemare.info/en/publications/doro23sample/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/doro23sample/</guid><description/></item><item><title>Small Batch Deep Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/obandoceron23small/</link><pubDate>Sat, 01 Jul 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/obandoceron23small/</guid><description/></item><item><title>The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation</title><link>https://marcgbellemare.info/en/publications/rowland23benefits/</link><pubDate>Sat, 01 Jul 2023 00:00:00 
+0000</pubDate><guid>https://marcgbellemare.info/en/publications/rowland23benefits/</guid><description/></item><item><title>Distributional Reinforcement Learning</title><link>https://marcgbellemare.info/en/books/distributional-rl/</link><pubDate>Tue, 30 May 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/books/distributional-rl/</guid><description>&lt;p&gt;&lt;strong&gt;Marc G. Bellemare, Will Dabney, Mark Rowland&lt;/strong&gt;
MIT Press, May 2023 · 384 pages · Adaptive Computation and Machine Learning series&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://mitpress.mit.edu/9780262048019/distributional-reinforcement-learning/"&gt;MIT Press page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.distributional-rl.org/"&gt;Book website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://direct.mit.edu/books/oa-monograph/5590/Distributional-Reinforcement-Learning"&gt;Open-access PDF&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;ISBN (hardcover): 9780262048019 · ISBN (eBook): 9780262374019&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;p&gt;This is the first comprehensive guide to distributional reinforcement learning, providing a new mathematical formalism for thinking about decisions from a probabilistic perspective. Rather than computing expected values, the book focuses on how total reward behaves as a probability distribution. It presents core concepts, mathematical proofs, and algorithmic developments for characterising, computing, estimating, and making decisions based on random returns. Applications span finance, computational neuroscience, psychology, macroeconomics, and robotics.&lt;/p&gt;</description></item><item><title>Distributional Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/bellemare23book/</link><pubDate>Sat, 01 Apr 2023 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/bellemare23book/</guid><description/></item><item><title>Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning</title><link>https://marcgbellemare.info/en/publications/wiltzer22hjb/</link><pubDate>Fri, 01 Jul 2022 00:00:00 +0000</pubDate><guid>https://marcgbellemare.info/en/publications/wiltzer22hjb/</guid><description/></item></channel></rss>