Heterogeneous RBCs via deep multi-agent reinforcement learning
Gabriele, Glielmo, Taboga
Current macroeconomic models with agent heterogeneity can be broadly divided into two main groups. Heterogeneous-agent general equilibrium (GE) models, such as those based on Heterogeneous Agents New Keynesian (HANK) or Krusell-Smith (KS) approaches, rely on GE and 'rational expectations', somewhat unrealistic assumptions that make the models very computationally cumbersome, which in turn limits the amount of heterogeneity that can be modelled. In contrast, agent-based models (ABMs) can flexibly encompass a large number of arbitrarily heterogeneous agents, but typically require the specification of explicit behavioural rules, which can lead to a lengthy trial-and-error model-development process. To address these limitations, we introduce MARL-BC, a framework that integrates deep multi-agent reinforcement learning (MARL) with Real Business Cycle (RBC) models. We demonstrate that MARL-BC can: (1) recover textbook RBC results when using a single agent; (2) recover the results of the mean-field KS model using a large number of identical agents; and (3) effectively simulate rich heterogeneity among agents, a hard task for traditional GE approaches. Our framework can be thought of as an ABM if used with a variety of heterogeneous interacting agents, and can reproduce GE results in limit cases. As such, it is a step towards a synthesis of these often opposed modelling paradigms.
Current macroeconomic models with agent heterogeneity fall into two broad categories. Heterogeneous-agent general equilibrium (GE) models, such as those based on the HANK or Krusell-Smith (KS) approaches, rely on GE and "rational expectations" assumptions that are somewhat unrealistic and make the models computationally cumbersome, which limits the degree of heterogeneity that can be modeled. In contrast, agent-based models (ABMs) can flexibly incorporate many arbitrarily heterogeneous agents but typically require behavioral rules to be specified explicitly, which can lead to a lengthy trial-and-error model-development process. To address these limitations, the paper introduces the MARL-BC framework, which combines deep multi-agent reinforcement learning (MARL) with real business cycle (RBC) models.
Macroeconomic modeling traditionally relies on general equilibrium models using representative agents, such as RBC and New Keynesian models. However, a well-known limitation of representative agent models is their inability to account for agent heterogeneity.
Reinforcement learning (RL), particularly multi-agent reinforcement learning (MARL), offers new approaches for modeling heterogeneous agents in macroeconomics. The RL learning paradigm appears to provide a natural synthesis between the extremes of GE and ABM: agents can be boundedly rational and diverse, yet their behavior emerges endogenously from a principled optimization process (learning to maximize rewards).
Developed the MARL-BC Framework: A MARL-based framework extending the classical RBC model to support multiple households with rich and flexible heterogeneity
Demonstrated Training Feasibility: Training with state-of-the-art RL algorithms (PPO, SAC, TD3, DDPG) is computationally feasible
Reproduced Classical Results: When using a single agent, textbook RBC results can be recovered
Reproduced Mean-Field Models: When using numerous ex-ante identical agents, mean-field Krusell-Smith model results can be recovered
Supported Rich Heterogeneity: Effectively simulates rich heterogeneity among agents, a task difficult for traditional GE methods
The MARL-BC framework aims to extend the classical RBC model through multi-agent reinforcement learning to support heterogeneous household agents, while remaining capable of:
Recovering traditional RBC models in the single-agent case
Recovering Krusell-Smith mean-field models with multiple identical agents
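To make the setting concrete, the following is a minimal sketch of what one period of a multi-household RBC environment could look like. It is not the paper's implementation: the Cobb-Douglas production function, the log-utility reward, the action definition (a consumption share and a labor supply, both in (0, 1)), and the names `rbc_step`, `kappa`, `lam`, and `cons_share` are all assumptions made for illustration. With a single household the same step reduces to a representative-agent RBC economy.

```python
import numpy as np

def rbc_step(k, labor, cons_share, a, kappa, lam,
             alpha=0.36, delta=0.025, b=1.0):
    """One period of a stylized multi-household RBC economy (illustrative only)."""
    # Aggregate effective inputs: each household's capital and labor are
    # scaled by its own (hypothetical) productivity factors kappa and lam.
    K = np.sum(kappa * k)
    L = np.sum(lam * labor)
    # Cobb-Douglas output and competitive factor prices (assumed functional form).
    Y = a * K**alpha * L**(1.0 - alpha)
    r = alpha * Y / K
    w = (1.0 - alpha) * Y / L
    # Resources available to each household: factor income plus undepreciated capital.
    income = r * kappa * k + w * lam * labor
    resources = income + (1.0 - delta) * k
    # Assumed actions: consume a share of resources, save the rest as capital.
    c = cons_share * resources
    k_next = resources - c
    # Assumed reward: log utility over consumption and leisure.
    reward = np.log(c) + b * np.log(1.0 - labor)
    return k_next, reward, {"Y": Y, "r": r, "w": w, "c": c}

# Example with five ex-ante identical households (kappa = lam = 1).
n = 5
rng = np.random.default_rng(0)
k0 = rng.uniform(1.0, 5.0, size=n)
labor = np.full(n, 0.3)
cons_share = np.full(n, 0.2)
k1, reward, info = rbc_step(k0, labor, cons_share, a=1.0,
                            kappa=np.ones(n), lam=np.ones(n))
```

Because factor payments exhaust output under Cobb-Douglas, total consumption plus next-period capital equals output plus undepreciated capital, which is a convenient sanity check for any environment of this kind.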
Parameter Sharing: Adopts the standard MARL parameter-sharing paradigm, in which a single neural network represents all agents and different behaviors arise from the individual features in each agent's observations
Independent Learners: Agents are trained as independent learners, each accessing only a partial information set x_{i,t} and optimizing an approximate best-response policy
Flexible Heterogeneity: Supports arbitrary heterogeneity settings in capital and labor productivity
Unified Framework: Can recover GE results in limiting cases and serve as ABM in general cases
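The parameter-sharing idea above can be sketched as follows: one set of network weights is evaluated on every agent's observation, and agents differ only through the individual features they observe. This is an illustrative NumPy forward pass under assumed choices, not the architecture used in the paper; the observation layout in `build_obs`, the layer sizes, and the two sigmoid-squashed outputs are hypothetical.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights for a small MLP; a single set is shared by every agent."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def build_obs(k, kappa, lam, a):
    """Per-agent observation: own features plus aggregate state (assumed layout)."""
    n = k.shape[0]
    return np.column_stack([k, kappa, lam, np.full(n, k.mean()), np.full(n, a)])

def shared_policy(params, obs):
    """Apply the same network to each row of obs; outputs lie in (0, 1)."""
    h = obs
    for W, bias in params[:-1]:
        h = np.tanh(h @ W + bias)
    W, bias = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + bias)))

rng = np.random.default_rng(0)
params = init_mlp([5, 32, 2], rng)

# Three agents with equal capital; agent 0 has a different capital productivity.
k = np.array([2.0, 2.0, 2.0])
kappa = np.array([0.8, 1.2, 1.2])
lam = np.array([1.0, 1.0, 1.0])
actions = shared_policy(params, build_obs(k, kappa, lam, a=1.0))
```

Agents 1 and 2 have identical observations and therefore receive identical actions, while agent 0 is treated differently even though all three share the same weights. Heterogeneous behavior thus comes from the inputs rather than from separate networks.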
Algorithm Performance: SAC, TD3, and DDPG significantly outperform PPO in convergence speed, with SAC being the most stable learner
Textbook RBC Recovery: Under full depreciation (δ=1), RL households learn the optimal policies, converging to the optimal values after approximately 10^4 training steps
Typical RBC Recovery: Under partial depreciation (δ=0.025), the learned consumption and labor choices match the solution computed with Dynare
Impulse Response Functions: Successfully reproduces standard impulse response functions, statistically consistent with those obtained from traditional solution methods
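For the full-depreciation case, a closed-form benchmark is available under specific assumptions, which gives a learned policy something exact to be compared against. The sketch below assumes log utility, Cobb-Douglas production, and inelastic labor normalized to one (the Brock-Mirman setting); the summary does not state the paper's exact specification, so these choices and the parameter values are assumptions. Under them, the optimal policy saves a constant fraction αβ of output.

```python
import numpy as np

ALPHA, BETA, A = 0.36, 0.99, 1.0  # illustrative parameter values

def brock_mirman_path(k0, T, alpha=ALPHA, beta=BETA, A=A):
    """Simulate capital and consumption under the closed-form policy (delta = 1)."""
    k = np.empty(T + 1)
    c = np.empty(T)
    k[0] = k0
    for t in range(T):
        y = A * k[t] ** alpha
        c[t] = (1.0 - alpha * beta) * y   # consume a constant share of output
        k[t + 1] = alpha * beta * y       # save the remaining share
    return k, c

def euler_residuals(k, c, alpha=ALPHA, beta=BETA, A=A):
    """Residuals of 1 = beta * (c_t / c_{t+1}) * alpha * A * k_{t+1}^(alpha - 1)."""
    gross_return = alpha * A * k[1:-1] ** (alpha - 1.0)
    return beta * (c[:-1] / c[1:]) * gross_return - 1.0

k_path, c_path = brock_mirman_path(k0=0.05, T=60)
residuals = euler_residuals(k_path, c_path)
k_star = (ALPHA * BETA * A) ** (1.0 / (1.0 - ALPHA))  # steady-state capital
```

The Euler residuals vanish along the whole path and capital converges to `k_star`, so the same two checks can be applied to a trained policy's simulated trajectory to gauge how close it is to the optimum.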
KS Law of Motion: Emerges endogenously as a near-perfectly linear relationship (R² > 0.99), without being assumed a priori
Distribution Characteristics: Gini coefficient increases to 0.18 after convergence, approaching the original KS calculation of 0.25
Marginal Propensity to Consume: Learned curves are flat at high wealth and sharply increase at low wealth, consistent with key results from the original KS paper
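The two summary statistics quoted above, the R² of the aggregate law of motion and the Gini coefficient, can be computed from simulated data along the following lines. This is a sketch under assumptions: it uses the usual log-linear KS form log K' = a + b log K, and the capital series is synthetic data generated here for illustration, not output from the paper. The function names are hypothetical.

```python
import numpy as np

def gini(values):
    """Gini coefficient of a non-negative sample, via the sorted-rank formula."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n

def fit_log_linear_law(K):
    """OLS fit of log K_{t+1} = a + b log K_t; returns (a, b, R^2)."""
    x, y = np.log(K[:-1]), np.log(K[1:])
    b, a = np.polyfit(x, y, 1)          # polyfit returns slope first
    resid = y - (a + b * x)
    r2 = 1.0 - resid.var() / y.var()
    return a, b, r2

# Synthetic aggregate-capital series (illustration only, not the paper's data).
rng = np.random.default_rng(1)
a_true, b_true = 0.1, 0.95
log_K = np.empty(201)
log_K[0] = 0.0
for t in range(200):
    log_K[t + 1] = a_true + b_true * log_K[t] + 0.005 * rng.normal()

a_hat, b_hat, r2 = fit_log_linear_law(np.exp(log_K))
equal_gini = gini([1.0, 1.0, 1.0, 1.0])      # perfectly equal sample
unequal_gini = gini([0.0, 0.0, 0.0, 1.0])    # one agent holds everything
```

The Gini function returns 0 for a perfectly equal sample and (n-1)/n when a single agent holds everything, which bounds the values reported in the results.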
KS with Heterogeneous Capital Returns: With different capital productivity rates across agents, the Gini coefficient reaches 0.33 (mild heterogeneity) and 0.61 (significant heterogeneity)
Heterogeneous RBC: In 3×3 grid settings with 9 agents, different productivity rates lead to overlapping but distinct wealth levels
Scalability: Successfully scales to hundreds of agents (maximum 529), with SAC maintaining stable high performance across all scales
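A grid of agent types like the 3×3 setting can be generated as below. This is a hypothetical construction: the productivity ranges are placeholders, and reading the 529-agent case as a 23×23 grid is an assumption based only on 529 = 23², not something the summary states.

```python
import numpy as np

def productivity_grid(n_side, kappa_range=(0.8, 1.2), lam_range=(0.8, 1.2)):
    """Return an (n_side**2, 2) array of (capital, labor) productivity pairs."""
    kappas = np.linspace(kappa_range[0], kappa_range[1], n_side)
    lams = np.linspace(lam_range[0], lam_range[1], n_side)
    # Cartesian product: every capital productivity paired with every labor productivity.
    kk, ll = np.meshgrid(kappas, lams, indexing="ij")
    return np.column_stack([kk.ravel(), ll.ravel()])

small = productivity_grid(3)    # 9 agent types
large = productivity_grid(23)   # 529 agent types, if the grid reading is right
```

Each row can then be fed to a shared policy as part of an agent's observation, so scaling up the population changes only the number of rows, not the number of trained parameters.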
Specific Economic Problem Studies: Apply the framework to specific economic issues, such as economic inequality and asymmetric changes in labor productivity
AI Tool Impact: Study the economic and financial consequences of the proliferation of AI tools in the workplace
This paper cites 60 related references covering important works in macroeconomics, reinforcement learning, multi-agent systems, and other fields, providing solid theoretical foundations for interdisciplinary research.