Regression discontinuity aggregation, with an application to the union effects on inequality
Borusyak, Kolerman-Shemer
We extend the regression discontinuity (RD) design to settings where each unit's treatment status is an average or aggregate across multiple discontinuity events. Such situations arise in many studies where the outcome is measured at a higher level of spatial or temporal aggregation (e.g., by state with district-level discontinuities) or when spillovers from discontinuity events are of interest. We propose two novel estimation procedures - one at the level at which the outcome is measured and the other in the sample of discontinuities - and show that both identify a local average causal effect under continuity assumptions similar to those of standard RD designs. We apply these ideas to study the effect of unionization on inequality in the United States. Using credible variation from close unionization elections at the establishment level, we show that a higher rate of newly unionized workers in a state-by-industry cell reduces wage inequality within the cell.
This paper extends the regression discontinuity (RD) design to settings where each unit's treatment status is an average or aggregate of multiple discontinuity events. This situation arises in many studies where outcomes are measured at a higher level of spatial or temporal aggregation than the discontinuities (e.g., state-level outcomes with district-level discontinuities), or when spillover effects from discontinuity events are of interest. The authors propose two new estimation procedures, one at the level where outcomes are measured and another in the sample of discontinuities, and show that both identify a local average causal effect under continuity assumptions similar to those of standard RD designs. Applying these ideas to the effect of unionization on inequality in the United States, and using credible variation from close union elections at the establishment level, the authors show that a higher share of newly unionized workers in a state-by-industry cell reduces wage inequality within that cell.
Traditional regression discontinuity design (RD) requires that each unit is exposed to only a single discontinuity event. However, in many empirical studies, outcome variables are defined at higher levels of aggregation than the discontinuity events. For example:
Legislative Studies: State-level outcomes depend on election results from multiple single-member districts
Temporal Aggregation: Units are exposed to multiple RD events across multiple periods
Spillover Effects: Each unit is exposed to elections in multiple neighboring units
Such settings are extremely common in empirical research, spanning political economy, labor economics, public finance, and other fields. Existing literature typically employs ad hoc approaches to handle these situations, lacking a unified theoretical framework and optimal estimation methods.
Consider N upper-level units i, each containing J_i lower-level sub-units j. Sub-unit j is characterized by a running variable r_j and a treatment indicator z_j = 1[r_j ≥ 0]. The goal is to estimate the causal model:
Y_i = β X_i + ε_i
where X_i is the upper-level treatment variable, typically defined as an average or aggregate of the treatments z_j across the sub-units of unit i.
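To make the setup concrete, here is a minimal Python sketch of building the upper-level treatment from sub-unit running variables. It assumes X_i is a (possibly weighted) share of the sub-units of unit i with z_j = 1; the function name `aggregate_treatment`, the optional `weights` argument, and the toy data are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def aggregate_treatment(unit_ids, running, n_units, weights=None):
    """Weighted share of treated sub-units per upper-level unit.

    Assumption (not from the paper): X_i = sum_j w_j * z_j / sum_j w_j,
    with z_j = 1[r_j >= 0] and the sums taken over sub-units j of unit i.
    """
    unit_ids = np.asarray(unit_ids)
    running = np.asarray(running, dtype=float)
    if weights is None:
        weights = np.ones_like(running)
    weights = np.asarray(weights, dtype=float)

    z = (running >= 0).astype(float)  # sub-unit treatment indicator
    num = np.bincount(unit_ids, weights=weights * z, minlength=n_units)
    den = np.bincount(unit_ids, weights=weights, minlength=n_units)

    # Units without any sub-unit have no defined treatment: return NaN there.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

# Toy data: three upper-level units with 3, 2 and 1 sub-units.
unit_ids = [0, 0, 0, 1, 1, 2]
running = [0.10, -0.05, 0.30, -0.20, 0.00, -0.40]
X = aggregate_treatment(unit_ids, running, n_units=3)
print(X)  # shares 2/3, 1/2 and 0
```

Passing `weights` (for example, a hypothetical number of workers per establishment) turns the simple share into a size-weighted one; an unnormalized aggregate would simply return `num`.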
Proposition 1 establishes the numerical equivalence of the upper-level and lower-level estimators: the upper-level IV estimator equals a specific sub-unit level fuzzy RD estimator.
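The flavor of such an equivalence can be illustrated with a small numerical sketch. It is not the paper's construction: it assumes a simple ratio (no-intercept) IV estimator, an upper-level instrument formed by summing recentered sub-unit treatments, and simulated data with hypothetical names. Swapping the order of summation shows that the upper-level estimator coincides with a sub-unit level IV in which each sub-unit inherits the outcome and treatment of its parent unit, the same logic that underlies the shift-share equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_sub = 50, 400

# Simulated sub-units: parent unit i(j), running variable r_j, treatment z_j.
unit_of = rng.integers(0, n_units, size=n_sub)
r = rng.uniform(-1.0, 1.0, size=n_sub)
z = (r >= 0).astype(float)

# Recentered sub-unit shock. Subtracting 0.5 is an assumption standing in
# for whatever local adjustment a real RD procedure would apply.
g = z - 0.5

# Upper-level treatment (count of treated sub-units) and outcome.
X = np.bincount(unit_of, weights=z, minlength=n_units)
Y = 2.0 * X + rng.normal(size=n_units)

# Upper-level instrument: sum of recentered shocks within each unit.
Z = np.bincount(unit_of, weights=g, minlength=n_units)

# Upper-level ratio IV: sum_i Z_i Y_i / sum_i Z_i X_i.
beta_upper = (Z @ Y) / (Z @ X)

# Sub-unit level ratio IV: each sub-unit carries its parent's Y and X,
# and is instrumented by its own recentered shock g_j.
beta_lower = (g @ Y[unit_of]) / (g @ X[unit_of])

print(beta_upper, beta_lower)  # identical up to floating-point error
```

The two numbers agree because sum_i (sum_{j in i} g_j) Y_i = sum_j g_j Y_{i(j)}, and likewise for X. The sketch deliberately omits bandwidths, kernels, and local linear controls.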
Monte Carlo simulations show that the estimator including aggregated local linear control variables inherits the bias reduction properties of traditional RD methods.
Unionization significantly increases pension coverage: each new union member corresponds to an increase of 1.48 pension holders. Because this estimate exceeds one-for-one, it indicates substantial inter-establishment spillover effects.
This paper extends the standard RD design. Regression discontinuity aggregation (RDA) differs from multi-score RD designs: multi-score RD addresses multiple running variables at a single boundary, whereas RDA addresses aggregates of many RD shocks.
The paper also provides a new causal identification strategy for the effects of unions on inequality, complementing research such as Farber et al. (2021) that relies on selection on observables.
This paper cites important literature in econometrics, labor economics, and political economy, particularly:
Borusyak et al. (2022) on shift-share instrumental variables
Frandsen (2021) on RD design with union elections
Farber et al. (2021) on unions and inequality
Overall Assessment: This is a high-quality econometric methodology paper that makes an important theoretical contribution and demonstrates the value of the method through a meaningful empirical application. The RDA framework fills a gap in the literature and offers a more appropriate identification strategy for many economic studies.