The Urn of Hill, Lane and Sudderth

Franchini
We review some facts, properties and applications of the urn of Hill, Lane and Sudderth, a paradigmatic model of a stochastic process with memory. The urn evolves as follows: consider an urn of given capacity; at each step a new ball, black or white, is added to the urn with a probability that is a function (the urn function) of the fraction of black balls. The process runs until capacity is reached.

Basic Information

  • Paper ID: 2506.20826
  • Title: The Urn of Hill, Lane and Sudderth
  • Author: Simone Franchini (Sapienza Università di Roma)
  • Classification: math.PR (Probability Theory)
  • Publication Date: November 12, 2025 (arXiv v2)
  • Paper Link: https://arxiv.org/abs/2506.20826

Abstract

This paper provides a systematic review of the properties and applications of the Hill, Lane, and Sudderth (HLS) urn model. This is a paradigmatic model of a stochastic process with memory: an urn of given capacity receives at each step either a black or white ball, with probability determined by the proportion of black balls (the urn function), and the process continues until capacity is reached.

Research Background and Motivation

1. Research Problem

The HLS urn model is a central tool for studying path-dependent stochastic processes, used to describe dynamic systems with reinforcement effects. The model was independently discovered by three groups of researchers in the 1980s:

  • Hill, Lane, and Sudderth (1980)
  • Blum and Brennan (1980)
  • Arthur, Ermoliev, and Kaniovski (1983)

2. Problem Significance

The model possesses broad interdisciplinary applications:

Mathematics:

  • Stochastic approximation theory
  • Large deviations theory
  • Lattice field theory

Social Sciences:

  • Arthur's Increasing Returns Theory
  • Technology lock-in phenomena
  • Social influence processes

Physics and Biology:

  • Self-avoiding walk problems
  • Neuronal polarization models
  • Wiener sausage problem

3. Limitations of Existing Research

Although basic convergence properties of the HLS model have been studied, the following questions remain incompletely resolved:

  • Exact integral evaluation of moment generating functions for nonlinear urn functions
  • Analytical solutions to nonlinear differential equations for entropy density
  • Complete large deviations principles under thermodynamic limits

4. Research Motivation

This paper aims to provide a unified review of the HLS model, with particular focus on:

  • Scaling behavior under thermodynamic limits
  • Large deviations principles established through lattice field theory framework
  • Methods for reconstructing urn functions from empirical trajectories

Core Contributions

  1. Systematic Review: Integration of fundamental properties, convergence theorems, and application scenarios of the HLS model
  2. Thermodynamic Limit Theory:
    • Establishment of continuous embedding framework
    • Derivation of explicit solutions for zero-cost trajectories
    • Provision of methods for reconstructing urn functions from trajectories
  3. Lattice Field Theory Formulation:
    • Reformulation of the HLS process as a path integral
    • Establishment of the scaling limit of the action
    • Proof of sample path large deviations principles via Varadhan's lemma and Mogulskii's theorem
  4. Nonlinear Equations:
    • Nonlinear equation for moment generating functions (Eq. 38)
    • Nonlinear differential equation for entropy density (Eq. 42)
  5. Application Demonstrations:
    • Mathematical characterization of Arthur's Increasing Returns Theory
    • Urn function reconstruction from actual experimental data (van de Rijt 2019 experiment)

Detailed Methodology

Task Definition

Input:

  • Urn capacity $T$
  • Urn function $\pi: [0,1] \to [0,1]$
  • Initial conditions $(\psi_0, \tau_0)$

Process: At step $n$, when the proportion of black balls is $\psi_n$, add a black ball with probability $\pi(\psi_n)$ and a white ball with probability $1-\pi(\psi_n)$.

Output:

  • Complete history $\sigma = \{\sigma_n \in \{0,1\}: n \in S\}$
  • Terminal distribution $P(\psi_T = x)$
  • Typical trajectory $\psi(\tau)$
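
The process described above can be simulated in a few lines. The following is a minimal sketch, not the paper's implementation: the function name `simulate_hls`, the encoding of the initial condition as a pair of ball counts (`black0`, `n0`), and the convention that an empty urn has share 0.5 are all assumptions made for illustration.

```python
import random

def simulate_hls(urn_fn, capacity, black0=0, n0=0, seed=None):
    """Simulate one urn history until the urn holds `capacity` balls.

    urn_fn    : maps the current share of black balls in [0, 1] to the
                probability that the next ball is black.
    black0, n0: black balls and total balls initially in the urn
                (an assumed encoding of the initial condition).
    Returns the list of shares psi_n for n = n0, ..., capacity.
    """
    rng = random.Random(seed)
    black, n = black0, n0
    # Assumed convention: an empty urn is assigned share 0.5.
    shares = [black / n if n > 0 else 0.5]
    while n < capacity:
        psi = black / n if n > 0 else 0.5
        black += 1 if rng.random() < urn_fn(psi) else 0
        n += 1
        shares.append(black / n)
    return shares
```

For example, `simulate_hls(lambda p: 3 * p**2 - 2 * p**3, capacity=1000, black0=1, n0=2, seed=0)` returns one sampled trajectory of the share; averaging the last entry over many seeds gives an estimate of the terminal distribution.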

Model Architecture

1. Basic Notation System

Urn History: $\sigma := \{\sigma_n \in \Omega : n \in S\} \in \Omega^S$, where $\Omega = \{0,1\}$ and $S = \{1 \leq n \leq T\}$

Black Ball Proportion (Urn Share): $\psi_n := \frac{1}{n}\sum_{n' \leq n} \sigma_{n'}$

Normalized Total Black Balls: $\phi_n := \frac{1}{T}\sum_{n' \leq n} \sigma_{n'}$

Transition Matrix: $P(\sigma_{n+1} = k \mid \psi_n) = \pi(\psi_n)\mathbb{I}(k=1) + (1-\pi(\psi_n))\mathbb{I}(k=0)$
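
The two normalizations of the black-ball count differ only in the denominator. As a small illustration (the helper name `shares_from_history` is hypothetical), both sequences can be computed from a 0/1 history in one pass:

```python
def shares_from_history(sigma):
    """Return (psi, phi) for a 0/1 history sigma_1, ..., sigma_T.

    psi[n-1] = (1/n) * sum_{n' <= n} sigma_{n'}   (urn share)
    phi[n-1] = (1/T) * sum_{n' <= n} sigma_{n'}   (normalized black count)
    """
    T = len(sigma)
    psi, phi, running = [], [], 0
    for n, s in enumerate(sigma, start=1):
        running += s
        psi.append(running / n)
        phi.append(running / T)
    return psi, phi
```

By construction the two are related by phi_n = (n/T) * psi_n, which is the discrete form of the relation between the trajectory and the share used in the scaling limit.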

2. Stochastic Approximation Equation

From $E(\sigma_{n+1} \mid \psi_n) = \pi(\psi_n)$ and the identity: $\sigma_{n+1} = \psi_n + (n+1)(\psi_{n+1} - \psi_n)$

we derive the core equation: $E(\psi_{n+1} - \psi_n \mid \psi_n) = \frac{\pi(\psi_n) - \psi_n}{n+1}$
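
The intermediate step can be spelled out using only the definition of the urn share; the following is a routine expansion rather than a quotation from the paper.

```latex
\begin{align*}
\psi_{n+1} &= \frac{n\,\psi_n + \sigma_{n+1}}{n+1}
  \quad\Longrightarrow\quad
  \psi_{n+1} - \psi_n = \frac{\sigma_{n+1} - \psi_n}{n+1}, \\
E(\psi_{n+1} - \psi_n \mid \psi_n)
  &= \frac{E(\sigma_{n+1} \mid \psi_n) - \psi_n}{n+1}
   = \frac{\pi(\psi_n) - \psi_n}{n+1}.
\end{align*}
```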

3. Convergence Analysis

The process converges to the set: $C := \{\psi \in [0,1]: \pi(\psi) = \psi\}$

Stability Conditions:

  • Stable Fixed Points: $\pi$ crosses the diagonal from above (downcrossing)
  • Unstable Fixed Points: $\pi$ crosses the diagonal from below (upcrossing)
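
The stability rule above amounts to checking the sign change of $\pi(\psi) - \psi$. A minimal sketch of such a check on a uniform grid is given below; the function name `classify_fixed_points` and the grid-based search are illustrative choices, and the routine only detects interior crossings, not fixed points at the boundary or points where the urn function touches the diagonal without crossing it.

```python
def classify_fixed_points(urn_fn, grid=1000):
    """Locate sign changes of g(psi) = urn_fn(psi) - psi on a uniform grid.

    Returns a list of (location, kind), where kind is "stable" for a
    downcrossing (g goes from + to -) and "unstable" for an upcrossing.
    """
    found = []
    prev_x, prev_g = None, 0.0
    for i in range(grid + 1):
        x = i / grid
        g = urn_fn(x) - x
        if g == 0.0:
            continue  # exact zero: wait until the sign on the far side is known
        if prev_x is not None and prev_g * g < 0:
            # Linear interpolation of the crossing between the two samples.
            root = prev_x + (x - prev_x) * prev_g / (prev_g - g)
            found.append((root, "stable" if prev_g > 0 else "unstable"))
        prev_x, prev_g = x, g
    return found
```

For instance, a hypothetical linear urn function `lambda p: 0.3 + 0.2 * p` has a single downcrossing at 0.375, while `lambda p: 3 * p**2 - 2 * p**3` has a single interior upcrossing at 0.5.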

Thermodynamic Limit Theory

1. Continuous Embedding

Define urn saturation: $\tau_n := n/T$

Scaling limit: $\lim_{T\to\infty} \tau_n =: \tau \in [0,1]$, $\lim_{T\to\infty} \psi_n =: \psi(\tau)$

Trajectory space: $Q := \{\phi \in C([0,1]): \partial_\tau \phi(\tau) \in [0,1], \phi(0) = 0\}$

2. Zero-Cost Trajectories

In the scaling limit, substituting $E(\sigma_{n+1} \mid \psi_n) \to \partial_\tau \phi(\tau)$ yields the homogeneous differential equation: $\partial_\tau \phi(\tau) = \pi(\psi(\tau))$

Converting to the $\psi$ variable and incorporating initial conditions gives the Cauchy problem: $\partial_\tau \psi(\tau) = \frac{\pi(\psi(\tau)) - \psi(\tau)}{\tau}, \quad \psi(\tau_0) = \psi_0$
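
The change of variable relies on the relation $\phi_n = \tau_n \psi_n$, which follows directly from the definitions of the two normalizations. A short sketch of the step:

```latex
\begin{align*}
\phi(\tau) &= \tau\,\psi(\tau)
  && \text{(scaling limit of } \phi_n = \tau_n \psi_n \text{)} \\
\partial_\tau \phi(\tau) &= \psi(\tau) + \tau\,\partial_\tau \psi(\tau) = \pi(\psi(\tau)) \\
\partial_\tau \psi(\tau) &= \frac{\pi(\psi(\tau)) - \psi(\tau)}{\tau}
\end{align*}
```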

Analytical Solution: Introducing the transformed urn function $\Pi(\alpha) := \int \frac{d\alpha}{\pi(\alpha) - \alpha}$

the solution with initial condition $\psi(\tau_0) = \psi_0$ is: $\psi(\tau) = \Pi^{-1}(\Pi(\psi_0) + \log(\tau/\tau_0))$

Terminal point formula: $\psi(1) = \Pi^{-1}(\Pi(\psi_0) - \log(\tau_0))$
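
As a sanity check, the Cauchy problem can be integrated numerically and compared with the closed form in a case where $\Pi$ is elementary. The sketch below assumes a hypothetical linear urn function with fixed point `c` and slope `b`, for which $\Pi(\alpha) = \log|\alpha - c|/(b-1)$ and hence $\psi(\tau) = c + (\psi_0 - c)(\tau/\tau_0)^{b-1}$; the names and parameter values are illustrative only.

```python
import math

def zero_cost_trajectory(urn_fn, psi0, tau0, steps=2000):
    """Integrate d(psi)/d(tau) = (urn_fn(psi) - psi) / tau from tau0 to 1.

    The substitution s = log(tau) turns the equation into the autonomous
    form d(psi)/ds = urn_fn(psi) - psi, solved here with classical RK4.
    Returns (taus, psis).
    """
    f = lambda p: urn_fn(p) - p
    h = -math.log(tau0) / steps
    s, psi = math.log(tau0), psi0
    taus, psis = [tau0], [psi0]
    for _ in range(steps):
        k1 = f(psi)
        k2 = f(psi + 0.5 * h * k1)
        k3 = f(psi + 0.5 * h * k2)
        k4 = f(psi + h * k3)
        psi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
        taus.append(math.exp(s))
        psis.append(psi)
    return taus, psis

# Hypothetical linear urn function with fixed point c and slope b.
c, b = 0.4, 0.5
linear_urn = lambda p: c + b * (p - c)

def linear_closed_form(tau, psi0, tau0):
    """Closed form for the linear case: c + (psi0 - c) * (tau/tau0)**(b-1)."""
    return c + (psi0 - c) * (tau / tau0) ** (b - 1)
```

Calling `zero_cost_trajectory(linear_urn, 0.9, 0.05)` and comparing the last value with `linear_closed_form(1.0, 0.9, 0.05)` shows the two agree to numerical precision for this example.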

3. Urn Function Reconstruction

Core Idea: Reconstruct the urn function from empirical trajectories $\{\tau_n, \psi_n\}$

Fundamental equation: $\Pi(\psi) - \Pi_0^* = \log\tau(\psi)$

From trajectory data we obtain: $\pi(\psi) = \psi + \tau(\psi)\left(\frac{d\tau(\psi)}{d\psi}\right)^{-1}$

This provides a direct method for estimating the urn function from experimental data.
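
A minimal sketch of this estimator is shown below. It uses the equivalent form $\pi(\psi) = \psi + \tau\, d\psi/d\tau$ with central finite differences, and it assumes a smooth, noise-free trajectory; real experimental data would need smoothing or averaging first, which is not addressed here. The function name `reconstruct_urn_function` is hypothetical.

```python
def reconstruct_urn_function(taus, psis):
    """Estimate pi(psi) = psi + tau * d(psi)/d(tau) by central differences.

    taus, psis: samples of a smooth trajectory psi(tau), taus increasing.
    Returns a list of (psi, pi_estimate) at the interior sample points.
    """
    out = []
    for i in range(1, len(taus) - 1):
        dpsi_dtau = (psis[i + 1] - psis[i - 1]) / (taus[i + 1] - taus[i - 1])
        out.append((psis[i], psis[i] + taus[i] * dpsi_dtau))
    return out
```

Applied to a trajectory generated from a known urn function, the returned pairs should lie close to the graph of that function over the range of shares the trajectory visits; values of the share that the trajectory never reaches cannot be reconstructed.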

Lattice Field Theory Formulation

1. Path Integral Form

Ensemble average of any observable: $E(O(\sigma)) = \sum_{\sigma \in \Omega^S} O(\sigma) \frac{\exp(A(\sigma))}{\sum_{\sigma' \in \Omega^S} \exp(A(\sigma'))}$

Action: $A(\sigma) := \sum_{n \in S} L(\sigma_n, \psi_n)$

Lagrangian: $L(\sigma_n, \psi_n) = \sigma_n \log\pi(\psi_n) + (1-\sigma_n)\log(1-\pi(\psi_n))$
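
Since each term of the action is the logarithm of a transition probability, $\exp(A(\sigma))$ is the probability of the history and the normalizing sum equals one. The sketch below checks this by brute force for short histories. The index convention (each added ball is weighted with the share observed just before it is added) and the encoding of the initial condition as ball counts are assumptions of this example.

```python
import math
from itertools import product

def action(sigma, urn_fn, black0, n0):
    """Log-probability of a 0/1 history starting from n0 balls, black0 black.

    Assumed convention: the ball added at each step is weighted with the
    share observed just before it is added.
    """
    black, n, total = black0, n0, 0.0
    for s in sigma:
        p = urn_fn(black / n)
        total += math.log(p) if s == 1 else math.log(1.0 - p)
        black += s
        n += 1
    return total

def partition_sum(urn_fn, steps, black0=1, n0=2):
    """Sum of exp(action) over all 2**steps histories (brute force)."""
    return sum(math.exp(action(sig, urn_fn, black0, n0))
               for sig in product((0, 1), repeat=steps))
```

The brute-force sum grows as 2**steps, so it is only useful as a check for very short histories; the urn function must stay strictly between 0 and 1 on the shares visited, otherwise the logarithm is undefined.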

2. Scaling Limit

Scaled action: $\Phi(\phi) := \int_0^1 d\tau \, \mathcal{L}(\partial_\tau \phi(\tau), \pi(\psi(\tau)))$

Scale-invariant function: $\mathcal{L}(\alpha, \beta) := \alpha\log\beta + (1-\alpha)\log(1-\beta)$

3. Large Deviations Principle

Entropy Density: $\varphi(E^*) := \lim_{T\to\infty} \frac{1}{T}\log P(\sigma \in E)$

Variational Representation: $\varphi(E^*) = \inf_{\phi \in Q(E^*)} \{\Phi(\phi) - \Phi_0^*(\phi)\}$

where $\Phi_0^*$ is the Mogulskii rate function for i.i.d. processes: $\Phi_0^*(\phi) := \int_0^1 d\tau \, \mathcal{L}(\partial_\tau \phi(\tau), \partial_\tau \phi(\tau))$
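
The difference $\Phi(\phi) - \Phi_0^*(\phi)$ can be evaluated numerically on a sampled trajectory. With the definition of $\mathcal{L}$ given above, the integrand equals minus the Kullback-Leibler divergence between Bernoulli laws with parameters $\partial_\tau\phi$ and $\pi(\psi)$, so it is non-positive and vanishes exactly on zero-cost trajectories. The sketch below is an illustration under assumptions: it integrates from the first sample time rather than from 0, uses a midpoint rule, and the names `lagrangian` and `action_gap` are hypothetical.

```python
import math

def lagrangian(alpha, beta):
    """alpha*log(beta) + (1-alpha)*log(1-beta), with the convention 0*log(.) = 0."""
    out = 0.0
    if alpha > 0.0:
        out += alpha * math.log(beta)
    if alpha < 1.0:
        out += (1.0 - alpha) * math.log(1.0 - beta)
    return out

def action_gap(taus, phis, urn_fn):
    """Midpoint-rule estimate of Phi(phi) - Phi_0*(phi) on a sampled path.

    taus, phis: samples of phi(tau), taus increasing with taus[0] > 0.
    The slope on each cell approximates d(phi)/d(tau); psi = phi / tau.
    """
    total = 0.0
    for i in range(len(taus) - 1):
        dt = taus[i + 1] - taus[i]
        slope = (phis[i + 1] - phis[i]) / dt
        tau_mid = 0.5 * (taus[i] + taus[i + 1])
        psi_mid = 0.5 * (phis[i] + phis[i + 1]) / tau_mid
        beta = urn_fn(psi_mid)
        total += dt * (lagrangian(slope, beta) - lagrangian(slope, slope))
    return total
```

On a path that solves $\partial_\tau\phi = \pi(\psi)$ the result is zero up to discretization error, while any other admissible path gives a strictly negative value.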

4. Proof Framework

  1. Measure Transformation: Convert from HLS measure to i.i.d. measure
  2. Varadhan's Lemma: Establish relationship between action and entropy density
  3. Mogulskii's Theorem: Determine rate function for i.i.d. processes

Technical Innovations

  1. Unified Framework: Connect HLS model with lattice field theory, providing unified mathematical language
  2. Explicit Solutions: Provide closed-form solutions for zero-cost trajectories via transformed urn function $\Pi$
  3. Inverse Problem Method: Reconstruct urn functions from empirical trajectories, connecting microscopic rules with macroscopic dynamics
  4. Nonlinear Equations:
    • Moment generating function equation: $\pi(\partial_\beta \zeta(\beta)) = \frac{\exp(\zeta(\beta))-1}{\exp(\beta)-1}$
    • Entropy density equation: $\pi(x) = \frac{\exp(x\partial_x\varphi(x)-\varphi(x))-1}{\exp(x)-1}$
  5. Time-Dependent Lagrangian: Because $\psi_n$ is an average rather than a sum, the Lagrangian explicitly depends on "time" $\tau$

Experimental Setup

This paper is primarily a theoretical review, but it presents several application cases:

Case 1: Arthur's Increasing Returns Theory (IRT)

Model Description:

  • Two competing products
  • Each new customer consults an odd number (at least 3) of previous customers
  • Selects the product chosen by the majority in the sample

Mathematical Characterization: The model can be reduced to an HLS model with a specific urn function form (see Figure 5)
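
One natural candidate for such an urn function is the probability that a majority of the consulted customers chose a given product. The sketch below assumes the consulted customers are drawn independently (sampling with replacement), so the count is binomial; whether this matches the exact form used in the paper's Figure 5 is not confirmed here, and the function name is hypothetical.

```python
from math import comb

def majority_urn_function(psi, sample_size=3):
    """Probability that a majority of `sample_size` independent draws is black.

    Assumes each consulted customer is black with probability psi,
    independently of the others, and that sample_size is odd.
    """
    if sample_size % 2 == 0:
        raise ValueError("sample_size must be odd")
    k = sample_size
    return sum(comb(k, j) * psi**j * (1 - psi) ** (k - j)
               for j in range(k // 2 + 1, k + 1))
```

For a sample of three this reduces to $3\psi^2 - 2\psi^3$, which lies below the diagonal for shares under one half and above it for shares over one half, so the interior fixed point at one half is an upcrossing and the share is pushed toward 0 or 1.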

Theoretical Predictions:

  • Almost surely reaches monopoly (some product's share → 1)
  • Path dependence: initial conditions determine the winner
  • Lock-in phenomenon

Case 2: van de Rijt Social Influence Experiment (2019)

Experimental Design:

  • Participants answer questions, see statistics of previous answers
  • Two experimental groups:
    • Left panel: 530 people, initial counts both zero
    • Right panel: 3500 people, option A artificially advantaged (110 vs 10, $\psi_0 \approx 91.5\%$, $\tau_0 \approx 3.4\%$)

Observed Results (Figure 8):

  • Left panel: trajectories highly degenerate, multiple questions converge to different endpoints
  • Right panel: late start eliminates degeneracy, trajectories more concentrated

Theoretical Explanation: From the formula $\psi(1) = \Pi^{-1}(\Pi(\psi_0) - \log\tau_0)$ we see:

  • $\tau_0 \to 0$ (microscopic start): $\log\tau_0 \to -\infty$, endpoint extremely sensitive to initial conditions
  • $\tau_0 > 0$ (macroscopic start): endpoint clearly determined by initial conditions
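
The role of $\tau_0$ can be made concrete with a linearized example. This is a sketch under the assumption that the urn function is approximated near a fixed point $c$ by a line of slope $b$; it is valid only while the trajectory stays in the region where that approximation holds.

```latex
\begin{align*}
\pi(\psi) &\approx c + b\,(\psi - c), \qquad
\Pi(\alpha) = \frac{\log|\alpha - c|}{b - 1}, \\
\psi(1) &= \Pi^{-1}\bigl(\Pi(\psi_0) - \log\tau_0\bigr)
         = c + (\psi_0 - c)\,\tau_0^{\,1-b}.
\end{align*}
```

For an upcrossing ($b > 1$) the factor $\tau_0^{1-b}$ grows without bound as $\tau_0 \to 0$, so arbitrarily small differences in $\psi_0 - c$ are amplified; for any fixed $\tau_0 > 0$ the factor is finite and the endpoint is a well-defined function of the initial condition.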

Case 3: Gelastopoulos et al. Experiment (2024)

Figure 9 displays urn functions reconstructed from actual experimental data, validating the effectiveness of the inverse problem method in Section 2.4.

Experimental Results

Main Theoretical Results

  1. Strong Convergence Theorem:
    • Process converges to fixed point set $C = \{\psi: \pi(\psi) = \psi\}$
    • Only downcrossing points are stable
  2. Zero-Cost Trajectories:
    • Explicit solution: $\psi(\tau) = \Pi^{-1}(\Pi(\psi_0) + \log(\tau/\tau_0))$
    • For any $\tau_0 > 0$, scaling limit is non-degenerate
  3. Large Deviations Principle:
    • Rate function: $I(\phi) = \Phi(\phi) - \Phi_0^*(\phi)$
    • Satisfies complete sample path LDP

Application Verification

IRT Model (Figure 5):

  • Theoretical trajectories match simulation data from Dosi et al. 2018
  • Successfully predicts monopoly phenomenon

Social Influence Experiment (Figure 8):

  • Quantitative explanation of initial condition effects
  • Mechanism of degeneracy elimination by late start is clear

Urn Function Reconstruction (Figure 9):

  • Successfully estimate urn function from experimental data
  • Validates practical utility of inverse problem method

Theoretical Findings

  1. Critical Role of Saturation:
    • $\tau_0 = 0$: Complete degeneracy, initial conditions cannot predict endpoint
    • $\tau_0 > 0$: Degeneracy eliminated, trajectory determined
  2. Time Dependence:
    • HLS model's Lagrangian explicitly depends on $\tau$
    • Key distinction from standard lattice field theory
  3. Unresolved Problems:
    • Exact solutions to nonlinear equations (38) and (42)
    • Currently rely only on perturbation theory and numerical methods

Related Work

1. Urn Model Family

Linear Urns:

  • Friedman urn
  • Bagchi-Pal model
  • Elephant Random Walk

Nonlinear Urns:

  • Arthur's IRT model
  • Attachment models
  • KKGW model

2. Mathematical Theory

Stochastic Approximation:

  • Pemantle (2007): Survey of reinforced random processes
  • Gouet (1993): Martingale functional central limit theorem

Large Deviations Theory:

  • Dembo & Zeitouni (1998): Foundational theory
  • Bryc et al. (2009): Large deviations for random trees
  • Franchini (2017): Large deviations for general urn functions

Analytic Combinatorics:

  • Flajolet et al. (2005, 2006): Analytic urns
  • Morcrette & Mahmoud (2012): Exactly solvable models

3. Physical Applications

Lattice Field Theory:

  • Jack (2019, 2020): Growing cluster models
  • Klymko et al. (2017, 2018): Trajectory umbrella sampling

Statistical Physics:

  • Self-avoiding walk problem
  • Wiener sausage problem
  • Rosenstock trapping model

4. Interdisciplinary Applications

Economics:

  • Arthur (1989, 1994): Path dependence and lock-in
  • Dosi et al. (1994, 2018): Technology dynamics
  • Gottfried & Grosskinsky (2024): Wages and capital returns

Social Sciences:

  • van de Rijt (2019): Self-correcting dynamics of social influence
  • Gelastopoulos et al. (2024): Marginal majority effect

Biology:

  • Khanin & Khanin (2001): Neuronal polarization

Conclusions and Discussion

Main Conclusions

  1. HLS model is a paradigmatic model for path-dependent stochastic processes, unifying important models across multiple fields
  2. Complete theory under thermodynamic limits:
    • Explicit solutions for zero-cost trajectories
    • Sample path large deviations principles
    • Lattice field theory formulation
  3. Inverse Problem Method: Reconstruct urn functions from empirical trajectories, connecting theory with experiments
  4. Challenge of Nonlinear Equations: Moment generating function and entropy density equations still require exact solutions

Limitations

  1. Absence of Analytical Solutions:
    • Equations (38) and (42) can be exactly solved only in linear cases
    • Nonlinear cases rely on perturbation theory and numerical methods
  2. Theoretical Assumptions:
    • Urn function requires Hölder continuity
    • Fixed point set $C$ must consist of finitely many isolated points
  3. Experimental Verification:
    • Primarily relies on others' experimental data
    • Lacks systematic experimental design guidance
  4. Computational Complexity:
    • Computation of transformation function $\Pi$ may involve singular integrals
    • Numerical stability of inverse problem not sufficiently discussed

Future Directions

  1. Analytical Progress:
    • Seek exact solutions for special classes of urn functions
    • Develop systematic perturbation expansion methods
  2. Numerical Methods:
    • Efficient numerical integration algorithms
    • Robust estimation methods for inverse problems
  3. Application Extensions:
    • Multi-color urn models
    • Time-dependent urn functions
    • Urn models on networks
  4. Experimental Design:
    • Theory-based optimal experimental design
    • Active learning for urn functions

In-Depth Evaluation

Strengths

  1. Theoretical Completeness:
    • Complete derivation from basic definitions to large deviations principles
    • Lattice field theory framework provides unified language
    • Existence and uniqueness of explicit solutions
  2. Interdisciplinary Vision:
    • Connects probability theory, statistical physics, economics, social sciences
    • Demonstrates broad applicability of the model
    • Rich practical application cases
  3. Methodological Innovation:
    • Inverse problem method is novel and practical
    • Introduction of transformed urn function $\Pi$ is elegant
    • Interpretation of saturation $\tau$ as "time" is profound
  4. Clear Writing:
    • Consistent notation system
    • Detailed derivation steps
    • Intuitive and effective figures
  5. Theory-Experiment Integration:
    • Quantitative explanation of the van de Rijt experiment is convincing
    • The theory accurately predicts the elimination of degeneracy seen in Figure 8

Weaknesses

  1. Prominent Unresolved Problems:
    • Core nonlinear equations lack analytical solutions
    • Limits theoretical completeness and practical utility
  2. Insufficient Numerical Methods:
    • Lack of concrete numerical algorithm descriptions
    • Error analysis and stability of inverse problem not discussed
    • No reproducible code provided
  3. Limited Experimental Verification:
    • Primarily relies on literature data
    • Lacks original experimental design
    • Statistical tests for model fitting insufficient
  4. Technical Details:
    • Technical conditions for continuous embedding (Hölder continuity) insufficiently discussed
    • Verification conditions for Varadhan's lemma (continuity) glossed over
    • Rigorous treatment of boundary cases ($\tau_0 = 0$) missing
  5. Application Guidance:
    • Lacks guidance for practitioners on urn function selection
    • Statistical methods for model parameter estimation incomplete
    • Quantitative assessment of prediction accuracy missing

Impact

  1. Academic Contribution:
    • Provides authoritative review of HLS model
    • Lattice field theory formulation opens new research directions
    • Inverse problem method has methodological value
  2. Practical Value:
    • Theoretical foundation for social science experimental design
    • Modeling technology adoption and market dynamics
    • Applications to neuroscience and biological processes
  3. Reproducibility:
    • Theoretical derivations are detailed and reproducible
    • However, no code or data are provided
    • Readers must develop their own numerical implementation
  4. Research Inspiration:
    • Nonlinear equation solving is clear open problem
    • Multi-color extension has clear path
    • Network version worth exploring

Applicable Scenarios

  1. Theoretical Research:
    • Stochastic process theory
    • Large deviations theory
    • Lattice field theory applications
  2. Social Sciences:
    • Social influence and conformity behavior
    • Technology adoption and innovation diffusion
    • Market share competition
  3. Economics:
    • Increasing returns and path dependence
    • Lock-in effects and standards competition
    • Network effects
  4. Biological Systems:
    • Cell polarization
    • Collective decision-making
    • Evolutionary dynamics
  5. Physical Applications:
    • Growth processes
    • Aggregation models
    • Self-organization phenomena

Selected References

Foundational Literature:

  1. Hill, Lane, Sudderth (1980): A strong law for some generalized urn processes
  2. Arthur, Ermoliev, Kaniovski (1983): A generalized urn problem and its applications
  3. Franchini (2017): Large deviations for generalized Polya urns with arbitrary urn function

Theoretical Tools:

  4. Dembo & Zeitouni (1998): Large Deviations Techniques and Applications
  5. Pemantle (2007): A survey of random processes with reinforcement

Application Cases:

  6. Arthur (1989, 1994): Increasing Returns and Path Dependence
  7. van de Rijt (2019): Self-correcting dynamics in social influence processes
  8. Gelastopoulos et al. (2024): The marginal majority effect


Overall Assessment: This is a high-quality review paper providing a complete theoretical framework for the HLS urn model from fundamentals to frontiers. The lattice field theory formulation and inverse problem method are important innovations, and interdisciplinary applications demonstrate the model's broad value. The main limitation is the lack of analytical solutions to core nonlinear equations, with numerical methods and experimental verification needing strengthening. For researchers in probability theory, statistical physics, and interdisciplinary studies, this is essential reading.