The Urn of Hill, Lane and Sudderth

Franchini

We review some facts, properties and applications of the urn of Hill, Lane and Sudderth, a paradigmatic model of a stochastic process with memory. The urn evolves as follows: given an urn of fixed capacity, at each step a new ball, black or white, is added with a probability that is a function (the urn function) of the current fraction of black balls. The process runs until capacity is reached.

Basic Information

Paper ID: 2506.20826
Title: The Urn of Hill, Lane and Sudderth
Author: Simone Franchini (Sapienza Università di Roma)
Classification: math.PR (Probability Theory)
Publication Date: November 12, 2025 (arXiv v2)
Paper Link: https://arxiv.org/abs/2506.20826

Abstract

This paper provides a systematic review of the properties and applications of the Hill, Lane, and Sudderth (HLS) urn model. This is a paradigmatic model of a stochastic process with memory: an urn of given capacity receives at each step either a black or white ball, with probability determined by the proportion of black balls (the urn function), and the process continues until capacity is reached.

Research Background and Motivation

1. Research Problem

The HLS urn model is a central tool for studying path-dependent stochastic processes, used to describe dynamic systems with reinforcement effects. The model was independently discovered by three groups of researchers in the 1980s:

Hill, Lane, and Sudderth (1980)
Blum and Brennan (1980)
Arthur, Ermoliev, and Kaniovski (1983)

2. Problem Significance

The model possesses broad interdisciplinary applications:

Mathematics:

Stochastic approximation theory
Large deviations theory
Lattice field theory

Social Sciences:

Arthur's Increasing Returns Theory
Technology lock-in phenomena
Social influence processes

Physics and Biology:

Self-avoiding walk problems
Neuronal polarization models
Wiener sausage problem

3. Limitations of Existing Research

Although basic convergence properties of the HLS model have been studied, the following questions remain incompletely resolved:

Exact integral evaluation of moment generating functions for nonlinear urn functions
Analytical solutions to nonlinear differential equations for entropy density
Complete large deviations principles under thermodynamic limits

4. Research Motivation

This paper aims to provide a unified review of the HLS model, with particular focus on:

Scaling behavior under thermodynamic limits
Large deviations principles established through lattice field theory framework
Methods for reconstructing urn functions from empirical trajectories

Core Contributions

Systematic Review: Integration of fundamental properties, convergence theorems, and application scenarios of the HLS model
Thermodynamic Limit Theory:
- Establishes a continuous embedding framework
- Derives explicit solutions for zero-cost trajectories
- Provides a method for reconstructing urn functions from trajectories
Lattice Field Theory Formulation:
- Reformulates the HLS process as a path integral
- Establishes the scaling limit of the action
- Proves sample-path large deviations principles via Varadhan's lemma and Mogulskii's theorem
Nonlinear Equations:
- Nonlinear equation for moment generating functions (Eq. 38)
- Nonlinear differential equation for entropy density (Eq. 42)
Application Demonstrations:
- Mathematical characterization of Arthur's Increasing Returns Theory
- Urn function reconstruction from actual experimental data (van de Rijt 2019 experiment)

Detailed Methodology

Task Definition

Input:

Urn capacity $T$
Urn function $\pi: [0,1] \to [0,1]$
Initial conditions $(\psi_0, \tau_0)$

Process: At step $n$, when the proportion of black balls is $\psi_n$, add a black ball with probability $\pi(\psi_n)$ and a white ball with probability $1-\pi(\psi_n)$.

Output:

Complete history $\sigma = \{\sigma_n \in \{0,1\}: n \in S\}$
Terminal distribution $P(\psi_T = x)$
Typical trajectory $\psi(\tau)$

Model Architecture

1. Basic Notation System

Urn History: $\sigma := \{\sigma_n \in \Omega : n \in S\} \in \Omega^S$, where $\Omega = \{0,1\}$ and $S = \{n : 1 \leq n \leq T\}$

Black Ball Proportion (Urn Share): $\psi_n := \frac{1}{n}\sum_{n' \leq n} \sigma_{n'}$

Normalized Total Black Balls: $\phi_n := \frac{1}{T}\sum_{n' \leq n} \sigma_{n'}$

Transition Matrix: $P(\sigma_{n+1} = k | \psi_n) = \pi(\psi_n)\mathbb{I}(k=1) + (1-\pi(\psi_n))\mathbb{I}(k=0)$
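The update rule above can be sketched as a short simulation. This is a minimal illustration, not code from the paper: the function name `simulate_urn`, the initial composition of one black and one white ball, and the example urn function are all assumptions made for the sketch.

```python
import random

def simulate_urn(pi, T, black0=1, white0=1, seed=None):
    """Simulate one urn history up to capacity T.

    pi             -- urn function mapping the black share in [0, 1] to a probability
    T              -- urn capacity (total number of balls at the end)
    black0, white0 -- assumed initial composition (not specified in the text)
    Returns the list of shares psi_n from the initial urn up to n = T.
    """
    rng = random.Random(seed)
    black, n = black0, black0 + white0
    shares = [black / n]
    while n < T:
        # sigma_{n+1} = 1 (black) with probability pi(psi_n), else 0 (white)
        sigma = 1 if rng.random() < pi(black / n) else 0
        black += sigma
        n += 1
        shares.append(black / n)
    return shares

# Hypothetical urn function with a single fixed point at 0.5:
# pi(x) = x gives x = 0.5, and pi crosses the diagonal from above there.
shares = simulate_urn(lambda x: 0.75 - 0.5 * x, T=20000, seed=0)
print(round(shares[-1], 3))
```

With this choice of urn function the final share should land close to 0.5, in line with the convergence statement for downcrossing fixed points.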

2. Stochastic Approximation Equation

From $E(\sigma_{n+1}|\psi_n) = \pi(\psi_n)$ and the identity: $\sigma_{n+1} = \psi_n + (n+1)(\psi_{n+1} - \psi_n)$

we derive the core equation: $E(\psi_{n+1} - \psi_n | \psi_n) = \frac{\pi(\psi_n) - \psi_n}{n+1}$
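For readers who want the intermediate step, the identity and the core equation both follow from counting black balls before and after one draw. The block below is a worked restatement using only the definitions of $\psi_n$ and $\sigma_n$ given above:

```latex
\begin{aligned}
(n+1)\,\psi_{n+1} &= n\,\psi_n + \sigma_{n+1}
  \;\Longrightarrow\;
  \sigma_{n+1} = \psi_n + (n+1)\,(\psi_{n+1} - \psi_n), \\
\pi(\psi_n) &= E(\sigma_{n+1} \mid \psi_n)
  = \psi_n + (n+1)\,E(\psi_{n+1} - \psi_n \mid \psi_n).
\end{aligned}
```

Solving the second line for the conditional increment gives the $1/(n+1)$ step size typical of stochastic approximation schemes.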

3. Convergence Analysis

The process converges to the set: $C := \{\psi \in [0,1]: \pi(\psi) = \psi\}$

Stability Conditions:

Stable Fixed Points: $\pi$ crosses the diagonal from above (downcrossing)
Unstable Fixed Points: $\pi$ crosses the diagonal from below (upcrossing)
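The crossing criterion can be checked numerically for a given urn function. The sketch below is only an illustration under stated assumptions: `classify_fixed_points` is a hypothetical helper, it looks only for interior sign changes of $\pi(x) - x$ on a grid (tangencies and boundary fixed points at 0 or 1 are not detected), and both example urn functions are made up for the demonstration.

```python
def classify_fixed_points(pi, cells=1000, tol=1e-12):
    """Find interior solutions of pi(x) = x via sign changes of g(x) = pi(x) - x,
    refine each by bisection, and label the crossing direction."""
    g = lambda x: pi(x) - x
    # Grid points are offset by half a step so that simple values such as 0.5
    # do not fall exactly on a grid point; exact hits and tangencies are skipped.
    edges = [(i + 0.5) / cells for i in range(cells)]
    found = []
    for a, b in zip(edges, edges[1:]):
        ga, gb = g(a), g(b)
        if ga * gb >= 0:
            continue
        # g going from + to - means pi crosses the diagonal from above
        kind = "stable (downcrossing)" if ga > 0 else "unstable (upcrossing)"
        lo, hi = a, b
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0:
                hi = mid
            else:
                lo = mid
        found.append((0.5 * (lo + hi), kind))
    return found

# Hypothetical examples: an S-shaped function and a decreasing linear one
print(classify_fixed_points(lambda x: 3 * x**2 - 2 * x**3))
print(classify_fixed_points(lambda x: 0.75 - 0.5 * x))
```

Both examples have an interior fixed point at 0.5, but the S-shaped function crosses the diagonal from below there while the decreasing one crosses from above.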

Thermodynamic Limit Theory

1. Continuous Embedding

Define urn saturation: $\tau_n := n/T$

Scaling limit: $\lim_{T\to\infty} \tau_n =: \tau \in [0,1]$, $\lim_{T\to\infty} \psi_n =: \psi(\tau)$

Trajectory space: $Q := \{\phi \in C([0,1]): \partial_\tau \phi(\tau) \in [0,1], \phi(0) = 0\}$

2. Zero-Cost Trajectories

In the scaling limit, substituting $E(\sigma_{n+1}|\psi_n) \to \partial_\tau \phi(\tau)$ yields the homogeneous differential equation: $\partial_\tau \phi(\tau) = \pi(\psi(\tau))$, where $\phi(\tau) = \tau\,\psi(\tau)$ because $\phi_n = \tau_n \psi_n$

Converting to the $\psi$ variable and incorporating initial conditions gives the Cauchy problem: $\partial_\tau \psi(\tau) = \frac{\pi(\psi(\tau)) - \psi(\tau)}{\tau}, \quad \psi(\tau_0) = \psi_0$

Analytical Solution: Introducing the transformed urn function $\Pi(\alpha) := \int^{\alpha} \frac{d\alpha'}{\pi(\alpha') - \alpha'}$

the solution is: $\psi(\tau) = \Pi^{-1}(\Pi(\psi_0) + \log(\tau/\tau_0))$

Terminal point formula: $\psi(1) = \Pi^{-1}(\Pi(\psi_0) - \log(\tau_0))$
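As a sanity check, the closed form can be compared with a direct numerical integration of the Cauchy problem. The sketch assumes a hypothetical linear urn function $\pi(x) = A + Bx$ with $B < 1$, for which $\Pi$ has an elementary antiderivative, and fixes the integration constant by $\psi(\tau_0) = \psi_0$, so that $\Pi(\psi(\tau)) = \Pi(\psi_0) + \log(\tau/\tau_0)$, which reduces to the terminal point formula at $\tau = 1$. All names and parameter values are illustrative.

```python
import math

A, B = 0.2, 0.5                      # hypothetical linear urn function
pi = lambda x: A + B * x             # fixed point at A / (1 - B) = 0.4

def Pi(x):
    # Antiderivative of 1 / (pi(x) - x) = 1 / (A - (1 - B) x)
    return -math.log(abs(A - (1 - B) * x)) / (1 - B)

def Pi_inv(y, sign):
    # Inverse of Pi on the branch where A - (1 - B) x has the given sign
    return (A - sign * math.exp(-(1 - B) * y)) / (1 - B)

def psi_closed(tau, psi0, tau0):
    # psi0 must differ from the fixed point, where Pi diverges
    sign = 1.0 if A - (1 - B) * psi0 > 0 else -1.0
    return Pi_inv(Pi(psi0) + math.log(tau / tau0), sign)

def psi_numeric(tau, psi0, tau0, steps=2000):
    # Classical RK4 for d(psi)/ds = pi(psi) - psi in the variable s = log(tau)
    f = lambda x: pi(x) - x
    h = (math.log(tau) - math.log(tau0)) / steps
    x = psi0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

print(psi_closed(1.0, 0.9, 0.05), psi_numeric(1.0, 0.9, 0.05))
```

Working in $s = \log\tau$ turns the equation into the autonomous form $d\psi/ds = \pi(\psi) - \psi$, which is why $\log\tau$ appears in the solution.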

3. Urn Function Reconstruction

Core Idea: Reconstruct the urn function from empirical trajectories $\{\tau_n, \psi_n\}$

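Fundamental equation: $\Pi(\psi) - \Pi_0^* = \log\tau(\psi)$, where $\Pi_0^* := \Pi(\psi_0) - \log\tau_0$ and $\tau(\psi)$ is the inverse function of the trajectory $\psi(\tau)$

Differentiating with respect to $\psi$ and using $\Pi'(\psi) = 1/(\pi(\psi) - \psi)$ gives: $\pi(\psi) = \psi + \tau(\psi)\left(\frac{d\tau(\psi)}{d\psi}\right)^{-1}$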

This provides a direct method for estimating the urn function from experimental data.
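A minimal sketch of how such an estimate might look on sampled data is given below. It uses the equivalent form $\pi(\psi) = \psi + \tau\, d\psi/d\tau$ with central finite differences; the helper name `reconstruct_urn_function` and the synthetic test trajectory (a noise-free zero-cost path of a hypothetical linear urn function) are assumptions for illustration. Real trajectories are noisy and would likely need smoothing before differentiation, which this sketch does not attempt.

```python
def reconstruct_urn_function(taus, psis):
    """Return (psi, estimated pi(psi)) pairs from a sampled trajectory,
    using pi(psi) = psi + tau * d(psi)/d(tau) and central differences."""
    pairs = []
    for k in range(1, len(taus) - 1):
        dpsi_dtau = (psis[k + 1] - psis[k - 1]) / (taus[k + 1] - taus[k - 1])
        pairs.append((psis[k], psis[k] + taus[k] * dpsi_dtau))
    return pairs

# Synthetic check: zero-cost trajectory of the hypothetical pi(x) = A + B x
A, B, psi0, tau0 = 0.2, 0.5, 0.9, 0.05
fixed = A / (1 - B)
taus = [tau0 + (1 - tau0) * k / 2000 for k in range(2001)]
psis = [fixed + (psi0 - fixed) * (t / tau0) ** (-(1 - B)) for t in taus]

pairs = reconstruct_urn_function(taus, psis)
max_err = max(abs(p - (A + B * x)) for x, p in pairs)
print(max_err)
```

On this noise-free input the estimates should agree with the true urn function up to the discretization error of the finite differences.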

Lattice Field Theory Formulation

1. Path Integral Form

Ensemble average of any observable: $E(O(\sigma)) = \sum_{\sigma \in \Omega^S} O(\sigma) \frac{\exp(A(\sigma))}{\sum_{\sigma' \in \Omega^S} \exp(A(\sigma'))}$

Action: $A(\sigma) := \sum_{n \in S} L(\sigma_n, \psi_n)$

Lagrangian: $L(\sigma_n, \psi_n) = \sigma_n \log\pi(\psi_n) + (1-\sigma_n)\log(1-\pi(\psi_n))$
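To make the path-integral form concrete, the action of a history is its log-probability, accumulated one draw at a time. The sketch below assumes that each draw is scored against the urn share just before that draw and that the urn starts from one black and one white ball; both conventions, and the example urn function, are assumptions of the sketch rather than details taken from the paper.

```python
import math
from itertools import product

def action(sigma, pi, black0=1, white0=1):
    """Sum of Lagrangian terms sigma*log(pi) + (1 - sigma)*log(1 - pi) along a
    history sigma (a sequence of 0/1 draws), with pi evaluated at the share
    held just before each draw (assumed indexing convention)."""
    black, n, total = black0, black0 + white0, 0.0
    for s in sigma:
        p = pi(black / n)
        total += math.log(p) if s else math.log(1.0 - p)
        black += s
        n += 1
    return total

# Hypothetical urn function bounded away from 0 and 1, so every log is finite
pi = lambda x: 0.1 + 0.8 * x

# Sum of exp(action) over all 2^8 histories of length 8
Z = sum(math.exp(action(sig, pi)) for sig in product((0, 1), repeat=8))
print(Z)
```

Because each term is the log of a conditional probability, the weights $\exp(A(\sigma))$ already sum to one over all histories under these conventions, so the normalizing sum in the ensemble average is trivial here.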

2. Scaling Limit

Scaled action: $\Phi(\phi) := \int_0^1 d\tau \, \mathcal{L}(\partial_\tau \phi(\tau), \pi(\psi(\tau)))$

Scale-invariant function: $\mathcal{L}(\alpha, \beta) := \alpha\log\beta + (1-\alpha)\log(1-\beta)$

3. Large Deviations Principle

Entropy Density: $\varphi(E^*) := \lim_{T\to\infty} \frac{1}{T}\log P(\sigma \in E)$

Variational Representation: $\varphi(E^*) = \inf_{\phi \in Q(E^*)} \{\Phi(\phi) - \Phi_0^*(\phi)\}$

where $\Phi_0^*$ is the Mogulskii rate function for i.i.d. processes: $\Phi_0^*(\phi) := \int_0^1 d\tau \, \mathcal{L}(\partial_\tau \phi(\tau), \partial_\tau \phi(\tau))$
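As a short worked step, the integrand of $\Phi(\phi) - \Phi_0^*(\phi)$ can be written pointwise with $\alpha = \partial_\tau\phi(\tau)$ and $\beta = \pi(\psi(\tau))$. The bracket below is the relative entropy between two Bernoulli laws; this is a standard identity stated here for orientation, not a formula quoted from the paper:

```latex
\mathcal{L}(\alpha,\beta) - \mathcal{L}(\alpha,\alpha)
  = -\left[\alpha\log\frac{\alpha}{\beta}
      + (1-\alpha)\log\frac{1-\alpha}{1-\beta}\right] \le 0,
\qquad \text{with equality iff } \alpha = \beta .
```

The difference therefore vanishes exactly when $\partial_\tau\phi(\tau) = \pi(\psi(\tau))$ along the whole path, which is the zero-cost trajectory equation.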

4. Proof Framework

Measure Transformation: Convert from HLS measure to i.i.d. measure
Varadhan's Lemma: Establish relationship between action and entropy density
Mogulskii's Theorem: Determine rate function for i.i.d. processes

Technical Innovations

Unified Framework: Connect HLS model with lattice field theory, providing unified mathematical language
Explicit Solutions: Provide closed-form solutions for zero-cost trajectories via transformed urn function $\Pi$
Inverse Problem Method: Reconstruct urn functions from empirical trajectories, connecting microscopic rules with macroscopic dynamics
Nonlinear Equations:
- Moment generating function equation: $\pi(\partial_\beta \zeta(\beta)) = \frac{\exp(\zeta(\beta))-1}{\exp(\beta)-1}$
- Entropy density equation: $\pi(x) = \frac{\exp(x\partial_x\varphi(x)-\varphi(x))-1}{\exp(x)-1}$
Time-Dependent Lagrangian: Due to $\psi_n$ being an average rather than a sum, the Lagrangian explicitly depends on "time" $\tau$

Experimental Setup

This paper is primarily a theoretical review, but demonstrates multiple application cases:

Case 1: Arthur's Increasing Returns Theory (IRT)

Model Description:

Two competing products
Each new customer consults an odd number (at least 3) of previous customers
Selects the product chosen by the majority in the sample

Mathematical Characterization: The model can be reduced to an HLS model with a specific urn function form (see Figure 5)
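One way to write down an urn function for the majority rule is sketched below. It assumes that the consulted customers are drawn independently (with replacement), so the number of sampled customers who chose a given product is binomial; the paper's exact functional form is not reproduced here, and `majority_urn_function` is a hypothetical helper name.

```python
from math import comb

def majority_urn_function(m):
    """Urn function for 'follow the majority of m sampled predecessors',
    assuming independent sampling, i.e. a Binomial(m, psi) count."""
    if m < 3 or m % 2 == 0:
        raise ValueError("m must be an odd number, at least 3")

    def pi(psi):
        # Probability that more than half of the m sampled customers chose the product
        return sum(comb(m, k) * psi**k * (1 - psi) ** (m - k)
                   for k in range(m // 2 + 1, m + 1))

    return pi

pi3 = majority_urn_function(3)       # equals 3 x^2 - 2 x^3
for x in (0.3, 0.5, 0.6):
    print(x, pi3(x))
```

For $m = 3$ this gives $\pi(\psi) = 3\psi^2 - 2\psi^3$, which lies below the diagonal for $\psi < 1/2$ and above it for $\psi > 1/2$, so under this assumption the interior fixed point is an upcrossing and the share is pushed toward 0 or 1.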

Theoretical Predictions:

Almost surely reaches monopoly (some product's share → 1)
Path dependence: initial conditions determine the winner
Lock-in phenomenon

Case 2: Social Influence Experiment (van de Rijt 2019)

Experimental Design:

Participants answer questions, see statistics of previous answers
Two experimental groups:
- Left panel: 530 people, initial counts both zero
- Right panel: 3500 people, option A artificially advantaged (110 vs 10, $\psi_0 \approx 91.7\%$, $\tau_0 \approx 3.4\%$)

Observed Results (Figure 8):

Left panel: trajectories highly degenerate, multiple questions converge to different endpoints
Right panel: late start eliminates degeneracy, trajectories more concentrated

Theoretical Explanation: From the formula $\psi(1) = \Pi^{-1}(\Pi(\psi_0) - \log\tau_0)$ we see:

$\tau_0 \to 0$ (microscopic start): $-\log\tau_0 \to \infty$, so the endpoint is extremely sensitive to the initial conditions
$\tau_0 > 0$ (macroscopic start): $-\log\tau_0$ is finite, and the endpoint is determined by the initial conditions $(\psi_0, \tau_0)$
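The effect of the starting saturation can be illustrated numerically. The sketch below integrates the zero-cost equation in $s = \log\tau$ from $\log\tau_0$ to 0 for two nearby initial shares and reports how far apart the endpoints are. It uses the majority-of-three function $3x^2 - 2x^3$ as a hypothetical stand-in with an unstable interior fixed point; it is not the urn function of the experiment, and the chosen shares and saturations are illustrative.

```python
import math

def majority3(x):
    # Hypothetical stand-in urn function with an unstable fixed point at 0.5
    return 3 * x**2 - 2 * x**3

def endpoint(psi0, tau0, pi=majority3, steps=4000):
    """psi(1) on the zero-cost trajectory started at (psi0, tau0), obtained by
    RK4 integration of d(psi)/ds = pi(psi) - psi from s = log(tau0) to s = 0."""
    f = lambda x: pi(x) - x
    h = -math.log(tau0) / steps
    x = psi0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def endpoint_gap(tau0, low=0.45, high=0.55):
    # Distance between endpoints reached from two nearby initial shares
    return endpoint(high, tau0) - endpoint(low, tau0)

for tau0 in (0.5, 0.034, 0.001):
    print(tau0, round(endpoint_gap(tau0), 3))
```

The gap grows as $\tau_0$ shrinks: the same small difference in initial share is amplified over a longer stretch of $\log\tau$, which is the sensitivity described above.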

Case 3: Gelastopoulos et al. Experiment (2024)

Figure 9 displays urn functions reconstructed from actual experimental data, validating the effectiveness of the inverse problem method in Section 2.4.

Experimental Results

Main Theoretical Results

Strong Convergence Theorem:
- Process converges to fixed point set $C = \{\psi: \pi(\psi) = \psi\}$
- Only downcrossing points are stable
Zero-Cost Trajectories:
- Explicit solution: $\psi(\tau) = \Pi^{-1}(\Pi(\psi_0) + \log(\tau/\tau_0))$
- For any $\tau_0 > 0$, the scaling limit is non-degenerate
Large Deviations Principle:
- Rate function: $I(\phi) = \Phi(\phi) - \Phi_0^*(\phi)$
- Satisfies complete sample path LDP

Application Verification

IRT Model (Figure 5):

Theoretical trajectories match simulation data from Dosi et al. 2018
Successfully predicts monopoly phenomenon

Social Influence Experiment (Figure 8):

Quantitative explanation of initial condition effects
Mechanism of degeneracy elimination by late start is clear

Urn Function Reconstruction (Figure 9):

Successfully estimate urn function from experimental data
Validates practical utility of inverse problem method

Theoretical Findings

Critical Role of Saturation:
- $\tau_0 = 0$: Complete degeneracy, initial conditions cannot predict endpoint
- $\tau_0 > 0$: Degeneracy eliminated, trajectory determined
Time Dependence:
- HLS model's Lagrangian explicitly depends on $\tau$
- Key distinction from standard lattice field theory
Unresolved Problems:
- Exact solutions to nonlinear equations (38) and (42)
- Currently rely only on perturbation theory and numerical methods

Related Work

1. Urn Model Family

Linear Urns:

Friedman urn
Bagchi-Pal model
Elephant Random Walk

Nonlinear Urns:

Arthur's IRT model
Attachment models
KKGW model

2. Mathematical Theory

Stochastic Approximation:

Pemantle (2007): Survey of reinforced random processes
Gouet (1993): Martingale functional central limit theorem

Large Deviations Theory:

Dembo & Zeitouni (1998): Foundational theory
Bryc et al. (2009): Large deviations for random trees
Franchini (2017): Large deviations for general urn functions

Analytic Combinatorics:

Flajolet et al. (2005, 2006): Analytic urns
Morcrette & Mahmoud (2012): Exactly solvable models

3. Physical Applications

Lattice Field Theory:

Jack (2019, 2020): Growing cluster models
Klymko et al. (2017, 2018): Trajectory umbrella sampling

Statistical Physics:

Self-avoiding walk problem
Wiener sausage problem
Rosenstock capture model

4. Interdisciplinary Applications

Economics:

Arthur (1989, 1994): Path dependence and lock-in
Dosi et al. (1994, 2018): Technology dynamics
Gottfried & Grosskinsky (2024): Wages and capital returns

Social Sciences:

van de Rijt (2019): Self-correcting dynamics of social influence
Gelastopoulos et al. (2024): Marginal majority effect

Biology:

Khanin & Khanin (2001): Neuronal polarization

Conclusions and Discussion

Main Conclusions

HLS model is a paradigmatic model for path-dependent stochastic processes, unifying important models across multiple fields
Complete theory under thermodynamic limits:
- Explicit solutions for zero-cost trajectories
- Sample path large deviations principles
- Lattice field theory formulation
Inverse Problem Method: Reconstruct urn functions from empirical trajectories, connecting theory with experiments
Challenge of Nonlinear Equations: Moment generating function and entropy density equations still require exact solutions

Limitations

Absence of Analytical Solutions:
- Equations (38) and (42) can be exactly solved only in linear cases
- Nonlinear cases rely on perturbation theory and numerical methods
Theoretical Assumptions:
- Urn function requires Hölder continuity
- Fixed point set $C$ must be finite isolated points
Experimental Verification:
- Primarily relies on others' experimental data
- Lacks systematic experimental design guidance
Computational Complexity:
- Computation of transformation function $\Pi$ may involve singular integrals
- Numerical stability of inverse problem not sufficiently discussed

Future Directions

Analytical Progress:
- Seek exact solutions for special classes of urn functions
- Develop systematic perturbation expansion methods
Numerical Methods:
- Efficient numerical integration algorithms
- Robust estimation methods for inverse problems
Application Extensions:
- Multi-color urn models
- Time-dependent urn functions
- Urn models on networks
Experimental Design:
- Theory-based optimal experimental design
- Active learning for urn functions

In-Depth Evaluation

Strengths

Theoretical Completeness:
- Complete derivation from basic definitions to large deviations principles
- Lattice field theory framework provides unified language
- Existence and uniqueness of explicit solutions
Interdisciplinary Vision:
- Connects probability theory, statistical physics, economics, social sciences
- Demonstrates broad applicability of the model
- Rich practical application cases
Methodological Innovation:
- Inverse problem method is novel and practical
- Introduction of transformed urn function $\Pi$ is elegant
- Interpretation of saturation $\tau$ as "time" is profound
Clear Writing:
- Consistent notation system
- Detailed derivation steps
- Intuitive and effective figures
Theory-Experiment Integration:
- Quantitative explanation of van de Rijt experiment is convincing
- Degeneracy elimination phenomenon in Figure 8 theoretically predicted accurately

Weaknesses

Prominent Unresolved Problems:
- Core nonlinear equations lack analytical solutions
- Limits theoretical completeness and practical utility
Insufficient Numerical Methods:
- Lack of concrete numerical algorithm descriptions
- Error analysis and stability of inverse problem not discussed
- No reproducible code provided
Limited Experimental Verification:
- Primarily relies on literature data
- Lacks original experimental design
- Statistical tests for model fitting insufficient
Technical Details:
- Technical conditions for continuous embedding (Hölder continuity) insufficiently discussed
- Verification conditions for Varadhan's lemma (continuity) glossed over
- Rigorous treatment of boundary cases ($\tau_0 = 0$) missing
Application Guidance:
- Lacks guidance for practitioners on urn function selection
- Statistical methods for model parameter estimation incomplete
- Quantitative assessment of prediction accuracy missing

Impact

Academic Contribution:
- Provides authoritative review of HLS model
- Lattice field theory formulation opens new research directions
- Inverse problem method has methodological value
Practical Value:
- Theoretical foundation for social science experimental design
- Modeling technology adoption and market dynamics
- Applications to neuroscience and biological processes
Reproducibility:
- Theoretical derivations detailed and reproducible
- But lacks code and data
- Numerical implementation requires reader development
Research Inspiration:
- Nonlinear equation solving is clear open problem
- Multi-color extension has clear path
- Network version worth exploring

Applicable Scenarios

Theoretical Research:
- Stochastic process theory
- Large deviations theory
- Lattice field theory applications
Social Sciences:
- Social influence and conformity behavior
- Technology adoption and innovation diffusion
- Market share competition
Economics:
- Increasing returns and path dependence
- Lock-in effects and standards competition
- Network effects
Biological Systems:
- Cell polarization
- Collective decision-making
- Evolutionary dynamics
Physical Applications:
- Growth processes
- Aggregation models
- Self-organization phenomena

Selected References

Foundational Literature:

Hill, Lane, Sudderth (1980): A strong law for some generalized urn processes
Arthur, Ermoliev, Kaniovski (1983): A generalized urn problem and its applications
Franchini (2017): Large deviations for generalized Polya urns with arbitrary urn function

Theoretical Tools:

Dembo & Zeitouni (1998): Large Deviations Techniques and Applications
Pemantle (2007): A survey of random processes with reinforcement

Application Cases:

Arthur (1989, 1994): Increasing Returns and Path Dependence
van de Rijt (2019): Self-correcting dynamics in social influence processes
Gelastopoulos et al. (2024): The marginal majority effect

Overall Assessment: This is a high-quality review paper providing a complete theoretical framework for the HLS urn model from fundamentals to frontiers. The lattice field theory formulation and inverse problem method are important innovations, and interdisciplinary applications demonstrate the model's broad value. The main limitation is the lack of analytical solutions to core nonlinear equations, with numerical methods and experimental verification needing strengthening. For researchers in probability theory, statistical physics, and interdisciplinary studies, this is essential reading.