Large deviations for Generalized Polya Urns with non-binary increments

Franchini

In this paper we show how to extend the Sample-Path Large Deviation Principle for the urn model of Hill, Lane and Sudderth to the case in which the increment of the urn is not a binary variable. In particular, we sketch how to modify the Theorem 1 given in [Stochastic Processes and their Applications 127 (2017) 3372-3411] to include also urn processes with increments taking more than two values.

Basic Information

Paper ID: 2506.22234
Title: Large deviations for Generalized Polya Urns with non-binary increments
Author: Simone Franchini (Sapienza Università di Roma)
Classification: math.PR (Probability Theory)
Publication Date: November 17, 2025 (arXiv v2)
Paper Link: https://arxiv.org/abs/2506.22234

Abstract

This paper extends the sample path large deviations principle for the Pólya urn model of Hill, Lane, and Sudderth (HLS) to cases where increments are not binary variables. Specifically, the paper demonstrates how to modify Theorem 1 from Stochastic Processes and their Applications 127 (2017) 3372-3411 to include urn processes where increments can take more than two values.

Research Background and Motivation

Research Problem

The classical HLS Pólya urn model is a paradigmatic stochastic process with memory, where at each step black or white balls are added to the urn with probabilities depending on the current proportion of black balls (the urn function). This model can only handle binary increments (K=1, meaning balls can only be black or white), but many practical applications require multi-valued increments (K>1).

Problem Importance

Model Universality: The HLS model has been embedded in many important models, including:
- Economics: Arthur's Increasing Returns Theory
- Physics: Range problems of random walks, Wiener sausages, self-avoiding walks
- Biology: Khanin model of neuronal polarity
- Social Science: Bagchi-Pal model, elephant random walks
Application Demands: The innovation diffusion model considered by Dosi et al. [54] requires increments taking at least three values, which the binary framework cannot represent.
Theoretical Completeness: Existing large deviations theory applies only to the binary case, limiting the applicability of the theoretical framework.

Limitations of Existing Approaches

The author's previous work [8,9] established the sample path large deviations principle (SPLDP) for the K=1 (binary) case
This theoretical framework cannot be directly extended to K>1 cases
Core concepts such as urn vectors, embedding functions, and Lagrangians need to be redefined

Research Motivation

To extend large deviations theory to non-binary increments, enabling it to:

Handle broader practical applications (such as Potts-type systems)
Provide theoretical foundations for neural network lattice field theory methods [56,57]
Generate synthetic data for testing approximate theories

Core Contributions

Theoretical Extension: Generalize the sample path large deviations principle of the HLS urn model from binary increments (K=1) to arbitrary finite multi-valued increments (K≥1)
Mathematical Framework Construction:
- Introduce the urn vector concept, replacing the single urn function
- Define Kronecker function embedding via Lagrange interpolation for the multi-valued case
- Derive the general form of the scaled Lagrangian
Explicit Computation: Provide complete closed-form solutions for the K=2 (three-valued increments) case, including:
- Explicit expression of the Mogulskii Lagrangian
- The ξ function obtained by solving a cubic equation
- Complete rate function
Application Value: Provide theoretical foundations for the empirical model of Dosi et al. [54] and controllable benchmarks for neural LFT approximations [56,57]

Detailed Methodology

Task Definition

Objective: Compute the entropy density scaling limit for a given event E: $\phi(E^*) := \lim_{N\to\infty} \frac{1}{N}\log P(\sigma \in E)$

where:

N: Total number of customers (time steps)
σ: Market history (sequence of choices at each step)
E*: Scaling limit of event E

Core Problem: Establish a variational principle to compute this limit.

Model Architecture

1. Basic Mathematical Structure

Market History Space:

Customer sequence index: $S = \{1 \leq n \leq N\}$
Increment support sets: $\hat{\Omega} = \{1, 2, \ldots, K\}$, $\Omega = \{0, 1, \ldots, K\}$
Market history: $\sigma = \{\sigma_n \in \Omega : n \in S\} \in \Omega^S$

Key Quantities:

Total sales: $M_n = \sum_{s\leq n} \sigma_s$
Average sales: $\psi_n = \frac{1}{n}\sum_{s\leq n} \sigma_s$

2. Urn Vector

This is the core innovation of the extension. Define the random kernel: $\pi = \{\pi_k(\alpha) \in [0,1] : k \in \Omega, \alpha \in [0,K]\}$

where:

$\pi_k(\alpha)$ : Probability that the increment is exactly k when the current average is α
Constraint: $\sum_{k\in\Omega} \pi_k(\alpha) = 1$
Number of independent components: K (since $\pi_0$ is determined by the others)

Average Step Size (the true urn function analogue): $\bar{\pi}(\alpha) = \sum_{k\in\hat{\Omega}} k \cdot \pi_k(\alpha)$

This determines the set of convergence points: $C = \{\alpha \in [0,K] : \bar{\pi}(\alpha) = \alpha\}$
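
As a rough illustration of the convergence set, the sketch below locates a fixed point of the average step size by bisection. The urn vector used here (constant $\pi_1$, linearly increasing $\pi_2$, K=2) is a hypothetical example chosen only so that the answer can be checked by hand; it is not taken from the paper.

```python
from typing import List

K = 2  # increments take values in {0, 1, 2}


def urn_vector(alpha: float) -> List[float]:
    """Hypothetical urn vector (pi_0, pi_1, pi_2); an assumption for illustration."""
    pi1 = 0.4
    pi2 = 0.1 + 0.1 * alpha
    return [1.0 - pi1 - pi2, pi1, pi2]


def mean_step(alpha: float) -> float:
    """Average step size: sum over k of k * pi_k(alpha)."""
    return sum(k * p for k, p in enumerate(urn_vector(alpha)))


def fixed_point(lo: float, hi: float, tol: float = 1e-12) -> float:
    """Bisection on g(alpha) = mean_step(alpha) - alpha over a sign-changing bracket."""
    def g(a: float) -> float:
        return mean_step(a) - a

    assert g(lo) * g(hi) < 0, "bracket must contain a sign change"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# For this urn vector mean_step(alpha) = 0.6 + 0.2 * alpha, so the fixed point is 0.75.
alpha_star = fixed_point(0.0, float(K))
```

Bisection only finds one crossing per bracket; an urn vector with several fixed points would need one bracket per sign change.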

3. Path Integral Formulation

Transition Probability: $P(\sigma_{n+1} = k | \psi_n) = \pi_k(\psi_n)$

Step Weight (using Kronecker functions): $U(\sigma_n, \psi_n) = \prod_{k\in\Omega} \pi_k(\psi_n)^{\delta_k(\sigma_n)} = \exp\sum_{k\in\Omega} \delta_k(\sigma_n)\log\pi_k(\psi_n)$

Path Weight: $W(\sigma) = \prod_{n\in S} U(\sigma_n, \psi_n)$

Action: $A(\sigma) = \sum_{n\in S} L(\sigma_n, \psi_n) = \sum_{n\in S}\sum_{k\in\Omega} \delta_k(\sigma_n)\log\pi_k(\psi_n)$
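
The path weight and action above can be sketched in a few lines by simulating a history and accumulating the log step weights. The urn vector and the initial average `psi0` are assumptions made for this illustration, and the indexing follows the transition probability (the increment at step n+1 is drawn using the average after step n); the paper's own conventions may differ.

```python
import math
import random
from typing import List, Tuple

K = 2  # increments take values in {0, 1, 2}


def urn_vector(alpha: float) -> List[float]:
    """Hypothetical urn vector (pi_0, pi_1, pi_2); an assumption for illustration."""
    pi1 = 0.4
    pi2 = 0.1 + 0.1 * alpha
    return [1.0 - pi1 - pi2, pi1, pi2]


def sample_history(n_steps: int, psi0: float, rng: random.Random) -> Tuple[List[int], float]:
    """Draw a history sigma and accumulate its action A(sigma) = log W(sigma)."""
    sigma: List[int] = []
    total, psi, action = 0, psi0, 0.0
    for n in range(1, n_steps + 1):
        probs = urn_vector(psi)
        k = rng.choices(range(K + 1), weights=probs)[0]
        action += math.log(probs[k])  # log of the step weight
        sigma.append(k)
        total += k
        psi = total / n  # updated average after n steps
    return sigma, action


def path_weight(sigma: List[int], psi0: float) -> float:
    """Recompute W(sigma) directly as a product of step weights."""
    total, psi, weight = 0, psi0, 1.0
    for n, k in enumerate(sigma, start=1):
        weight *= urn_vector(psi)[k]
        total += k
        psi = total / n
    return weight


rng = random.Random(0)
sigma, action = sample_history(50, psi0=1.0, rng=rng)
weight = path_weight(sigma, psi0=1.0)
```

Working with the action rather than the weight avoids underflow for long histories, since the weight decays exponentially in the number of steps.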

Technical Innovations

1. Continuous Embedding of Kronecker Functions

Key technical challenge: How to embed discrete Kronecker δ functions into continuous space.

Solution: Use Lagrange interpolation $\delta_k(\alpha) := \prod_{z\in\Omega\setminus\{k\}} \frac{z-\alpha}{z-k}$

Properties:

Preserves the Kronecker property for integer α
Analytic on the real domain α∈ℝ
Applicable for any finite K

Example (K=2): $\delta_0(\alpha) = (1-\alpha)(1-\alpha/2)$, $\delta_1(\alpha) = \alpha(2-\alpha)$, $\delta_2(\alpha) = \frac{\alpha}{2}(\alpha-1)$
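
A minimal sketch of the interpolation formula, written as a plain product over the nodes $\{0,\ldots,K\}$; the function name `delta` is chosen here for illustration. It can be used to check the Kronecker property at integer arguments and the K=2 expressions above.

```python
def delta(k: int, alpha: float, K: int) -> float:
    """Lagrange basis polynomial on the nodes {0, ..., K}.

    Equals 1 at alpha = k and 0 at every other integer node.
    """
    value = 1.0
    for z in range(K + 1):
        if z != k:
            value *= (z - alpha) / (z - k)
    return value


# Kronecker property on the nodes, here for K = 3.
table = [[delta(k, float(n), 3) for n in range(4)] for k in range(4)]

# K = 2 closed forms evaluated at a non-integer point.
a = 0.7
closed_forms = [(1 - a) * (1 - a / 2), a * (2 - a), (a / 2) * (a - 1)]
interpolated = [delta(k, a, 2) for k in range(3)]
```

Two standard properties of Lagrange bases carry over: for K≥1 the embedded functions sum to 1 and satisfy $\sum_k k\,\delta_k(\alpha) = \alpha$ for every real α. Individual values can be negative between nodes (for example $\delta_2(0.5) = -0.125$ for K=2), so they act as interpolation weights rather than probabilities.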

2. Scaling Limit Theory

Continuous Embedding: Embed trajectories into the space of K-Lipschitz functions $Q = \{\phi \in C^1([0,1]) : \partial_\tau\phi(\tau) \in [0,K], \phi(0)=0\}$

Scaling Transformation: $\tau = \lim_{N\to\infty} n/N \in [0,1]$, $\phi(\tau) = \lim_{N\to\infty} M_n/N$, $\psi(\tau) = \phi(\tau)/\tau$

3. Scaled Lagrangian

General Form: $\mathcal{L}(\alpha, \beta) = \sum_{k\in\Omega} \delta_k(\alpha)\log\pi_k(\beta)$

Scaled Action: $\Phi(\phi) = \int_0^1 d\tau\, \mathcal{L}(\partial_\tau\phi(\tau), \psi(\tau))$
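
To make the scaled action concrete, the sketch below approximates the integral with a midpoint rule, which avoids evaluating $\psi(\tau)=\phi(\tau)/\tau$ at $\tau=0$. The urn vector, the grid size, and the test trajectory are assumptions for illustration. For a straight line $\phi(\tau)=a\tau$ both arguments of the Lagrangian are constant, so the action should reduce to $\mathcal{L}(a,a)$.

```python
import math
from typing import Callable, List

K = 2  # increments take values in {0, 1, 2}


def delta(k: int, alpha: float) -> float:
    """Lagrange-embedded Kronecker function on the nodes {0, ..., K}."""
    value = 1.0
    for z in range(K + 1):
        if z != k:
            value *= (z - alpha) / (z - k)
    return value


def urn_vector(beta: float) -> List[float]:
    """Hypothetical urn vector (pi_0, pi_1, pi_2); an assumption for illustration."""
    pi1 = 0.4
    pi2 = 0.1 + 0.1 * beta
    return [1.0 - pi1 - pi2, pi1, pi2]


def lagrangian(alpha: float, beta: float) -> float:
    """Scaled Lagrangian: sum over k of delta_k(alpha) * log pi_k(beta)."""
    return sum(delta(k, alpha) * math.log(p) for k, p in enumerate(urn_vector(beta)))


def scaled_action(phi: Callable[[float], float],
                  dphi: Callable[[float], float],
                  n_grid: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of L(dphi(tau), phi(tau)/tau)."""
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        tau = (i + 0.5) * h  # midpoints never hit tau = 0
        total += lagrangian(dphi(tau), phi(tau) / tau) * h
    return total


a = 0.75
value = scaled_action(lambda t: a * t, lambda t: a)
```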

4. Application of Mogulskii Theorem

For i.i.d. processes (uniform distribution $P_0(\sigma_n=k)=1/(K+1)$ ), compute the rate function:

Cumulant Generating Function (logarithm of the moment generating function): $\zeta_0(\beta) = \log\frac{1-\exp((K+1)\beta)}{(K+1)(1-\exp(\beta))}$

Legendre Transform (via saddle point equation): $\alpha = \frac{\xi}{1-\xi} - (K+1)\frac{\xi^{K+1}}{1-\xi^{K+1}}$

where $\xi = \exp(\beta^*)$ .

Mogulskii Lagrangian: $\mathcal{L}_0(\alpha) = \alpha\log\xi(\alpha,K) - \log(1-\xi(\alpha,K)^{K+1}) + \log(1-\xi(\alpha,K))$

(normalized version after removing the constant log(K+1))
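
For general K the saddle point equation has no simple closed form, but it can be solved numerically. The sketch below is one possible approach, not the paper's method: it uses the identity $(1-\xi^{K+1})/(1-\xi) = \sum_{k=0}^{K}\xi^k$, so that the right-hand side of the saddle point equation becomes the mean of the exponentially tilted uniform law, which is increasing in β and can be inverted by bisection. The function names and the bracket $[-50, 50]$ for β are assumptions.

```python
import math


def tilted_mean(beta: float, K: int) -> float:
    """Mean of the uniform law on {0, ..., K} tilted by exp(beta * k)."""
    shift = max(0.0, beta * K)  # keep all exponents non-positive
    weights = [math.exp(beta * k - shift) for k in range(K + 1)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def solve_beta(alpha: float, K: int, tol: float = 1e-13) -> float:
    """Bisection for beta* with tilted_mean(beta*) = alpha, for 0 < alpha < K."""
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid, K) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def xi(alpha: float, K: int) -> float:
    """xi(alpha, K) = exp(beta*)."""
    return math.exp(solve_beta(alpha, K))


def mogulskii_lagrangian(alpha: float, K: int) -> float:
    """alpha * log(xi) - log(1 + xi + ... + xi^K), the normalized form above."""
    beta = solve_beta(alpha, K)
    log_sum = math.log(sum(math.exp(beta * k) for k in range(K + 1)))
    return alpha * beta - log_sum


# Spot checks: K = 1 gives the binary entropy form, and alpha = K/2 gives xi = 1.
l0_binary = mogulskii_lagrangian(0.3, 1)
xi_binary = xi(0.3, 1)
l0_center = mogulskii_lagrangian(1.0, 2)
```

Writing the logarithm as a finite sum rather than as a ratio avoids the 0/0 form at $\xi=1$.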

Main Theorem (Variational Principle)

Sample Path Large Deviations Principle: $\phi(E^*) = \inf_{\phi\in Q(E^*)} \{\Phi(\phi) - \Phi_0(\phi)\}$

where:

$\Phi(\phi)$ : Scaled action of the process
$\Phi_0(\phi)$ : Scaled action of the corresponding i.i.d. trajectory
$Q(E^*)$ : Set of trajectories corresponding to event $E^*$

Proof Strategy:

Measure transformation (convert original measure to i.i.d. measure)
Verify convergence of scaled action
Apply Varadhan's lemma
Use Mogulskii theorem for the i.i.d. part

Experimental Setup

This is a purely theoretical mathematics paper without traditional numerical experiments. However, it provides two detailed analytical verification cases:

Case 1: K=1 (Consistency Verification)

Purpose: Verify that the new framework is consistent with existing theory [8] in the binary case.

Setup:

Increment values: k∈{0,1}
Urn function: $\pi_1(\alpha)$ , $\pi_0(\alpha)=1-\pi_1(\alpha)$
i.i.d. distribution: $P_0(\sigma_n=k)=1/2$

Verification Content:

Embedded delta function: $\delta_1(\alpha)=\alpha$
Scaled Lagrangian: $\mathcal{L}(\alpha,\beta) = \alpha\log\pi_1(\beta) + (1-\alpha)\log(1-\pi_1(\beta))$
Mogulskii Lagrangian: $\mathcal{L}_0(\alpha) = \alpha\log\alpha + (1-\alpha)\log(1-\alpha)$ (ignoring constants)

Result: Completely recovers the form of Theorem 1 from [8].
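
The binary entropy form can be checked directly from the general saddle point equation and Mogulskii Lagrangian by setting K=1. The short derivation below is a worked check using only those two formulas.

```latex
\begin{align}
\alpha &= \frac{\xi}{1-\xi} - \frac{2\xi^{2}}{1-\xi^{2}}
        = \frac{\xi(1+\xi) - 2\xi^{2}}{1-\xi^{2}}
        = \frac{\xi}{1+\xi}
\quad\Longrightarrow\quad
\xi(\alpha,1) = \frac{\alpha}{1-\alpha}, \\
\mathcal{L}_0(\alpha) &= \alpha\log\xi - \log\frac{1-\xi^{2}}{1-\xi}
        = \alpha\log\frac{\alpha}{1-\alpha} - \log\frac{1}{1-\alpha}
        = \alpha\log\alpha + (1-\alpha)\log(1-\alpha).
\end{align}
```

The second line uses $1+\xi = 1/(1-\alpha)$, which follows from the first.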

Case 2: K=2 (Main New Result)

Purpose: Demonstrate the first complete analytical solution beyond the binary case.

Setup:

Increment values: k∈{0,1,2}
Urn vector: $\pi(\alpha) = \{\pi_1(\alpha), \pi_2(\alpha)\}$ (two independent components)
i.i.d. distribution: $P_0(\sigma_n=k)=1/3$

Technical Details:

Delta Functions (Equations 89-90): $\delta_0(\alpha) = (1-\alpha)(1-\alpha/2)$, $\delta_1(\alpha) = \alpha(2-\alpha)$, $\delta_2(\alpha) = \frac{\alpha}{2}(\alpha-1)$
Step Weight (Equation 93): $U(\sigma_n,\psi_n) = \pi_1(\psi_n)^{\sigma_n(2-\sigma_n)} \pi_2(\psi_n)^{\frac{\sigma_n}{2}(\sigma_n-1)} (1-\pi_1-\pi_2)^{(1-\sigma_n)(1-\frac{\sigma_n}{2})}$
Scaled Lagrangian (Equation 94): $\mathcal{L}(\alpha,\beta) = \alpha(2-\alpha)\log\pi_1(\beta) + \frac{\alpha}{2}(\alpha-1)\log\pi_2(\beta) + (1-\alpha)(1-\alpha/2)\log(1-\pi_1-\pi_2)$
Solving the Cubic Equation (Equations 97-98): $\alpha = \frac{\xi}{1-\xi} - 3\frac{\xi^3}{1-\xi^3}$
Rewritten as: $(\xi-1)[(\alpha-2)\xi^2 + (\alpha-1)\xi + \alpha] = 0$
The unique physical solution ( $\xi(0,2)=0$ , $\xi(1,2)=1$ ): $\xi(\alpha,2) = \frac{(1-\alpha)-\sqrt{1+6\alpha-3\alpha^2}}{2(\alpha-2)}$
Mogulskii Lagrangian Closed-Form Solution (Equation 100): $\mathcal{L}_0(\alpha) = \alpha\log\left(\frac{(\alpha-1)+\sqrt{1+6\alpha-3\alpha^2}}{2(2-\alpha)}\right) - \log\left(\frac{(7-3\alpha)+\sqrt{1+6\alpha-3\alpha^2}}{2(2-\alpha)^2}\right)$
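
The K=2 closed forms can be sanity-checked numerically. The sketch below evaluates the radical expression for ξ, substitutes it back into the saddle point equation (simplified to $\alpha = \xi(1+2\xi)/(1+\xi+\xi^2)$ after cancelling the common factor $1-\xi$), and compares the closed-form Lagrangian with $\alpha\log\xi - \log(1+\xi+\xi^2)$. Function names are chosen for illustration.

```python
import math


def xi_closed(alpha: float) -> float:
    """Radical solution xi(alpha, 2) of (alpha-2) xi^2 + (alpha-1) xi + alpha = 0."""
    root = math.sqrt(1 + 6 * alpha - 3 * alpha ** 2)
    return ((1 - alpha) - root) / (2 * (alpha - 2))


def l0_closed(alpha: float) -> float:
    """Closed-form Mogulskii Lagrangian for K = 2, valid for 0 < alpha < 2."""
    root = math.sqrt(1 + 6 * alpha - 3 * alpha ** 2)
    first = ((alpha - 1) + root) / (2 * (2 - alpha))
    second = ((7 - 3 * alpha) + root) / (2 * (2 - alpha) ** 2)
    return alpha * math.log(first) - math.log(second)


def saddle_rhs(xi: float) -> float:
    """Right-hand side of the saddle point equation for K = 2, in reduced form."""
    return xi * (1 + 2 * xi) / (1 + xi + xi ** 2)


def l0_from_xi(alpha: float) -> float:
    """alpha * log(xi) - log(1 + xi + xi^2) evaluated at xi = xi_closed(alpha)."""
    x = xi_closed(alpha)
    return alpha * math.log(x) - math.log(1 + x + x ** 2)


samples = [0.2, 0.9, 1.0, 1.6]
residuals = [abs(saddle_rhs(xi_closed(a)) - a) for a in samples]
gaps = [abs(l0_closed(a) - l0_from_xi(a)) for a in samples]
```

The endpoints α=0 and α=2 are excluded because the logarithms diverge there.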

Experimental Results

Analytical Verification Results

K=1 Case

Consistency Check: ✓ Completely recovers results from literature [8]
Delta Function: Linear form $\delta_1(\alpha)=\alpha$
Mogulskii Lagrangian: Classical binary entropy form
ξ Function: Exact solution of quadratic equation $\xi(\alpha,1)=\alpha/(1-\alpha)$

K=2 Case (Core New Result)

Delta Functions: Quadratic polynomials (Equations 89-90)
Cubic Equation Solution: Obtains explicit radical solution (Equation 99)
Mogulskii Lagrangian: Complete closed-form expression (Equation 100)
Complexity: Involves radicals but remains an elementary function

Theoretical Property Verification

Boundary Conditions:
- $\xi(0,K)=0$ ✓
- $\xi(K/2,K)=1$ ✓ (verified for K=1,2)
Monotonicity: ξ function is monotonically increasing in α on $[0,K)$
Analyticity: All functions are analytic in the interior of their domains (and hence locally Hölder continuous)
Degenerate Consistency: K=2 results degenerate to K=1 under appropriate limits

Key Findings

Solvability: K=2 case is completely solvable without numerical methods
Algebraic Complexity:
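- K=1: Quadratic equation (linear after removing the trivial root ξ=1)
- K=2: Cubic equation (quadratic after removing the trivial root ξ=1, so the quadratic formula suffices)
- K≥3: Equations of degree K+1 (degree K after removing ξ=1); radical solutions become unwieldy for K=3,4 and are not available in general for K≥5, so numerical methods are generally required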
Physical Meaning: Produces non-trivial pure dynamical Lagrangian suitable for lattice field theory framework
Application Potential: Can be directly applied to empirical models of Dosi et al. [54] (with appropriate shifts)

Related Work

Urn Model Theory

Classical Works:
- Hill, Lane, Sudderth [1,2]: Foundational theory of HLS urn model
- Arthur, Ermoliev, Kaniovski [3]: Generalized urn problems and applications
- Pemantle [4,18]: Convergence conditions and survey of reinforced processes
Large Deviations Theory:
- Dembo & Zeitouni [7]: Standard reference for large deviations techniques
- Franchini [8,9,15]: Sample path large deviations for HLS urns (K=1)
- Bryc, Minda, Sethuraman [13]: Large deviations for random tree leaves
Analytical Methods:
- Flajolet et al. [10,11,12]: Analytic urns and combinatorial methods
- Morcrette & Mahmoud [14]: Solvable urns via analytic methodology

Application Domains

Economics:
- Arthur [29,32,36]: Increasing returns theory and path dependence
- Dosi et al. [37,54]: Technology dynamics and innovation diffusion
- Gottfried & Grosskinsky [30,40,41]: Nonlinear feedback and wage-capital models
Physics:
- Jack et al. [27,44,45,46,47]: Large deviations and ergodicity of growth processes
- Franchini & Balzan [49,52]: Random polymers and self-avoiding walks
- Nakayama & Mori [6]: Non-equilibrium phase transitions
Biology:
- Khanin & Khanin [48]: Modeling neuronal polarity establishment
Random Walks:
- Schütz & Trimper [21]: Elephant random walks
- Baur & Bertoin [22]: Connection between ERW and Pólya urns
- Gut & Stadtmüller [23]: Variants of ERW

Advantages of This Work

Theoretical Completeness: First extension of SPLDP to K>1, filling a theoretical gap
Explicit Computability: Provides complete closed-form solution for K=2, unlike pure existence results
Methodological Innovation: Lagrange embedding technique for Kronecker functions has universal applicability
Application-Oriented: Directly addresses empirical needs [54], not merely mathematical generalization
Lattice Field Theory Connection: Provides theoretical benchmark for neural LFT methods [56,57]

Conclusions and Discussion

Main Conclusions

Successful Theory Extension: The sample path large deviations principle for the HLS urn model can be generalized to non-binary increments with arbitrary finite K values
Variational Principle Established: The entropy density scaling limit is given by the variational problem: $\phi(E^*) = \inf_{\phi\in Q(E^*)} \{\Phi(\phi) - \Phi_0(\phi)\}$
Explicit Solutions Exist: The K=2 case yields complete closed-form solutions, including:
- Radical solution of cubic equation
- Elementary function expression of Mogulskii Lagrangian
- Complete rate function
Methodological Contributions:
- Urn vector concept replaces single urn function
- Lagrange interpolation embeds Kronecker functions
- Minimal modification of standard large deviations techniques

Limitations

Proof Completeness:
- Paper adopts "sketch" style without complete rigorous proofs
- Convergence verification (Equations 61-62) not fully detailed
- Sufficiency of continuity conditions not completely argued
Solvability Restrictions:
- For K≥3, ξ is a root of a polynomial equation of degree K+1 (degree K after removing the trivial root ξ=1), which has no general solution in radicals once K≥5
- General cases may require numerical methods for ξ function
- Computational complexity grows rapidly with K
Practical Applications:
- No numerical examples of specific models provided
- Lacks comparison with empirical data
- Numerical solution methods for variational problems not discussed
Theoretical Depth:
- Properties of rate function (convexity, uniqueness) not discussed
- Characteristics of optimal trajectories not deeply analyzed
- Relationship with other large deviations principles (e.g., Freidlin-Wentzell) not clarified
Generalization Directions:
- Only handles finite K, infinite K cases not addressed
- Time-dependent urn functions not considered
- Multidimensional urn process generalizations not explored

Future Directions

Theory Refinement:
- Provide complete rigorous proofs
- Analyze mathematical properties of rate functions
- Study limiting behavior as K→∞
Computational Methods:
- Develop efficient numerical solvers for variational problems
- Study numerical algorithms for ξ function when K≥3
- Implement practical trajectory optimization tools
Application Extensions:
- Apply theory to empirical data from Dosi et al. [54]
- Provide benchmarks for neural LFT [56,57]
- Explore specific models in other disciplines
Model Generalizations:
- Extend to continuous increments (K→∞)
- Consider time-dependent urn vectors
- Study multidimensional and coupled urn systems

In-Depth Evaluation

Strengths

1. Theoretical Innovation ★★★★★

Important Theoretical Breakthrough: First generalization of mature K=1 theory to K>1, not a trivial extension
Clever Technique: Lagrange interpolation embedding of Kronecker functions is elegant and simple
Complete Framework: Logical chain from definitions to theorems is complete
Natural Concept: Introduced urn vector concept is natural and necessary

2. Mathematical Rigor ★★★★☆

Clear Symbol System: Distinguishes $\Omega$ and $\hat{\Omega}$ , $\sigma$ and $\phi$ carefully
Explicit Limit Process: Scaling limit definitions are clear (Equations 46-48)
Sufficient Verification: K=1 case verification demonstrates backward compatibility
Shortcoming: Some proofs use "sketch" style, rigor could be improved

3. Computational Feasibility ★★★★☆

K=2 Completely Solvable: Provides explicit closed-form solution (Equations 99-100)
Reasonable Algebraic Complexity: Involves radicals but remains elementary functions
Extensibility: Methodology can extend to higher K (though complexity increases)
Limitation: K≥3 may require numerical methods

4. Application Value ★★★★★

Demand-Driven: Directly addresses application needs of Dosi et al. [54]
Interdisciplinary Impact: Connects probability theory, statistical physics, economics, neuroscience
Lattice Field Theory Bridge: Provides theoretical foundation for neural LFT [56,57]
Synthetic Data Generation: Can be used to test approximate theories

5. Writing Quality ★★★★☆

Clear Structure: Progresses logically from basic concepts to main results
Consistent Notation: Symbols used uniformly throughout
Physical Intuition: Market history and customer analogies aid understanding
Improvable: Some mathematical derivations could be more detailed

Weaknesses

1. Proof Completeness

Main Issue: Core theorem (Equation 32) proof uses "sketch" style
Missing Links:
- Rigorous proof of convergence (Equation 61)
- Verification of continuity conditions (Equation 62)
- Complete checking of Varadhan lemma application conditions
Impact: Reduces mathematical rigor of the paper

2. Experimental Verification

Pure Theory: No numerical experiments or empirical data verification
Missing Cases: No demonstration of trajectory computation under specific urn functions
Insufficient Visualization: No plots showing rate functions or optimal trajectories
Recommendation: Should include at least one numerical example

3. Result Depth

Insufficient Property Analysis:
- Convexity of rate function not discussed
- Uniqueness of optimal trajectories not analyzed
- Phase transition behavior not explored
Missing Comparisons: No comparison with other large deviations theories (e.g., Freidlin-Wentzell)
Limited Application Guidance: How to use results in practice not sufficiently clear

4. Technical Limitations

High K Complexity: Method complexity grows rapidly for K≥3
Missing Numerical Methods: Practical solution of variational problems not discussed
Limited Generalization: Method difficult to extend to infinite K or continuous cases

5. Literature Review

Scattered Related Work: Many citations but lack systematic organization
Insufficient Comparison: Inadequate comparison with other HLS model generalizations
Unclear Historical Context: Development of large deviations theory in urn models not sufficiently clear

Impact Assessment

Contribution to Field ★★★★★

Fills Theoretical Gap: Large deviations theory for non-binary urn models previously missing
Methodological Value: Lagrange embedding technique may inspire solutions to other discrete-continuous problems
Unified Framework: Incorporates seemingly different models into unified theory
Expected Citations: Likely to become foundational literature in this direction

Practical Value ★★★★☆

Direct Application: Model of Dosi et al. [54] can immediately use results
Tool Potential: Provides new tools for analyzing complex systems
Neural LFT Benchmark: Can serve as benchmark for machine learning methods
Limitation: Requires further tool development for widespread application

Reproducibility ★★★★★

Clear Symbols: All definitions are clear and unambiguous
Complete Formulas: Key formulas (94, 99, 100) can be directly implemented
Verification Cases: K=1 case provides testing benchmark
Missing Code: No implementation code provided (but can be implemented from formulas)

Applicable Scenarios

Theoretical Research

Probability Theory:
- Study large deviations of reinforced processes
- Analyze path-dependent random processes
- Explore limit theory of non-Markov processes
Statistical Physics:
- Large deviations of Potts models
- Mathematical foundations of lattice field theory
- Phase transitions and critical phenomena

Application Domains

Economics (★★★★★):
- Technology adoption and market share evolution
- Increasing returns and lock-in effects
- Innovation diffusion dynamics (e.g., [54])
Social Science (★★★★☆):
- Social influence processes
- Opinion dynamics
- Network effects and critical mass
Biology (★★★☆☆):
- Cell differentiation pathways
- Population dynamics
- Neural network development
Machine Learning (★★★★☆):
- Neural network training dynamics
- Theoretical foundations of reinforcement learning
- Benchmark testing for lattice field theory methods

Technical Requirements

Applicable To: Systems with finite discrete increment values
Requires: Known or estimable urn functions (transition probabilities)
Limitation: Requires large samples (N→∞) for asymptotic theory application

Overall Assessment

Dimension	Score	Remarks
Innovation	9/10	Important theoretical breakthrough, clever methodology
Rigor	7/10	Complete framework but proofs lack detail
Practicality	8/10	High application value but requires tool development
Completeness	7/10	Core results complete but lacking deep analysis
Writing Quality	8/10	Clear but could be more detailed
Overall	8/10	Excellent theoretical work with significant impact

References

Core Citations

[1,2] Hill, Lane, Sudderth (1980): Foundational work on HLS urn model
[3] Arthur, Ermoliev, Kaniovski (1983): Generalized urn problems and applications
[7] Dembo & Zeitouni (1998): Standard textbook on large deviations techniques
[8] Franchini (2017): SPLDP for K=1 case (foundation of this paper's generalization)
[9] Franchini & Balzan (2023): Large deviations for increasing returns theory
[18] Pemantle (2007): Survey of reinforced processes
[54] Dosi, Moneta, Stepanova (2018): Empirical application motivation
[56,57] Bardella, Franchini et al. (2024): Neural LFT methods

Other Important References

[29] Arthur (2021): Foundations of complexity economics
[30] Gottfried & Grosskinsky (2024): Asymptotic properties of nonlinear feedback
[44-47] Jack, Klymko et al.: Large deviations and ergodicity of growth processes
[49] Franchini & Balzan (2018): Random polymers and generalized urn processes

Summary: This is an excellent theoretical mathematics paper that successfully generalizes important large deviations theory from binary to multi-valued cases, with solid mathematical foundations and broad application prospects. Its main value lies in theoretical completeness and methodological innovation. While proof details and experimental verification could be strengthened, the explicit solution for K=2 already demonstrates method feasibility. For researchers working on urn models, reinforced processes, increasing returns theory, or lattice field theory, this is essential reading.