2025-11-22T15:25:16.453421

Complexity and accessibility of random landscapes

Pahujani, Krug

These notes introduce probabilistic landscape models defined on high-dimensional discrete sequence spaces. The models are motivated primarily by fitness landscapes in evolutionary biology, but links to statistical physics and computer science are mentioned where appropriate. Elementary and advanced results on the structure of landscapes are described with a focus on features that are relevant to evolutionary searches, such as the number of local maxima and the existence of fitness-monotonic paths. The recent discovery of submodularity as a biologically meaningful property of fitness landscapes and its consequences for their accessibility is discussed in detail.

academic

Complexity and Accessibility of Random Landscapes

Basic Information

Paper ID: 2502.05896
Title: Complexity and Accessibility of Random Landscapes
Authors: Sakshi Pahujani, Joachim Krug (University of Cologne)
Classification: q-bio.PE (Population and Evolution), cond-mat.dis-nn (Disordered Systems), math.PR (Probability)
Publication Date: 2025 (SciPost Physics Lecture Notes Submission)
Paper Link: https://arxiv.org/abs/2502.05896

Abstract

This paper introduces probabilistic landscape models defined on high-dimensional discrete sequence spaces. These models are primarily inspired by adaptive landscapes in evolutionary biology, while also engaging with relevant topics in statistical physics and computer science. The article describes both foundational and advanced results on landscape structure, with emphasis on characteristics relevant to evolutionary search, such as the number of local maxima and the existence of adaptive monotonic paths. Special attention is given to recent findings on submodularity as a biologically meaningful property of adaptive landscapes and its implications for landscape accessibility.

Research Background and Motivation

Core Questions

High-dimensional landscape navigation problem: Multiple fields including biological evolution, spin glass systems, and neural network optimization involve navigation on high-dimensional complex landscapes
Structural characteristics of adaptive landscapes: Understanding the distribution and accessibility of local maxima (peaks) in adaptive landscapes
Wright vs. Fisher debate: Resolving the classical controversy in evolutionary biology regarding whether adaptive landscapes are rugged and difficult to navigate (Wright's view) or relatively accessible (Fisher's view)

Research Significance

Interdisciplinary application: This research bridges evolutionary biology, statistical physics, and computer science
Practical relevance: Helps understand the predictability and repeatability of evolutionary processes
Theoretical value: Provides mathematical frameworks and analytical tools for high-dimensional random landscapes

Limitations of Existing Approaches

Fully random models (such as the House of Cards model) are oversimplified and cannot reflect correlations in real biological systems
Lack of systematic understanding of accessibility in structured landscapes
Insufficient recognition of the biological significance of important mathematical properties such as submodularity

Core Contributions

Unified mathematical framework: Establishes a complete theoretical system for analyzing probabilistic landscapes on high-dimensional discrete sequence spaces
Accessibility phase transition theory: Reveals phase transition phenomena in the existence of accessible paths in random landscapes and determines critical thresholds
Connection between submodularity and accessibility: First systematically elucidates the subset-superset accessibility property of submodular adaptive landscapes
Adaptive basin of attraction theory: Provides exponential lower bounds on the size of adaptive basins of attraction in submodular landscapes
Interdisciplinary connections: Establishes mappings between Fisher's geometric model and the antiferromagnetic Hopfield model

Methodology Details

Task Definition

Study adaptive landscapes defined on high-dimensional discrete sequence spaces $\{0,1,...,a-1\}^L$ , analyzing their structural characteristics (such as peak numbers) and dynamical properties (such as the existence of accessible paths).

Core Models

1. House of Cards (HoC) Model

Definition: Fitness values are independent and identically distributed continuous random variables
Peak probability: $P_{\max} = \frac{1}{(a-1)L+1}$
Expected number of peaks: $E(N_L) = \frac{a^L}{(a-1)L+1}$
Complexity: $\Lambda = \lim_{L→∞} \frac{1}{L}\log E(N_L) = \ln a$

2. Accessibility Analysis

Direct path accessibility:

Probability: $P_{β,l} = \frac{β^{l-1}}{(l-1)!}$
Expected number of paths: $E(X_{α,ω}) = lβ^{l-1}$
Critical threshold: $β_c(l) = 1 - \frac{\ln l}{l}$

Indirect path accessibility:

Extended adaptive landscape method for handling self-intersecting paths
Expected quasi-accessible path count: $E[\tilde{X}_{α,ω}] ∼ \prod_{k,l=0}^{a-1}[(e^βA)_{k,l}]^{p_{k,l}L}$
Binary case condition: $\sinh(β_c)^δ \cosh(β_c)^{1-δ} = 1$

3. Structured Landscapes

NK model: $g(σ) = \sum_{i=1}^b g_i(σ_{i,1}, σ_{i,2}, ..., σ_{i,k})$

Rough Mount Fuji model: $g(σ) = -cd(σ,σ^*) + ξ_σ$

Composite genotype-phenotype-fitness mapping: $g(σ) = Φ[z(σ)], \quad z(σ) = \sum_{i=1}^L \sum_{μ=0}^{a-1} a_{i,μ}δ_{σ_i,μ}$

Technical Innovations

1. Submodularity Theory

Universal epistasis condition: $g(σ ∪ τ) - g(σ) ≤ g(σ' ∪ τ) - g(σ')$ , where $σ' ⊆ σ$
Equivalent to submodularity: $g(A ∪ B) + g(A ∩ B) ≤ g(A) + g(B)$
Biological construction: Concave phenotype-fitness mapping produces submodular landscapes

2. Subset-Superset Accessibility Property

Theorem: Any peak can be reached from all its subsets and supersets via direct paths
Proof strategy: Utilizes universal negative epistasis condition and local optimality of peaks

3. Adaptive Basin of Attraction

Lower bound formula: $S_σ ≥ 2^{|σ|} + 2^{L-|σ|} - 2$
Exponential growth: Basin of attraction size grows exponentially with genotype space

Experimental Setup

Theoretical Analysis Framework

The paper primarily employs theoretical analysis methods, including:

Probabilistic analysis (Markov inequality, central limit theorem)
Combinatorial optimization theory (submodular function theory)
Percolation theory (accessibility phase transitions)
Graph-theoretic methods (Hamming graphs, fitness graphs)

Mathematical Tools

Hamming distance: $d(σ,τ) = \sum_{i=1}^L (1-δ_{σ_i,τ_i})$
Fitness graph: Directed acyclic graph constructed by directing edges toward increasing fitness
Complexity definition: $\Lambda = \lim_{L→∞} \frac{1}{L}\log E(N_L)$

Experimental Results

Main Theoretical Results

1. Exact Solutions for HoC Model

Peak statistics: Proves that peak numbers satisfy the central limit theorem with sub-Poisson statistics
Variance formula: $\text{Var}(N_L) = \frac{a^L(a-1)(L-1)}{2\{(a-1)L+1\}^2}$
Resolution of Wright-Fisher debate: In the high-dimensional limit, the probability that any single genotype is a peak approaches zero (supporting Fisher), but the total number of peaks approaches infinity (supporting Wright)

2. Accessibility Phase Transition Phenomena

Critical behavior: Existence of well-defined phase transition threshold $β_c(l) = 1 - \frac{\ln l}{l}$
Phase transition characteristics:
- $β < β_c(l)$ : $\lim_{l→∞} P[X_{α,ω} ≥ 1] = 0$
- $β > β_c(l)$ : $\lim_{l→∞} P[X_{α,ω} ≥ 1] = 1$

3. Special Properties of Submodular Landscapes

Universal accessibility: Any peak can be reached from all its subsets and supersets
Large basins of attraction: Basin of attraction size has exponential lower bounds, far exceeding linear lower bounds in general cases

Case Studies

Submodularity of Fisher's Geometric Model

For Fisher's geometric model with one-dimensional phenotype:

Genotype-phenotype mapping: $z(σ) = \sum_{i=1}^L a_i σ_i$ ( $a_i > 0$ )
Phenotype-fitness mapping: $Φ(z)$ is a concave function
Result: Produces submodular adaptive landscape with accessibility properties

Connection with Hopfield Model

By choosing $Φ = -z^2$ , establishes mapping with antiferromagnetic Hopfield model: $H = \sum_{i,j} J_{ij}η_iη_j + \sum_i h_iη_i$ where $J_{ij} = \frac{1}{4}a_ia_j$ , $h_i = -\frac{1}{2}(\sum_j a_j)a_i$

Historical Development

Wright (1932): Introduced the concept of adaptive landscapes, emphasizing their ruggedness
Fisher (1958): Geometric model predicting smoothness of high-dimensional landscapes
Kauffman (1987): NK model, a landscape model with tunable ruggedness

Modern Research

Empirical studies: Recent two decades of experimental research on adaptive landscapes in real biological systems
Mathematical theory: Applications of percolation theory, random geometry, and combinatorial optimization to adaptive landscapes
Computational methods: High-throughput experimental techniques enabling large-scale adaptive landscape studies

Interdisciplinary Connections

Statistical physics: Equivalent to Random Energy Model in spin glass theory
Computer science: Related to submodular function maximization in combinatorial optimization
Machine learning: Potential connections with research on neural network loss landscapes

Conclusions and Discussion

Main Conclusions

Resolution of Wright-Fisher debate: Both viewpoints are correct at different levels
Universality of accessibility phase transitions: Universal phase transition phenomena exist in random landscapes
Important role of submodularity: Submodularity provides strong accessibility guarantees for adaptive landscapes
Large basin of attraction phenomenon: Submodular landscapes possess adaptive basins of attraction of exponential size

Limitations

Model simplification: Binary sequence assumption limits application in multi-allelic systems
Continuous fitness assumption: Non-degenerate fitness function assumption may not hold in practice
Theory-practice gap: Correspondence between theoretical predictions and actual biological systems requires further verification

Future Directions

Machine learning applications: Apply submodularity concepts to deep learning loss landscape analysis
Multi-dimensional phenotypes: Extend to more general multi-dimensional Fisher geometric models
Empirical validation: Verify theoretical predictions through high-throughput experiments
Dynamic environments: Study adaptive landscape evolution in changing environments

In-Depth Evaluation

Strengths

Theoretical depth: Provides a rigorous mathematical framework for adaptive landscape research
Interdisciplinary perspective: Successfully connects concepts from biology, physics, and mathematics
Practical value: Provides important insights for understanding actual evolutionary processes
Mathematical rigor: All major results have rigorous mathematical proofs

Weaknesses

Limited empirical support: Primarily theoretical work lacking extensive empirical data support
Model limitations: Certain assumptions may not hold in actual biological systems
Computational complexity: Computational verification of certain theoretical results remains difficult for large-scale systems

Impact

Theoretical contribution: Provides important mathematical tools for adaptive landscape theory
Methodological innovation: Technical innovations such as extended adaptive landscape methods have broad application prospects
Interdisciplinary influence: Likely to impact statistical physics, computer science, and other fields

Applicable Scenarios

Evolutionary biology: Understanding path dependence in natural selection processes
Protein engineering: Guiding directed evolution experiment design
Optimization algorithms: Inspiring new global optimization algorithm design
Machine learning: Understanding landscape structure in neural network training

References

This paper cites 68 important references spanning from classical foundational work by Wright and Fisher to recent empirical research, reflecting the complete development history of the field. Key references include:

Wright, S. (1932): Original concept of adaptive landscapes
Fisher, R.A. (1958): Geometric model proposal
Kauffman & Levin (1987): House of Cards model
Crona et al. (2023): Geometric classification of universal epistasis
Krug & Oros (2024): Systematic study of submodularity and accessibility

This paper provides important theoretical foundations for adaptive landscape research, particularly through the introduction of submodularity concepts, offering new perspectives for understanding evolution in complex adaptive systems. Its interdisciplinary approach and rigorous mathematical analysis make it an important contribution to the field.