2025-11-16T21:22:11.887650

Rare event probabilities in Random Geometric Graphs

Deka, Luo, Wu
In this paper, we study rare events in spherical and Gaussian random geometric graphs in high dimensions. In these models, the vertices correspond to points sampled uniformly at random on the $d$ dimensional unit sphere or correspond to $d$ dimensional standard Gaussian vectors, and edges are added between two vertices if the inner-product between their corresponding points are greater than a threshold $t_p$, chosen such that the probability of having an edge is equal to $p$. We focus on two problems: (a) the probability that the RGG is a complete graph, and (b) the probability of observing an atypically large number of edges. We obtain asymptotically exponential decay rates depending on $n$ and $d$ of the probabilities of these rare events through a combination of geometric and probabilistic arguments.
academic

Rare Event Probabilities in Random Geometric Graphs

Basic Information

  • Paper ID: 2510.09196
  • Title: Rare event probabilities in Random Geometric Graphs
  • Authors: Prabhanka Deka (Beijing International Center for Mathematical Research, Peking University), Fangzhou Luo (School of Mathematical Sciences, Peking University), Baichuan Wu (School of Mathematical Sciences, Peking University)
  • Classification: math.PR (Probability Theory)
  • Publication Date: October 10, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.09196

Abstract

This paper investigates rare events in high-dimensional random geometric graphs on spheres and Gaussian random geometric graphs. In these models, vertices correspond to uniformly random points on the dd-dimensional unit sphere or standard Gaussian vectors in Rd\mathbb{R}^d, with edges added between two vertices when their inner product exceeds a threshold tpt_p, where tpt_p is chosen such that the edge probability equals pp. The paper focuses on two problems: (a) the probability that a random geometric graph is complete, and (b) the probability of observing an anomalously large number of edges. Through a combination of geometric and probabilistic arguments, the authors obtain asymptotic exponential decay rates for these rare event probabilities, which depend on the number of vertices nn and dimension dd.

Research Background and Motivation

Problem Definition

The core problems investigated in this paper concern analyzing two classes of rare events in high-dimensional random geometric graphs:

  1. Complete graph probability: The probability that a random geometric graph becomes complete (edges exist between all pairs of vertices)
  2. Large deviations in edge count: The probability of observing an anomalously large number of edges

Research Significance

  1. Theoretical importance: Random geometric graphs serve as fundamental tools for modeling complex systems, with widespread applications in computer science, biology, sociology, and physics
  2. Practical applications:
    • Anomaly detection and hypothesis testing
    • Analysis of clique structures in high-dimensional data
    • Robustness analysis of geometric network models
    • Inner product-based similarity measures in neural networks and kernel methods

Limitations of Existing Research

  • Previous work primarily fixed dimension dd while letting the number of vertices nn tend to infinity
  • Lack of systematic analysis of high-dimensional dense random geometric graphs
  • Complex dependencies between edges make analysis challenging

Core Contributions

  1. Established a complete theoretical framework: Provided a unified analytical approach for rare events in spherical and Gaussian random geometric graphs
  2. Obtained precise decay rates: Provided upper and lower bounds for complete graph probabilities and edge count large deviations under different relationships between nn and dd
  3. Developed innovative technical tools:
    • Application of spherical symmetric rearrangement techniques
    • Coupling methods between the two models
    • Organic combination of geometric and probabilistic arguments
  4. Revealed dimensional effects: Discovered that in high dimensions, random geometric graphs behave similarly to Erdős-Rényi models, while exhibiting different characteristics in low dimensions

Methodology Details

Model Definitions

Spherical Random Geometric Graphs (SRGG)

  • Vertices: (X1,,Xn)(X_1, \ldots, X_n) independently and identically uniformly distributed on the dd-dimensional unit sphere Sd1S^{d-1}
  • Edges: An edge exists between vertices ii and jj when Xi,Xjtp\langle X_i, X_j \rangle \geq t_p
  • Threshold: tpt_p is chosen such that P(X1,X2tp)=pP(\langle X_1, X_2 \rangle \geq t_p) = p

Gaussian Random Geometric Graphs (GRGG)

  • Vertices: (X~1,,X~n)(\tilde{X}_1, \ldots, \tilde{X}_n) independently and identically distributed as standard dd-dimensional normal random variables
  • Edges: An edge exists between vertices ii and jj when X~i,X~jt~p\langle \tilde{X}_i, \tilde{X}_j \rangle \geq \tilde{t}_p
  • Threshold: t~p\tilde{t}_p is chosen such that P(X~1,X~2t~p)=pP(\langle \tilde{X}_1, \tilde{X}_2 \rangle \geq \tilde{t}_p) = p

Core Technical Methods

1. Model Coupling Technique

By observing that X~/X~\tilde{X}/\|\tilde{X}\| is uniformly distributed on the sphere, the authors establish a coupling relationship between the two models:

Lemma 3.2: For fixed p<1/2p < 1/2 and small δ>0\delta > 0, there exist constants cδ,Δδc_\delta, \Delta_\delta such that: Pn,d(E~(n,d,p))exp(cδdn)+2nPn,d(E((1δ)n,d,q))P_{n,d}(\tilde{E}(n,d,p)) \leq \exp(-c_\delta dn) + 2^n P_{n,d}(E((1-\delta)n, d, q))

2. Symmetric Rearrangement Technique

Utilizing symmetric rearrangement on the sphere to handle complex geometric constraints:

Theorem 3.4: For functions f1,,fnf_1, \ldots, f_n on the sphere and increasing functions Ki,jK_{i,j}: I[f1,,fn]I[f1,,fn]I[f_1, \ldots, f_n] \leq I[f_1^*, \ldots, f_n^*] where ff^* denotes the symmetric rearrangement of ff.

3. Threshold Asymptotic Behavior

Lemma 3.1: For fixed p(0,1)p \in (0,1), as dd \to \infty:

  • t~p=(Φ1(p)+o(1))d\tilde{t}_p = (\Phi^{-1}(p) + o(1))\sqrt{d}
  • tp=(Φ1(p)+o(1))/dt_p = (\Phi^{-1}(p) + o(1))/\sqrt{d}

Main Proof Strategies

Lower Bound Proofs

  1. Erdős-Rényi type lower bounds: Proving P(E)p(n2)P(E) \geq p^{\binom{n}{2}} through geometric arguments
  2. Biasing arguments: Imposing small biases on the first coordinate of all vectors in the Gaussian model
  3. Dimension-dependent bounds: When logn<εd\log n < \varepsilon d, P(E~)exp(Cndlogn)P(\tilde{E}) \geq \exp(-Cn\sqrt{d \log n})

Upper Bound Proofs

  1. Bayesian arguments: Utilizing properties of the statistic S=ijX~i,X~jS = \sum_{i \neq j} \langle \tilde{X}_i, \tilde{X}_j \rangle
  2. Spherical cap process analysis: Transforming complex convex set processes into spherical cap processes through symmetric rearrangement
  3. Moment generating function methods: Applying exponential Markov inequalities to edge count deviation problems

Experimental Results

Main Results on Complete Graph Probability

According to Theorem 2.1 and Theorem 2.2, the complete graph probability exhibits different decay rates across various dimensional regimes:

Dimensional RegimeLower BoundUpper Bound
dn2d \gtrsim n^2Cn2-Cn^2cn2-cn^2
n2/logndn2n^2/\log n \lesssim d \lesssim n^2Cnd-Cn\sqrt{d}cn2-cn^2
n4/3+o(1)dn2/lognn^{4/3+o(1)} \lesssim d \lesssim n^2/\log nCnd-Cn\sqrt{d}cndlogn-cn\sqrt{d \log n}
logndn4/3+o(1)\log n \lesssim d \lesssim n^{4/3+o(1)}Cndlogn-Cn\sqrt{d \log n}cndlogn-cn\sqrt{d \log n}
dlognd \lesssim \log nCnd-Cndcnd-cnd

Main Results on Large Deviations in Edge Count

Theorems 2.3 and 2.4 provide precise bounds for large deviations in edge count:

For the event Iε:={E(G)(1+ε)(n2)p}I_\varepsilon := \{|E(G)| \geq (1+\varepsilon)\binom{n}{2}p\}: exp(cmin(n2,nd))P(Iε)exp(Cmin(n2,nd))\exp(-c\min(n^2, n\sqrt{d})) \leq P(I_\varepsilon) \leq \exp(-C\min(n^2, n\sqrt{d}))

Key Findings

  1. Dimensional threshold effects: When dnd \gtrsim \sqrt{n}, the decay rate is exp(Ω(n2))\exp(-\Omega(n^2)), similar to Erdős-Rényi models
  2. Persistence of geometric effects: When dnd \lesssim \sqrt{n}, the decay rate is exp(Ω(nd))\exp(-\Omega(n\sqrt{d})), reflecting the influence of geometric structure
  3. Matching bounds: Matching upper and lower bounds are obtained in the regimes dn2d \geq n^2 and dn4/3+o(1)d \leq n^{4/3+o(1)}

High-Dimensional Random Geometric Graphs

  • Devroye et al. (2011): Investigated clique numbers in the regime dlog(n)d \gg \log(n)
  • Bubeck et al. (2016): Established a phase transition in geometric detectability: geometrically detectable when dn3d \ll n^3, undetectable when dn3d \gg n^3

Large Deviations Theory

  • Chatterjee & Harel (2020): Studied large deviations of edge counts in random geometric graphs generated by Poisson point processes
  • Schreiber & Yukich (2005): Established large deviations principles for functionals of spatial point processes

Symmetric Rearrangement Techniques

  • Draghici (2005): Developed rearrangement inequality theory on spheres, providing foundational techniques for this paper's core methods

Technical Innovations

1. Innovative Application of Coupling Techniques

Establishing connections between the two models through the spherical projection X~/X~\tilde{X}/\|\tilde{X}\|, avoiding the complexity of redundant analysis.

2. Probabilistic Applications of Geometric Tools

Innovatively applying symmetric rearrangement—a geometric analysis tool—to probabilistic problems, particularly in handling complex edge dependencies.

3. Multi-Scale Analysis Framework

Establishing a unified analytical framework for different (n,d)(n,d) relationships, revealing transition behavior from low to high dimensions.

Limitations and Future Directions

Limitations

  1. Technical constraints: Gaps exist between upper and lower bounds in the intermediate regime n4/3dn2n^{4/3} \lesssim d \lesssim n^2
  2. Model constraints: Analysis primarily focuses on p1/2p \leq 1/2, with relatively limited treatment of p>1/2p > 1/2
  3. Method limitations: Information loss in the symmetric rearrangement process results in suboptimal bounds

Future Research Directions

  1. Refining theoretical bounds: Closing gaps in the intermediate dimensional regime
  2. Extending models: Considering more general values of pp and other geometric metrics
  3. Algorithmic applications: Applying theoretical results to practical network analysis and machine learning problems
  4. Dynamic models: Investigating rare events in time-varying random geometric graphs

In-Depth Evaluation

Strengths

  1. Theoretical depth: Establishes a complete mathematical framework combining deep results from geometry, probability theory, and analysis
  2. Technical innovation: The application of symmetric rearrangement techniques to random graph theory is pioneering
  3. Result completeness: Precise upper and lower bounds across multiple dimensional regimes demonstrate the problem's complexity
  4. Method generality: Developed techniques can be generalized to other geometric random graph problems

Weaknesses

  1. Completeness defects: Unmatched bounds in the intermediate regime affect result completeness
  2. Practical limitations: Primarily theoretical results with limited numerical verification and practical applications
  3. Technical complexity: Complex proof techniques may limit the generalizability of results

Academic Impact

  1. Theoretical contribution: Provides important theoretical foundations for high-dimensional random geometric graph theory
  2. Methodological contribution: Application of symmetric rearrangement techniques provides new analytical tools for related problems
  3. Inspirational value: Reveals the critical role of dimension in random geometric graphs, providing direction for subsequent research

Applicable Scenarios

  1. Theoretical research: Random graph theory, geometric probability, high-dimensional phenomena
  2. Application domains: Network science, kernel methods in machine learning, high-dimensional statistics
  3. Algorithm design: Algorithm analysis and optimization based on geometric graphs

Conclusion

This paper achieves important theoretical breakthroughs in analyzing rare events in random geometric graphs. By innovatively combining symmetric rearrangement techniques with probabilistic methods, it provides systematic analysis of complete graph probabilities and edge count large deviations in high-dimensional spherical and Gaussian random geometric graphs. Although there remains room for improvement in certain technical details, the theoretical framework established and the profound results obtained provide a solid foundation for the development of this field, possessing significant academic value and inspirational significance.