Hypothesis testing for the dimension of random geometric graph
Yuan, Yu
Random geometric graphs (RGGs) offer a powerful tool for analyzing the geometric and dependence structures in real-world networks. For example, it has been observed that RGGs are a good model for protein-protein interaction networks. In RGGs, nodes are randomly distributed over an $m$-dimensional metric space, and edges connect the nodes if and only if their distance is less than some threshold. When fitting RGGs to real-world networks, the first step is probably to input or estimate the dimension $m$. However, it is not clear whether the prespecified dimension is equal to the true dimension. In this paper, we investigate this problem using hypothesis testing. Under the null hypothesis, the dimension is equal to a specific value, while the alternative hypothesis asserts the dimension is not equal to that value. We propose the first statistical test. Under the null hypothesis, the proposed test statistic converges in law to the standard normal distribution, and under the alternative hypothesis, the test statistic is unbounded in probability. We derive the asymptotic distribution by leveraging the asymptotic theory of degenerate U-statistics with kernel function dependent on the number of nodes. This approach differs significantly from prevailing methods used in network hypothesis testing problems. Moreover, we also propose an efficient approach to compute the test statistic based on the adjacency matrix. Simulation studies show that the proposed test performs well. We also apply the proposed test to multiple real-world networks to test their dimensions.
Random geometric graphs (RGGs) provide powerful tools for analyzing geometric and dependency structures in real-world networks. In an RGG, nodes are randomly distributed in an $m$-dimensional metric space, and two nodes are connected by an edge if and only if the distance between them is below a certain threshold. When fitting RGGs to real networks, a primary step is to input or estimate the dimension $m$. However, it remains unclear whether the preset dimension equals the true dimension. This paper addresses the question through hypothesis testing: the null hypothesis states that the dimension equals a specific value, while the alternative hypothesis states that it differs from that value. The authors propose the first statistical test for this problem. The test statistic converges in distribution to the standard normal distribution under the null hypothesis and is unbounded in probability under the alternative hypothesis.
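The RGG construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the paper's code. The function name `sample_rgg`, the uniform distribution on the unit cube $[0, 1]^m$, and the Euclidean metric are all assumptions made for the example, since the summary only specifies an $m$-dimensional metric space and a distance threshold.

```python
import math
import random


def sample_rgg(n, m, r, seed=None):
    """Sample a random geometric graph on n nodes.

    Assumptions (illustrative only): node positions are uniform on the
    unit cube [0, 1]^m and distances are Euclidean. Two distinct nodes
    are joined by an edge if and only if their distance is below r.
    """
    rng = random.Random(seed)
    # Latent positions: one point in [0, 1]^m per node.
    points = [[rng.random() for _ in range(m)] for _ in range(n)]
    # Symmetric 0/1 adjacency matrix with a zero diagonal.
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                adjacency[i][j] = adjacency[j][i] = 1
    return points, adjacency


# Example: 50 nodes in dimension m = 3 with threshold r = 0.4.
points, adjacency = sample_rgg(n=50, m=3, r=0.4, seed=0)
num_edges = sum(map(sum, adjacency)) // 2
```

In practice only the adjacency matrix is observed, and the latent positions and the dimension $m$ are unknown, which is why a test for $m$ is needed.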
Core Problem: When fitting random geometric graphs to real networks, how can one verify whether the preset or estimated dimension m equals the true dimension?
Practical Need: Existing studies typically assume a dimension directly (e.g., $m = 2, 3, 4$ in protein interaction networks) but lack a statistical method for verifying that choice.
Application Importance: RGGs are widely applied to protein interaction networks, social networks, brain networks, and other domains.
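Because the test statistic is asymptotically standard normal under the null hypothesis and unbounded in probability under the alternative, a decision rule can be sketched as follows. This is a hedged illustration, not the paper's procedure. The statistic itself is not reproduced here, so `t_stat` is a placeholder for its computed value. The two-sided rejection region, the function name `dimension_test_decision`, and the default level `alpha=0.05` are assumptions.

```python
from statistics import NormalDist


def dimension_test_decision(t_stat, alpha=0.05):
    """Decide whether to reject H0: m = m0 at level alpha.

    Assumes t_stat is approximately N(0, 1) under H0 and uses a
    two-sided rejection region (an assumption, not stated in the source).
    Returns (reject, p_value).
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    critical = standard_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - standard_normal.cdf(abs(t_stat)))
    return abs(t_stat) > critical, p_value


# Example with a hypothetical statistic value.
reject, p_value = dimension_test_decision(2.5)
```

A large absolute value of the statistic is evidence against the hypothesized dimension, which matches the unboundedness of the statistic under the alternative.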
This paper cites 40 references covering random geometric graph theory, network analysis, and statistical theory, providing a solid theoretical foundation. Key references include Fan & Li (1996) on U-statistics theory, Higham et al. (2008) on protein network applications, and recent related survey articles.
Overall Assessment: This is a high-quality statistical methodology paper that is strong in theoretical innovation, method design, and empirical validation. Despite some limitations, it makes an important contribution to network analysis, with both academic value and practical significance.