The manifold hypothesis asserts that data of interest in high-dimensional ambient spaces, such as image data, lies on unknown low-dimensional submanifolds. Diffusion models (DMs) -- which operate by convolving data with progressively larger amounts of Gaussian noise and then learning to revert this process -- have risen to prominence as the most performant generative models, and are known to be able to learn distributions with low-dimensional support. For a given datum in one of these submanifolds, we should thus intuitively expect DMs to have implicitly learned its corresponding local intrinsic dimension (LID), i.e. the dimension of the submanifold it belongs to. Kamkari et al. (2024b) recently showed that this is indeed the case by linking this LID to the rate of change of the log marginal densities of the DM with respect to the amount of added noise, resulting in an LID estimator known as FLIPD. LID estimators such as FLIPD have a plethora of uses, among others they quantify the complexity of a given datum, and can be used to detect outliers, adversarial examples and AI-generated text. FLIPD achieves state-of-the-art performance at LID estimation, yet its theoretical underpinnings are incomplete since Kamkari et al. (2024b) only proved its correctness under the highly unrealistic assumption of affine submanifolds. In this work we bridge this gap by formally proving the correctness of FLIPD under realistic assumptions. Additionally, we show that an analogous result holds when Gaussian convolutions are replaced with uniform ones, and discuss the relevance of this result.
On Convolutions, Intrinsic Dimension, and Diffusion Models
- Paper ID: 2506.20705
- Title: On Convolutions, Intrinsic Dimension, and Diffusion Models
- Authors: Kin Kwan Leung, Rasa Hosseinzadeh, Gabriel Loaiza-Ganem (Layer 6 AI)
- Classification: cs.LG cs.AI stat.ML
- Publication Time/Venue: Transactions on Machine Learning Research (10/2025)
- Paper Link: https://arxiv.org/abs/2506.20705
The manifold hypothesis posits that data of interest (such as image data) in a high-dimensional ambient space lies on an unknown low-dimensional submanifold. Diffusion models (DMs), which progressively convolve data with increasing amounts of Gaussian noise and learn to reverse this process, have become the highest-performing generative models and are known to be capable of learning distributions with low-dimensional support. For a given data point on such a submanifold, we intuitively expect that DMs have implicitly learned its corresponding local intrinsic dimension (LID), i.e., the dimension of the submanifold to which it belongs. Kamkari et al. (2024b) recently showed this to be the case by connecting LID to the rate of change of the DM's log marginal density with respect to the amount of added noise, yielding an LID estimator called FLIPD. FLIPD achieves state-of-the-art performance in LID estimation, but its theoretical foundation is incomplete, as Kamkari et al. (2024b) proved its correctness only under the highly unrealistic assumption of affine submanifolds. This paper bridges this gap by formally proving the correctness of FLIPD under realistic assumptions. Furthermore, the authors prove that an analogous result holds when Gaussian convolution is replaced by uniform convolution, and discuss the relevance of this result.
The core problem addressed in this paper is to provide a rigorous theoretical foundation for FLIPD, the diffusion-model-based LID estimator of Kamkari et al. (2024b). Specifically:
- Theoretical Deficiency: Although FLIPD proposed by Kamkari et al. demonstrates excellent practical performance, its theoretical proof only holds under the unrealistic assumption of affine submanifolds
- Practical Requirement: There is a need to prove FLIPD's correctness on general embedded submanifolds, aligning its theoretical foundation with practical applications
Local intrinsic dimension (LID) estimation has important applications in machine learning:
- Complexity Quantification: Effectively quantifying image complexity
- Anomaly Detection: Detecting outliers, adversarial examples, and AI-generated text
- Generalization Prediction: LID estimation of neural network representations can predict generalization performance
- Memorization Detection: Identifying model memorization phenomena
Existing LID estimators suffer from the following problems:
- High Computational Complexity: Traditional estimators rely on pairwise distance calculations and scale poorly with dataset size and ambient dimension
- Curse of Dimensionality: Performance degradation in high-dimensional spaces
- Incomplete Theory: Although FLIPD demonstrates superior performance, its correctness had been proven only for affine submanifolds
- Theoretical Refinement: Formally proves FLIPD's correctness under realistic assumptions, extending it from affine submanifolds to general smooth embedded submanifolds
- Result Extension: Proves that similar results hold when Gaussian convolution is replaced by uniform convolution
- Mathematical Rigor: Provides complete mathematical proofs, including sophisticated differential geometric analysis
- Practical Value: Provides theoretical guarantees for FLIPD's reliability in practical applications
The core of this paper is proving that the following key equation holds under general conditions:
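$$\mathrm{LID}(x) = D + \lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta)$$
where:
- $\varrho_N(x, \delta)$ is the convolution of the data distribution with Gaussian noise of log standard deviation $\delta$
- $D$ is the ambient space dimension
- $\delta \to -\infty$ corresponds to the limit as noise approaches zero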
Theorem 1 (Gaussian Case): Let $M$ be a smooth $d$-dimensional embedded submanifold of $\mathbb{R}^D$, and let $p$ be a probability density function on $M$. For $x \in M$, if $p$ is continuous at $x$, $p(x) > 0$, and $p$ satisfies finite second moment conditions, then:
$$\lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta) = d - D$$
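As a numerical sanity check of Theorem 1, the sketch below evaluates the Gaussian-convolved density for a toy case that is not affine: data distributed uniformly on the unit circle in the plane (so d = 1, D = 2). The toy distribution, the query point, the quadrature grid, and the finite-difference step are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

D = 2  # ambient dimension; the unit circle has intrinsic dimension d = 1


def log_rho_gaussian(delta, n_grid=200_000):
    """log rho_N(x, delta) at x = (1, 0) for data uniform on the unit circle.

    The noise standard deviation is e^delta. The integral over the circle is
    computed as a mean over an equispaced periodic grid of angles.
    """
    sigma2 = np.exp(2.0 * delta)
    theta = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    sq_dist = 2.0 - 2.0 * np.cos(theta)  # |x - x'|^2 for x' = (cos t, sin t)
    gauss = np.exp(-sq_dist / (2.0 * sigma2)) / (2.0 * np.pi * sigma2)
    return np.log(gauss.mean())  # mean over theta = integral against 1/(2 pi)


def lid_estimate(delta, h=1e-3):
    """D plus a central finite difference of log rho_N with respect to delta."""
    slope = (log_rho_gaussian(delta + h) - log_rho_gaussian(delta - h)) / (2.0 * h)
    return D + slope


for delta in (0.0, -1.0, -2.0, -3.0):
    print(f"delta = {delta:5.1f}   D + d/d(delta) log rho_N = {lid_estimate(delta):.4f}")
```

As the log standard deviation decreases, the printed values should approach 1, the dimension of the circle, even though the circle is curved and therefore outside the affine setting.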
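Theorem 2 (Uniform Case): An analogous result holds when the Gaussian convolution is replaced by a uniform one, with $\varrho_U$ denoting the convolution of the data distribution with uniform noise:
$$\lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_U(x, \delta) = d - D$$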
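The uniform case can be worked out in closed form for a simple curved example. The block below assumes, for illustration only, that the data is uniform on the unit circle in the plane and that the uniform noise is supported on a ball of radius $r = e^{\delta}$; this parametrization is an assumption chosen to match the $e^{\delta(d-D)}$ scaling and may differ from the paper's exact convention.

```latex
% Assumed toy setting: M = unit circle in R^2 (d = 1, D = 2), p uniform on M,
% X ~ p, uniform noise on the ball of radius r = e^{\delta} with r < 2.
% Points of M within distance r of x span an arc of angle 4 \arcsin(r/2).
\begin{align*}
\varrho_U(x,\delta)
  &= \frac{\Pr\left(\lVert x - X \rVert \le r\right)}{\pi r^2}
   = \frac{\tfrac{2}{\pi}\arcsin(r/2)}{\pi r^2}, \\
\frac{\partial}{\partial\delta}\log\varrho_U(x,\delta)
  &= \frac{r/2}{\sqrt{1 - r^2/4}\,\arcsin(r/2)} - 2
  \;\xrightarrow[\;\delta\to-\infty\;]{}\; 1 - 2 = d - D .
\end{align*}
```

The first term tends to 1 because $\arcsin(r/2) \approx r/2$ for small $r$, which is exactly the statement that the circle looks one-dimensional at small scales.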
The core idea of the proof exploits the decomposition properties of Gaussian and uniform densities:
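- Gaussian Case: Utilizing the relationship
$$\mathcal{N}_D(x - x'; 0, \delta) = (2\pi)^{\frac{d-D}{2}} e^{\delta(d-D)} \mathcal{N}_d(x - x'; 0, \delta)$$
- Uniform Case: Utilizing a similar decomposition
$$\mathcal{U}_D(x; \mu, \delta) = C_D^{\mathcal{U}} \left(C_d^{\mathcal{U}}\right)^{-1} e^{\delta(d-D)} \mathcal{U}_d(x; \mu, \delta)$$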
- Limit Analysis: Through refined differential geometric analysis, proving that the limit of the derivative converges to the expected value
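To see why the Gaussian decomposition isolates the $d - D$ term, the following sketch spells out the intermediate steps. It assumes that $\mathcal{N}_d(x - x'; 0, \delta)$ denotes the $d$-dimensional Gaussian normalization applied to the ambient squared distance $\lVert x - x' \rVert^2$, and it writes integration over $M$ informally; both are notational assumptions, and the last step is only a heuristic summary of what the paper proves rigorously.

```latex
\begin{align*}
\varrho_N(x,\delta)
  &= \int_M p(x')\,\mathcal{N}_D(x - x'; 0, \delta)\,\mathrm{d}x'
   = (2\pi)^{\frac{d-D}{2}} e^{\delta(d-D)}
     \int_M p(x')\,\mathcal{N}_d(x - x'; 0, \delta)\,\mathrm{d}x', \\
\log\varrho_N(x,\delta)
  &= \tfrac{d-D}{2}\log(2\pi) + \delta\,(d-D)
     + \log\int_M p(x')\,\mathcal{N}_d(x - x'; 0, \delta)\,\mathrm{d}x', \\
\frac{\partial}{\partial\delta}\log\varrho_N(x,\delta)
  &= (d-D)
     + \frac{\partial}{\partial\delta}
       \log\int_M p(x')\,\mathcal{N}_d(x - x'; 0, \delta)\,\mathrm{d}x' .
\end{align*}
% Theorem 1 is therefore equivalent to the last derivative vanishing as
% \delta \to -\infty. Heuristically the remaining integral tends to p(x) > 0,
% since \mathcal{N}_d acts like an approximate identity on the d-dimensional
% manifold; controlling its derivative is the role of the limit analysis.
```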
This paper is primarily a theoretical work and does not include large-scale experimental validation. The authors focus on:
- Mathematical Proofs: Providing rigorous theoretical analysis
- Condition Verification: Ensuring that the proposed conditions are reasonable in practical applications
- Extensibility Analysis: Extending results from single submanifolds to disjoint unions of submanifolds
The paper verifies the completeness of the theory through the following corollaries:
Corollary 1: For disjoint unions of submanifolds $M = \bigcup_j M_j$, under appropriate separation conditions, the results still hold.
Corollary 2: Similar extensions for the uniform case also hold.
These theoretical results directly imply:
- FLIPD Correctness: When the score function is learned perfectly, $\lim_{\delta \to -\infty} \mathrm{FLIPD}(x; \delta) = \mathrm{LID}(x)$
- Negative Value Interpretation: FLIPD producing negative estimates can only be attributed to imperfect score function learning, not theoretical defects
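The role of the score function can be illustrated with a small sketch. It assumes a variance-exploding parametrization with noise standard deviation $e^{\delta}$, under which the heat equation gives $\partial_\delta \log \varrho_N = e^{2\delta}\left(\operatorname{tr}\nabla s + \lVert s \rVert^2\right)$ for the score $s = \nabla \log \varrho_N$. The estimator form below follows from that identity and is not quoted from the paper; the function names are hypothetical. The toy data (a standard Gaussian on a 2-dimensional coordinate subspace of a 5-dimensional space) is an affine case, chosen only because its score is known exactly, so it illustrates the perfect-score statement rather than the new theorem.

```python
import numpy as np

D, d = 5, 2  # ambient and intrinsic dimension of the assumed toy example


def covariance_diag(delta):
    """Diagonal covariance of the noised data: N(0, I_d) on the first d axes
    plus isotropic Gaussian noise of variance e^(2 delta) on all D axes."""
    data_var = np.concatenate([np.ones(d), np.zeros(D - d)])
    return data_var + np.exp(2.0 * delta)


def score(x, delta):
    """Exact score (gradient of the log density) of the noised Gaussian."""
    return -x / covariance_diag(delta)


def score_jacobian_trace(delta):
    """Trace of the Jacobian of the exact score (constant in x here)."""
    return -np.sum(1.0 / covariance_diag(delta))


def flipd_like_estimate(x, delta):
    """D + sigma^2 * (tr(grad s) + |s|^2), evaluated with the exact score."""
    sigma2 = np.exp(2.0 * delta)
    s = score(x, delta)
    return D + sigma2 * (score_jacobian_trace(delta) + s @ s)


x = np.array([0.3, -1.2, 0.0, 0.0, 0.0])  # a point on the data subspace
for delta in (-1.0, -3.0, -6.0):
    print(f"delta = {delta:5.1f}   estimate = {flipd_like_estimate(x, delta):.6f}")
```

With the exact score the estimate approaches d = 2 as the noise vanishes. In practice the score is a learned network, and any deviation from this limit, including negative values, reflects approximation error in that network or the use of a finite noise level.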
- Traditional Methods: Statistical estimators based on pairwise distances or angles (Fukunaga & Olsen, 1971; Levina & Bickel, 2004, etc.)
- Generative Model Methods:
  - Variational autoencoder approaches (Zheng et al., 2022)
  - Normalizing flow approaches (Tempczyk et al., 2022)
  - Diffusion model approaches (Stanczuk et al., 2024; Horvat & Pfister, 2024)
- Stanczuk et al. Method: Also based on diffusion models but requires more function evaluations
- Horvat & Pfister Method: Requires modifying the DM training process
- FLIPD Advantages: Compatible with off-the-shelf state-of-the-art DMs (e.g., Stable Diffusion)
- Theoretical Refinement: Successfully extends FLIPD's theoretical foundation from affine submanifolds to general smooth embedded submanifolds
- Method Generality: Proves similar results for both Gaussian and uniform convolution cases
- Practical Value: Provides mathematical guarantees for FLIPD's reliability in practical applications
- Perfect Score Function Assumption: Theoretical results assume perfect score function learning; approximation errors exist in practice
- Condition Restrictions: Requires satisfaction of continuity and finite second moment conditions
- Connectivity Requirements: Finite second moment conditions implicitly require manifold connectivity
- Error Analysis: Quantifying the impact of score function learning errors on LID estimation
- Flow Matching Extension: Extending results to flow matching methods
- Distribution Extension: Investigating similar results under other noise distributions
- Theoretical Rigor: Provides complete mathematical proofs using advanced differential geometric tools
- Practical Value: Provides theoretical foundation for an already high-performing method
- Complete Results: Proves not only the Gaussian case but also extends to the uniform distribution case
- Clear Presentation: Complex mathematical content is well-organized and easy to understand
- Lack of Experimental Validation: As theoretical work, lacks experimental verification of theoretical predictions
- Condition Restrictions: Some assumed conditions may not be fully satisfied in practical applications
- Insufficient Error Analysis: Lacks in-depth analysis of error sources in practical applications
- Academic Contribution: Provides important theoretical foundation for the intersection of generative models and manifold learning
- Practical Value: Enhances the credibility of FLIPD in practical applications
- Inspirational Value: Provides theoretical framework for other geometric analysis methods based on generative models
These theoretical results apply to:
- High-Dimensional Data Analysis: Particularly for data following the manifold hypothesis
- Anomaly Detection: Using LID for outlier detection
- Generative Model Evaluation: Assessing the capability of generative models to learn data manifolds
- Neural Network Analysis: Analyzing geometric properties of network representations
The paper cites extensive related work, including:
- Kamkari et al. (2024b): Original work proposing FLIPD
- Classical LID estimation methods: Levina & Bickel (2004), Facco et al. (2017), etc.
- Diffusion model theory: Song et al. (2021), De Bortoli (2022), etc.
- Manifold learning related: Lee (2012, 2018) and other differential geometry textbooks
Summary: This is a high-quality theoretical paper that provides a rigorous mathematical foundation for FLIPD, an LID estimator that already performs well in practice. Although it lacks experimental validation, its theoretical contributions are valuable for understanding the relationship between generative models and manifold geometry.