On Convolutions, Intrinsic Dimension, and Diffusion Models

Leung, Hosseinzadeh, Loaiza-Ganem

The manifold hypothesis asserts that data of interest in high-dimensional ambient spaces, such as image data, lies on unknown low-dimensional submanifolds. Diffusion models (DMs) -- which operate by convolving data with progressively larger amounts of Gaussian noise and then learning to revert this process -- have risen to prominence as the most performant generative models, and are known to be able to learn distributions with low-dimensional support. For a given datum in one of these submanifolds, we should thus intuitively expect DMs to have implicitly learned its corresponding local intrinsic dimension (LID), i.e. the dimension of the submanifold it belongs to. Kamkari et al. (2024b) recently showed that this is indeed the case by linking this LID to the rate of change of the log marginal densities of the DM with respect to the amount of added noise, resulting in an LID estimator known as FLIPD. LID estimators such as FLIPD have a plethora of uses, among others they quantify the complexity of a given datum, and can be used to detect outliers, adversarial examples and AI-generated text. FLIPD achieves state-of-the-art performance at LID estimation, yet its theoretical underpinnings are incomplete since Kamkari et al. (2024b) only proved its correctness under the highly unrealistic assumption of affine submanifolds. In this work we bridge this gap by formally proving the correctness of FLIPD under realistic assumptions. Additionally, we show that an analogous result holds when Gaussian convolutions are replaced with uniform ones, and discuss the relevance of this result.

Basic Information

Paper ID: 2506.20705
Title: On Convolutions, Intrinsic Dimension, and Diffusion Models
Authors: Kin Kwan Leung, Rasa Hosseinzadeh, Gabriel Loaiza-Ganem (Layer 6 AI)
Classification: cs.LG cs.AI stat.ML
Publication Time/Venue: Transactions on Machine Learning Research (10/2025)
Paper Link: https://arxiv.org/abs/2506.20705

Abstract

The manifold hypothesis posits that data of interest (such as image data) in high-dimensional ambient space lies on an unknown low-dimensional submanifold. Diffusion models (DMs) have become the highest-performing generative models by progressively convolving data with increasing Gaussian noise and learning to reverse this process, and are known to be capable of learning distributions with low-dimensional support. For a given data point on these submanifolds, we intuitively expect that DMs have implicitly learned its corresponding local intrinsic dimension (LID), i.e., the dimension of the submanifold to which it belongs. Kamkari et al. (2024b) recently showed this to be the case by connecting LID to the rate of change of the DM's log marginal density with respect to the amount of added noise, yielding an LID estimator called FLIPD. FLIPD achieves state-of-the-art performance in LID estimation, but its theoretical foundation is incomplete, as Kamkari et al. (2024b) proved its correctness only under the highly unrealistic assumption of affine submanifolds. This paper bridges this gap by formally proving the correctness of FLIPD under realistic assumptions. Furthermore, we prove that an analogous result holds when Gaussian convolution is replaced by uniform convolution, and discuss the relevance of this result.

Research Background and Motivation

Problem Definition

The core problem addressed in this paper is to provide a rigorous theoretical foundation for the FLIPD local intrinsic dimension estimator. Specifically:

Theoretical Deficiency: Although FLIPD proposed by Kamkari et al. demonstrates excellent practical performance, its theoretical proof only holds under the unrealistic assumption of affine submanifolds
Practical Requirement: There is a need to prove FLIPD's correctness on general embedded submanifolds, aligning its theoretical foundation with practical applications

Importance Analysis

Local intrinsic dimension (LID) estimation has important applications in machine learning:

Complexity Quantification: Effectively quantifying image complexity
Anomaly Detection: Detecting outliers, adversarial examples, and AI-generated text
Generalization Prediction: LID estimation of neural network representations can predict generalization performance
Memorization Detection: Identifying model memorization phenomena

Limitations of Existing Methods

Existing LID estimators suffer from the following problems:

High Computational Complexity: Relying on pairwise distance calculations, scaling poorly with dataset size and ambient dimension
Curse of Dimensionality: Performance degradation in high-dimensional spaces
Incomplete Theory: Although FLIPD demonstrates superior performance, its theoretical foundation is weak

Core Contributions

Theoretical Refinement: Formally proves FLIPD's correctness under realistic assumptions, extending it from affine submanifolds to general smooth embedded submanifolds
Result Extension: Proves that similar results hold when Gaussian convolution is replaced by uniform convolution
Mathematical Rigor: Provides complete mathematical proofs, including sophisticated differential geometric analysis
Practical Value: Provides theoretical guarantees for FLIPD's reliability in practical applications

Methodology Details

Core Theoretical Result

The core of this paper is proving that the following key equation holds under general conditions:

$\text{LID}(x) = D + \lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta)$

where:

$\varrho_N(x, \delta)$ is the convolution of the data distribution with Gaussian noise of log standard deviation $\delta$
$D$ is the ambient space dimension
$\delta \to -\infty$ corresponds to the limit as noise approaches zero
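
To make the derivative in this equation concrete, the following worked step differentiates the Gaussian kernel directly. It is a sketch that assumes, consistently with the definitions above, that $\varrho_N(x, \delta)$ is the convolution of the data distribution with $\mathcal{N}(0, e^{2\delta} I_D)$. The posterior expectation $\mathbb{E}[\cdot \mid x]$, taken over clean points $X'$ with weights proportional to $p(x') \mathcal{N}_D(x - x'; 0, e^{2\delta} I_D)$, is notation introduced here for illustration and is not taken from the paper.

$\frac{\partial}{\partial \delta} \mathcal{N}_D(x - x'; 0, e^{2\delta} I_D) = \left(-D + \frac{\lVert x - x' \rVert^2}{e^{2\delta}}\right) \mathcal{N}_D(x - x'; 0, e^{2\delta} I_D)$

$\frac{\partial}{\partial \delta} \log \varrho_N(x, \delta) = -D + \frac{\mathbb{E}\left[\lVert x - X' \rVert^2 \mid x\right]}{e^{2\delta}}$

$D + \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta) = \frac{\mathbb{E}\left[\lVert x - X' \rVert^2 \mid x\right]}{e^{2\delta}} \geq 0$

Read this way, the quantity inside the limit is a normalized mean squared distance to the data, and it is nonnegative for every $\delta$ whenever the convolution is exact.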

Main Theorems

Theorem 1 (Gaussian Case): Let $M$ be a smooth $d$-dimensional embedded submanifold in $\mathbb{R}^D$, and $p$ be a probability density function on $M$. For $x \in M$, if $p$ is continuous at $x$, $p(x) > 0$, and $p$ satisfies finite second moment conditions, then:

$\lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta) = d - D$
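
A small numerical check can illustrate Theorem 1. The sketch below is not from the paper: it assumes $p$ is uniform on the unit circle in $\mathbb{R}^2$ (so $d = 1$, $D = 2$), approximates $p$ by equally weighted quadrature nodes, and uses the identity $\frac{\partial}{\partial \delta} \log \mathcal{N}_D(z; 0, e^{2\delta} I_D) = -D + \lVert z \rVert^2 / e^{2\delta}$. The function name `gaussian_lid_estimate` is hypothetical.

```python
import numpy as np

def gaussian_lid_estimate(x, nodes, delta):
    """Return D + d/d(delta) log rho_N(x, delta) for an exact Gaussian convolution.

    `nodes` are equally weighted quadrature points standing in for p on the manifold
    (an assumption of this sketch, not the paper's construction).
    """
    D = x.shape[0]
    sigma2 = np.exp(2.0 * delta)
    sq_dist = np.sum((nodes - x) ** 2, axis=1)
    # Posterior weights over nodes, computed stably by subtracting the max log-weight.
    log_w = -sq_dist / (2.0 * sigma2)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # d/d(delta) log rho_N = -D + E[|x - X'|^2 | x] / e^{2 delta}
    dlog_rho = -D + np.sum(w * sq_dist) / sigma2
    return D + dlog_rho

# Uniform density on the unit circle: d = 1 inside ambient dimension D = 2.
theta = np.linspace(-np.pi, np.pi, 200_001)[:-1]
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
x = np.array([1.0, 0.0])

estimates = {delta: gaussian_lid_estimate(x, circle, delta) for delta in (0.0, -2.0, -4.0)}
for delta, value in estimates.items():
    print(f"delta={delta:+.1f}  estimate={value:.5f}")
```

As $\delta$ decreases, the printed estimate should approach the circle's dimension $d = 1$, in line with the limit $d - D$ stated in the theorem.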

Theorem 2 (Uniform Case): Similar results hold for uniform distribution convolution:

$\lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_U(x, \delta) = d - D$
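
The uniform case can be checked numerically in the same spirit. This sketch assumes that uniform convolution means convolving with the uniform distribution on a ball of radius $e^\delta$ (an assumption inferred from the scaling in the summary, not a quoted definition), and again takes $p$ uniform on the unit circle in $\mathbb{R}^2$. Under that assumption, $\varrho_U(x, \delta)$ is the probability mass within distance $e^\delta$ of $x$ divided by the ball volume. The function names are hypothetical.

```python
import numpy as np

def log_rho_uniform(delta):
    """log rho_U(x, delta) for p uniform on the unit circle in R^2 and x on the circle.

    Assumes the noise is uniform on a disk of radius e^delta (sketch assumption).
    """
    r = np.exp(delta)
    # Points within chord distance r of x span an arc of total angle 4*arcsin(r/2).
    mass = 2.0 * np.arcsin(np.minimum(r / 2.0, 1.0)) / np.pi
    log_disk_area = np.log(np.pi) + 2.0 * delta
    return np.log(mass) - log_disk_area

def uniform_lid_estimate(delta, h=1e-4):
    """D + d/d(delta) log rho_U, with the derivative taken by central differences."""
    D = 2
    slope = (log_rho_uniform(delta + h) - log_rho_uniform(delta - h)) / (2.0 * h)
    return D + slope

for delta in (0.0, -2.0, -4.0):
    print(f"delta={delta:+.1f}  estimate={uniform_lid_estimate(delta):.5f}")
```

The estimate again tends to $d = 1$ as $\delta \to -\infty$, because the mass of a small ball around a point on a $d$-dimensional manifold scales like the radius to the power $d$.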

Proof Strategy

The core idea of the proof exploits the decomposition properties of Gaussian and uniform densities:

Gaussian Case: Utilizing the relationship $N_D(x-x'; 0, \delta) = (2\pi)^{\frac{d-D}{2}} e^{\delta(d-D)} N_d(x-x'; 0, \delta)$
Uniform Case: Utilizing similar decomposition $U_D(x;\mu, \delta) = C_D^U (C_d^U)^{-1} e^{\delta(d-D)} U_d(x;\mu, \delta)$
Limit Analysis: Through refined differential geometric analysis, proving that the limit of the derivative converges to the expected value
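
The role of the Gaussian decomposition can be illustrated with a short worked step. It assumes that the third argument $\delta$ denotes the log standard deviation, and that $N_d$ applied to the $D$-dimensional vector $x - x'$ is shorthand for the $d$-dimensional normalizing constant combined with the same squared norm. Both readings are inferred from the formula above rather than quoted from the paper.

$N_D(x-x'; 0, \delta) = (2\pi)^{-\frac{D}{2}} e^{-D\delta} \exp\left(-\frac{\lVert x - x' \rVert^2}{2 e^{2\delta}}\right), \quad N_d(x-x'; 0, \delta) = (2\pi)^{-\frac{d}{2}} e^{-d\delta} \exp\left(-\frac{\lVert x - x' \rVert^2}{2 e^{2\delta}}\right)$

$\log \varrho_N(x, \delta) = \frac{d-D}{2} \log(2\pi) + \delta(d-D) + \log \int_M N_d(x-x'; 0, \delta)\, p(x')\, dx'$

$\frac{\partial}{\partial \delta} \log \varrho_N(x, \delta) = (d - D) + \frac{\partial}{\partial \delta} \log \int_M N_d(x-x'; 0, \delta)\, p(x')\, dx'$

Under this reading, the theorem reduces to showing that the last term vanishes as $\delta \to -\infty$, which is plausibly where the differential geometric analysis is needed.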

Experimental Setup

This paper is primarily a theoretical work and does not include large-scale experimental validation. The authors focus on:

Mathematical Proofs: Providing rigorous theoretical analysis
Condition Verification: Ensuring that the proposed conditions are reasonable in practical applications
Extensibility Analysis: Extending results from single submanifolds to disjoint unions of submanifolds

Experimental Results

Theoretical Result Verification

The paper extends the main theorems through the following corollaries:

Corollary 1: For disjoint unions of submanifolds $M = \cup_j M_j$, under appropriate separation conditions, the results still hold.

Corollary 2: Similar extensions for the uniform case also hold.
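
The disjoint-union setting can be illustrated with a toy example in which two well-separated components of different dimension share one ambient space. The sketch below is an assumption-laden illustration, not the paper's construction: it takes the unit disk (interior dimension 2) and a distant line segment (dimension 1) in $\mathbb{R}^2$, approximates the mixture by equally weighted grid nodes, and evaluates the Gaussian quantity $D + \frac{\partial}{\partial \delta} \log \varrho_N$ at one interior point of each component. The function name is hypothetical.

```python
import numpy as np

def gaussian_lid_estimate(x, nodes, delta):
    """D + d/d(delta) log rho_N(x, delta), with p approximated by equally weighted nodes."""
    D = x.shape[0]
    sigma2 = np.exp(2.0 * delta)
    sq_dist = np.sum((nodes - x) ** 2, axis=1)
    log_w = -sq_dist / (2.0 * sigma2)
    w = np.exp(log_w - log_w.max())  # stable posterior weights
    w /= w.sum()
    return D + (-D + np.sum(w * sq_dist) / sigma2)

# Component M_1: unit disk centered at the origin (dimension 2).
g = np.arange(-1.0, 1.0 + 1e-9, 0.01)
gx, gy = np.meshgrid(g, g)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)
disk = grid[np.sum(grid ** 2, axis=1) <= 1.0]

# Component M_2: segment from (4, 0) to (6, 0), far from the disk (dimension 1).
t = np.linspace(4.0, 6.0, 20_001)
segment = np.stack([t, np.zeros_like(t)], axis=1)

nodes = np.concatenate([disk, segment])
delta = -3.0
lid_disk = gaussian_lid_estimate(np.array([0.0, 0.0]), nodes, delta)
lid_segment = gaussian_lid_estimate(np.array([5.0, 0.0]), nodes, delta)
print(f"disk center: {lid_disk:.3f}   segment midpoint: {lid_segment:.3f}")
```

Because the components are far apart relative to $e^\delta$, each query point only sees its own component, so the estimate is close to 2 at the disk center and close to 1 at the segment midpoint. The relative mass of the two components does not affect the limit.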

Practical Implications

These theoretical results directly imply:

FLIPD Correctness: When the score function is learned perfectly, $\lim_{\delta \to -\infty} \text{FLIPD}(x; \delta) = \text{LID}(x)$
Negative Value Interpretation: FLIPD producing negative estimates can only be attributed to imperfect score function learning, not theoretical defects
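
The link between the score function and the derivative in the core equation can be sketched in code. For an exact Gaussian convolution with variance $e^{2\delta}$, the heat equation gives $\frac{\partial}{\partial \delta} \log \varrho_N = e^{2\delta} \left( \mathrm{tr}\, \nabla s + \lVert s \rVert^2 \right)$ with $s = \nabla_x \log \varrho_N$. The example below assumes this variance-exploding parametrization, which may differ from the one used by Kamkari et al. (2024b), and replaces a trained diffusion model with a closed-form score for a toy case: data distributed as a standard Gaussian on a $d$-dimensional linear subspace of $\mathbb{R}^D$. This toy case is affine, so it only illustrates the formula and does not exercise the general theorem. All function names are hypothetical.

```python
import numpy as np

def flipd_from_score(score, score_jac_trace, x, delta):
    """D + e^{2 delta} * (tr(grad s) + |s|^2), assuming an exact Gaussian convolution."""
    D = x.shape[0]
    sigma2 = np.exp(2.0 * delta)
    s = score(x, delta)
    return D + sigma2 * (score_jac_trace(x, delta) + s @ s)

# Toy data: N(0, I_d) on the first d coordinates of R^D, zero elsewhere.
d, D = 2, 5

def marginal_variances(delta):
    # Noised marginal is N(0, diag(1 + sigma^2, ..., sigma^2, ...)).
    sigma2 = np.exp(2.0 * delta)
    return np.concatenate([np.full(d, 1.0 + sigma2), np.full(D - d, sigma2)])

def score(x, delta):
    # Closed-form score of the noised marginal (stands in for a learned score network).
    return -x / marginal_variances(delta)

def score_jac_trace(x, delta):
    # Trace of the score Jacobian, which is constant in x for a Gaussian.
    return -np.sum(1.0 / marginal_variances(delta))

x = np.array([0.3, -1.2, 0.0, 0.0, 0.0])  # a point on the subspace
for delta in (0.0, -2.0, -5.0):
    print(f"delta={delta:+.1f}  FLIPD={flipd_from_score(score, score_jac_trace, x, delta):.5f}")
```

With the exact score, the value approaches $d = 2$ as $\delta \to -\infty$. In practice `score` and `score_jac_trace` would come from a trained model, and their approximation error is what can push the estimate below zero.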

Classification of LID Estimation Methods

Traditional Methods: Statistical estimators based on pairwise distances or angles (Fukunaga & Olsen, 1971; Levina & Bickel, 2004, etc.)
Generative Model Methods:
- Variational autoencoder approaches (Zheng et al., 2022)
- Normalizing flow approaches (Tempczyk et al., 2022)
- Diffusion model approaches (Stanczuk et al., 2024; Horvat & Pfister, 2024)

Comparison with FLIPD

Stanczuk et al. Method: Also based on diffusion models but requires more function evaluations
Horvat & Pfister Method: Requires modifying the DM training process
FLIPD Advantages: Compatible with off-the-shelf state-of-the-art DMs (e.g., Stable Diffusion)

Conclusions and Discussion

Main Conclusions

Theoretical Refinement: Successfully extends FLIPD's theoretical foundation from affine submanifolds to general smooth embedded submanifolds
Method Generality: Proves similar results for both Gaussian and uniform convolution cases
Practical Value: Provides mathematical guarantees for FLIPD's reliability in practical applications

Limitations

Perfect Score Function Assumption: Theoretical results assume perfect score function learning; approximation errors exist in practice
Condition Restrictions: Requires satisfaction of continuity and finite second moment conditions
Single-Submanifold Setting: The main theorems are stated for a single embedded submanifold; disjoint unions are covered only by the corollaries, under appropriate separation conditions

Future Directions

Error Analysis: Quantifying the impact of score function learning errors on LID estimation
Flow Matching Extension: Extending results to flow matching methods
Distribution Extension: Investigating similar results under other noise distributions

In-Depth Evaluation

Strengths

Theoretical Rigor: Provides complete mathematical proofs using advanced differential geometric tools
Practical Value: Provides theoretical foundation for an already high-performing method
Complete Results: Proves not only the Gaussian case but also extends to the uniform distribution case
Clear Presentation: Complex mathematical content is well-organized and easy to understand

Weaknesses

Lack of Experimental Validation: As theoretical work, lacks experimental verification of theoretical predictions
Condition Restrictions: Some assumed conditions may not be fully satisfied in practical applications
Insufficient Error Analysis: Lacks in-depth analysis of error sources in practical applications

Impact

Academic Contribution: Provides important theoretical foundation for the intersection of generative models and manifold learning
Practical Value: Enhances the credibility of FLIPD in practical applications
Inspirational Value: Provides theoretical framework for other geometric analysis methods based on generative models

Applicable Scenarios

These theoretical results apply to:

High-Dimensional Data Analysis: Particularly for data following the manifold hypothesis
Anomaly Detection: Using LID for outlier detection
Generative Model Evaluation: Assessing the capability of generative models to learn data manifolds
Neural Network Analysis: Analyzing geometric properties of network representations

References

The paper cites extensive related work, including:

Kamkari et al. (2024b): Original work proposing FLIPD
Classical LID estimation methods: Levina & Bickel (2004), Facco et al. (2017), etc.
Diffusion model theory: Song et al. (2021), De Bortoli (2022), etc.
Manifold learning related: Lee (2012, 2018) and other differential geometry textbooks

Summary: This is a high-quality theoretical paper that provides a rigorous mathematical foundation for FLIPD, an important practical method. Although it lacks experimental validation, its theoretical contributions are valuable for understanding the relationship between generative models and manifold geometry.