On Convolutions, Intrinsic Dimension, and Diffusion Models

Leung, Hosseinzadeh, Loaiza-Ganem
The manifold hypothesis asserts that data of interest in high-dimensional ambient spaces, such as image data, lies on unknown low-dimensional submanifolds. Diffusion models (DMs) -- which operate by convolving data with progressively larger amounts of Gaussian noise and then learning to revert this process -- have risen to prominence as the most performant generative models, and are known to be able to learn distributions with low-dimensional support. For a given datum in one of these submanifolds, we should thus intuitively expect DMs to have implicitly learned its corresponding local intrinsic dimension (LID), i.e. the dimension of the submanifold it belongs to. Kamkari et al. (2024b) recently showed that this is indeed the case by linking this LID to the rate of change of the log marginal densities of the DM with respect to the amount of added noise, resulting in an LID estimator known as FLIPD. LID estimators such as FLIPD have a plethora of uses, among others they quantify the complexity of a given datum, and can be used to detect outliers, adversarial examples and AI-generated text. FLIPD achieves state-of-the-art performance at LID estimation, yet its theoretical underpinnings are incomplete since Kamkari et al. (2024b) only proved its correctness under the highly unrealistic assumption of affine submanifolds. In this work we bridge this gap by formally proving the correctness of FLIPD under realistic assumptions. Additionally, we show that an analogous result holds when Gaussian convolutions are replaced with uniform ones, and discuss the relevance of this result.

Basic Information

  • Paper ID: 2506.20705
  • Title: On Convolutions, Intrinsic Dimension, and Diffusion Models
  • Authors: Kin Kwan Leung, Rasa Hosseinzadeh, Gabriel Loaiza-Ganem (Layer 6 AI)
  • Classification: cs.LG cs.AI stat.ML
  • Publication Time/Venue: Transactions on Machine Learning Research (10/2025)
  • Paper Link: https://arxiv.org/abs/2506.20705

Abstract

The manifold hypothesis posits that data of interest (such as image data) in a high-dimensional ambient space lies on an unknown low-dimensional submanifold. Diffusion models (DMs) have become the highest-performing generative models by progressively convolving data with increasing Gaussian noise and learning to reverse this process, and are known to be capable of learning distributions with low-dimensional support. For a given data point on these submanifolds, we intuitively expect that DMs have implicitly learned its corresponding local intrinsic dimension (LID), i.e., the dimensionality of the submanifold to which it belongs. Kamkari et al. (2024b) recently showed this to be the case by connecting LID to the rate of change of the DM's log marginal density with respect to the amount of added noise, yielding an LID estimator called FLIPD. FLIPD achieves state-of-the-art performance in LID estimation, but its theoretical foundation is incomplete, as Kamkari et al. (2024b) proved its correctness only under the highly unrealistic assumption of affine submanifolds. This paper bridges this gap by formally proving the correctness of FLIPD under realistic assumptions. Furthermore, we prove that an analogous result holds when Gaussian convolution is replaced by uniform convolution, and discuss the relevance of this result.

Research Background and Motivation

Problem Definition

The core problem addressed in this paper is to provide a rigorous theoretical foundation for FLIPD, the diffusion-model-based estimator of local intrinsic dimension (LID) introduced by Kamkari et al. (2024b). Specifically:

  1. Theoretical Deficiency: Although FLIPD proposed by Kamkari et al. demonstrates excellent practical performance, its theoretical proof only holds under the unrealistic assumption of affine submanifolds
  2. Practical Requirement: There is a need to prove FLIPD's correctness on general embedded submanifolds, aligning its theoretical foundation with practical applications

Importance Analysis

Local intrinsic dimension (LID) estimation has important applications in machine learning:

  • Complexity Quantification: Effectively quantifying image complexity
  • Anomaly Detection: Detecting outliers, adversarial examples, and AI-generated text
  • Generalization Prediction: LID estimation of neural network representations can predict generalization performance
  • Memorization Detection: Identifying model memorization phenomena

Limitations of Existing Methods

Traditional LID estimators suffer from the following problems:

  1. High Computational Complexity: Relying on pairwise distance calculations, scaling poorly with dataset size and ambient dimension
  2. Curse of Dimensionality: Performance degradation in high-dimensional spaces
  3. Incomplete Theory: Although FLIPD demonstrates superior performance, its theoretical foundation is weak
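
To make the pairwise-distance cost in item 1 concrete, here is a minimal sketch of one classical estimator, the nearest-neighbour maximum-likelihood estimator of Levina & Bickel (2004). The function name `mle_lid`, the choice `k=20`, and the toy circle data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def mle_lid(points, k=20):
    """Levina-Bickel MLE of intrinsic dimension, one estimate per point."""
    # All pairwise Euclidean distances: O(n^2) time and memory in the dataset size.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero distance from a point to itself.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    # log(T_k / T_j) for j = 1..k-1, where T_j is the j-th neighbour distance.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    return (k - 1) / log_ratios.sum(axis=1)

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=1000)
# Toy data: unit circle (d = 1) embedded in R^3 (D = 3).
circle = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
mean_lid = mle_lid(circle, k=20).mean()
print(f"mean MLE estimate: {mean_lid:.2f}")
```

The full distance matrix is what makes this approach scale poorly; the mean estimate should land near the true dimension of 1, with a small upward bias for finite `k`.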

Core Contributions

  1. Theoretical Refinement: Formally proves FLIPD's correctness under realistic assumptions, extending it from affine submanifolds to general smooth embedded submanifolds
  2. Result Extension: Proves that similar results hold when Gaussian convolution is replaced by uniform convolution
  3. Mathematical Rigor: Provides complete mathematical proofs, including sophisticated differential geometric analysis
  4. Practical Value: Provides theoretical guarantees for FLIPD's reliability in practical applications

Methodology Details

Core Theoretical Result

The core of this paper is proving that the following key equation holds under general conditions:

$$\text{LID}(x) = D + \lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta)$$

where:

  • $\varrho_N(x, \delta)$ is the density obtained by convolving the data distribution with Gaussian noise of log standard deviation $\delta$ (i.e., standard deviation $e^{\delta}$)
  • $D$ is the ambient space dimension
  • $\delta \to -\infty$ corresponds to the limit as the noise approaches zero
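
The key equation can be checked numerically on a toy case. The sketch below assumes data uniformly distributed on the unit circle in the plane (so the true LID is 1 and the ambient dimension is 2), evaluates the log of the Gaussian-convolved density by quadrature over the circle, and approximates the derivative in the log standard deviation by a central difference. The function names, grid size, and step `h` are illustrative choices, not the paper's procedure.

```python
import numpy as np

def log_rho_gaussian(x, delta, n_grid=200_000):
    """log of (uniform distribution on the unit circle) * N(0, e^{2 delta} I_2), evaluated at x."""
    theta = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    sq_dist = np.sum((pts - x) ** 2, axis=1)
    sigma2 = np.exp(2.0 * delta)
    # Log of the 2-dimensional isotropic Gaussian kernel at each circle point.
    log_kernel = -sq_dist / (2.0 * sigma2) - np.log(2.0 * np.pi * sigma2)
    # Averaging over an even grid in theta integrates against the uniform density 1/(2 pi).
    m = log_kernel.max()
    return m + np.log(np.mean(np.exp(log_kernel - m)))

def lid_estimate(x, delta, h=1e-3, D=2):
    """D plus a central-difference approximation of d/d(delta) log rho_N(x, delta)."""
    slope = (log_rho_gaussian(x, delta + h) - log_rho_gaussian(x, delta - h)) / (2.0 * h)
    return D + slope

lid = lid_estimate(np.array([1.0, 0.0]), delta=-3.0)
print(f"LID estimate at delta = -3: {lid:.4f}")
```

For small noise the estimate should be close to 1, the dimension of the circle, while for larger `delta` the curvature of the circle and the finite noise scale pull it away from the limit.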

Main Theorems

Theorem 1 (Gaussian Case): Let $M$ be a smooth $d$-dimensional embedded submanifold of $\mathbb{R}^D$, and let $p$ be a probability density function on $M$. For $x \in M$, if $p$ is continuous at $x$, $p(x) > 0$, and $p$ satisfies a finite second moment condition, then:

$$\lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_N(x, \delta) = d - D$$

Theorem 2 (Uniform Case): An analogous result holds when the Gaussian is replaced by a uniform distribution in the convolution:

$$\lim_{\delta \to -\infty} \frac{\partial}{\partial \delta} \log \varrho_U(x, \delta) = d - D$$
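
As a worked illustration of the uniform case, consider data uniformly distributed on the unit circle in $\mathbb{R}^2$ (so $d = 1$, $D = 2$, $p = 1/(2\pi)$). The sketch assumes that the uniform noise at log scale $\delta$ is uniform on the ball of radius $r = e^{\delta}$; this parametrization is an assumption made for the example and may differ from the paper's exact convention. For $x$ on the circle and $r < 2$, the arc of the circle within distance $r$ of $x$ has length $4 \arcsin(r/2)$, so:

```latex
\begin{align*}
\varrho_U(x, \delta)
  &= \frac{1}{2\pi} \cdot \frac{4 \arcsin(r/2)}{\pi r^2},
  \qquad r = e^{\delta}, \\
\frac{\partial}{\partial \delta} \log \varrho_U(x, \delta)
  &= \frac{r/2}{\sqrt{1 - r^2/4}\; \arcsin(r/2)} - 2
  \;\xrightarrow[\delta \to -\infty]{}\; 1 - 2 = -1 = d - D .
\end{align*}
```

The first term tends to 1 because $\arcsin(r/2) \approx r/2$ for small $r$, which recovers the value $d - D$ stated in Theorem 2 for this example.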

Proof Strategy

The core idea of the proof exploits the decomposition properties of Gaussian and uniform densities:

  1. Gaussian Case: Utilizing the relationship $N_D(x - x'; 0, \delta) = (2\pi)^{\frac{d-D}{2}} e^{\delta(d-D)} N_d(x - x'; 0, \delta)$
  2. Uniform Case: Utilizing the analogous decomposition $U_D(x; \mu, \delta) = C_D^U (C_d^U)^{-1} e^{\delta(d-D)} U_d(x; \mu, \delta)$
  3. Limit Analysis: Through refined differential geometric analysis, proving that the limit of the derivative converges to the expected value
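
The role of the Gaussian decomposition can be sketched in a few lines. The display below is a heuristic outline, not the paper's full argument; it assumes that $\varrho_N(x, \delta)$ is the integral of $p$ against the Gaussian kernel over $M$, and the symbol $I_d$ is introduced here only for illustration.

```latex
\begin{align*}
\varrho_N(x, \delta)
  &= \int_M p(x')\, N_D(x - x'; 0, \delta)\, \mathrm{d}x' \\
  &= (2\pi)^{\frac{d-D}{2}} e^{\delta(d-D)}
     \underbrace{\int_M p(x')\, N_d(x - x'; 0, \delta)\, \mathrm{d}x'}_{=:\, I_d(x, \delta)}, \\
\frac{\partial}{\partial \delta} \log \varrho_N(x, \delta)
  &= (d - D) + \frac{\partial}{\partial \delta} \log I_d(x, \delta).
\end{align*}
```

Intuitively, $I_d(x, \delta)$ behaves like a $d$-dimensional smoothing of $p$ along $M$ and approaches $p(x) > 0$ as $\delta \to -\infty$. The substantive part of the proof (step 3) is showing that the derivative of $\log I_d$ vanishes in this limit for a curved submanifold, which leaves exactly $d - D$.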

Experimental Setup

This paper is primarily theoretical work without large-scale experimental validation. The authors focus on:

  1. Mathematical Proofs: Providing rigorous theoretical analysis
  2. Condition Verification: Ensuring that the proposed conditions are reasonable in practical applications
  3. Extensibility Analysis: Extending results from single submanifolds to disjoint unions of submanifolds

Experimental Results

Theoretical Result Verification

The paper extends the two main theorems beyond a single submanifold through the following corollaries:

Corollary 1: For disjoint unions of submanifolds $M = \cup_j M_j$, under appropriate separation conditions, the results still hold.

Corollary 2: Similar extensions for the uniform case also hold.
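
The behaviour described by these corollaries can be illustrated on a toy mixture whose components have different dimensions. The sketch below assumes an equal-weight mixture of a standard Gaussian on a line and a standard Gaussian on a parallel-offset plane in three-dimensional space; both pieces are affine, so this only illustrates the per-component limit and does not exercise the curved case the paper handles. All names and constants are illustrative.

```python
import numpy as np

D = 3  # ambient dimension

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance `var`."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def log_rho(x, delta):
    """Log density of the mixture after convolution with N(0, e^{2 delta} I_3)."""
    s2 = np.exp(2.0 * delta)
    # Component A: standard Gaussian on the line {(t, 0, 0)}, dimension 1.
    log_a = log_gauss_diag(x, np.zeros(3), np.array([1.0 + s2, s2, s2]))
    # Component B: standard Gaussian on the plane {(u, v, 10)}, dimension 2.
    log_b = log_gauss_diag(x, np.array([0.0, 0.0, 10.0]),
                           np.array([1.0 + s2, 1.0 + s2, s2]))
    return np.logaddexp(np.log(0.5) + log_a, np.log(0.5) + log_b)

def lid_estimate(x, delta, h=1e-3):
    """D plus a central-difference approximation of d/d(delta) log rho(x, delta)."""
    slope = (log_rho(x, delta + h) - log_rho(x, delta - h)) / (2.0 * h)
    return D + slope

lid_line = lid_estimate(np.array([0.5, 0.0, 0.0]), delta=-4.0)
lid_plane = lid_estimate(np.array([0.2, -0.4, 10.0]), delta=-4.0)
print(f"on the line: {lid_line:.3f}, on the plane: {lid_plane:.3f}")
```

Because the two components are well separated, the far component contributes negligibly at small noise, and the estimate at each point should approach the dimension of the component that contains it (about 1 on the line and about 2 on the plane).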

Practical Implications

These theoretical results directly imply:

  1. FLIPD Correctness: When the score function is learned perfectly, $\lim_{\delta \to -\infty} \text{FLIPD}(x; \delta) = \text{LID}(x)$
  2. Negative Value Interpretation: FLIPD producing negative estimates can only be attributed to imperfect score function learning, not theoretical defects
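
To show how a score function enters, note that convolution with $N(0, \sigma^2 I)$ solves the heat equation in $\sigma^2$, so with $\sigma = e^{\delta}$ and $s = \nabla_x \log \varrho_N$ one gets $\frac{\partial}{\partial \delta} \log \varrho_N = \sigma^2 \left( \operatorname{tr} \nabla s + \lVert s \rVert^2 \right)$. The sketch below evaluates $D$ plus this quantity using an exact score for a standard Gaussian supported on a coordinate subspace. It assumes this plain noise parametrization, which may differ from the time and scaling conventions of a trained DM, and the function names are hypothetical rather than the FLIPD reference implementation.

```python
import numpy as np

def flipd_from_score(score_fn, x, delta, eps=1e-4):
    """D + sigma^2 * (div s(x) + ||s(x)||^2); the divergence uses central differences."""
    D = x.shape[0]
    sigma2 = np.exp(2.0 * delta)
    s = score_fn(x, delta)
    div = 0.0
    for i in range(D):
        e = np.zeros(D)
        e[i] = eps
        div += (score_fn(x + e, delta)[i] - score_fn(x - e, delta)[i]) / (2.0 * eps)
    return D + sigma2 * (div + s @ s)

def make_affine_score(d, D):
    """Exact score of N(0, I_d) on the first d coordinates of R^D, convolved with N(0, e^{2 delta} I_D)."""
    def score_fn(x, delta):
        sigma2 = np.exp(2.0 * delta)
        var = np.concatenate([np.full(d, 1.0 + sigma2), np.full(D - d, sigma2)])
        return -x / var
    return score_fn

score_fn = make_affine_score(d=2, D=5)
x = np.array([0.3, -0.2, 0.0, 0.0, 0.0])  # a point on the 2-dimensional subspace
estimate = flipd_from_score(score_fn, x, delta=-4.0)
print(f"estimate at delta = -4: {estimate:.4f}")
```

With an exact score the estimate should be close to 2 at small noise. Replacing `score_fn` with an imperfect learned score is where deviations such as negative estimates could enter.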

Classification of LID Estimation Methods

  1. Traditional Methods: Statistical estimators based on pairwise distances or angles (Fukunaga & Olsen, 1971; Levina & Bickel, 2004, etc.)
  2. Generative Model Methods:
    • Variational autoencoder approaches (Zheng et al., 2022)
    • Normalizing flow approaches (Tempczyk et al., 2022)
    • Diffusion model approaches (Stanczuk et al., 2024; Horvat & Pfister, 2024)

Comparison with FLIPD

  • Stanczuk et al. Method: Also based on diffusion models but requires more function evaluations
  • Horvat & Pfister Method: Requires modifying the DM training process
  • FLIPD Advantages: Compatible with off-the-shelf state-of-the-art DMs (e.g., Stable Diffusion)

Conclusions and Discussion

Main Conclusions

  1. Theoretical Refinement: Successfully extends FLIPD's theoretical foundation from affine submanifolds to general smooth embedded submanifolds
  2. Method Generality: Proves similar results for both Gaussian and uniform convolution cases
  3. Practical Value: Provides mathematical guarantees for FLIPD's reliability in practical applications

Limitations

  1. Perfect Score Function Assumption: Theoretical results assume perfect score function learning; approximation errors exist in practice
  2. Condition Restrictions: Requires satisfaction of continuity and finite second moment conditions
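  3. Single-Submanifold Statement: The main theorems are stated for a single smooth embedded submanifold; disjoint unions of submanifolds are covered only through the corollaries, under separation conditions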

Future Directions

  1. Error Analysis: Quantifying the impact of score function learning errors on LID estimation
  2. Flow Matching Extension: Extending results to flow matching methods
  3. Distribution Extension: Investigating similar results under other noise distributions

In-Depth Evaluation

Strengths

  1. Theoretical Rigor: Provides complete mathematical proofs using advanced differential geometric tools
  2. Practical Value: Provides theoretical foundation for an already high-performing method
  3. Complete Results: Proves not only the Gaussian case but also extends to the uniform distribution case
  4. Clear Presentation: Complex mathematical content is well-organized and easy to understand

Weaknesses

  1. Lack of Experimental Validation: As theoretical work, lacks experimental verification of theoretical predictions
  2. Condition Restrictions: Some assumed conditions may not be fully satisfied in practical applications
  3. Insufficient Error Analysis: Lacks in-depth analysis of error sources in practical applications

Impact

  1. Academic Contribution: Provides important theoretical foundation for the intersection of generative models and manifold learning
  2. Practical Value: Enhances the credibility of FLIPD in practical applications
  3. Inspirational Value: Provides theoretical framework for other geometric analysis methods based on generative models

Applicable Scenarios

These theoretical results apply to:

  1. High-Dimensional Data Analysis: Particularly for data following the manifold hypothesis
  2. Anomaly Detection: Using LID for outlier detection
  3. Generative Model Evaluation: Assessing the capability of generative models to learn data manifolds
  4. Neural Network Analysis: Analyzing geometric properties of network representations

References

The paper cites extensive related work, including:

  • Kamkari et al. (2024b): Original work proposing FLIPD
  • Classical LID estimation methods: Levina & Bickel (2004), Facco et al. (2017), etc.
  • Diffusion model theory: Song et al. (2021), De Bortoli (2022), etc.
  • Manifold learning related: Lee (2012, 2018) and other differential geometry textbooks

Summary: This is a high-quality theoretical paper that provides rigorous mathematical foundation for the important practical method FLIPD. Although lacking experimental validation, its theoretical contributions are valuable for understanding the relationship between generative models and manifold geometry.