2025-11-25T02:22:17.580847

Optimal Bounds for Tyler's M-Estimator for Elliptical Distributions

Lau, Ramachandran
A fundamental problem in statistics is estimating the shape matrix of an Elliptical distribution. This generalizes the familiar problem of Gaussian covariance estimation, for which the sample covariance achieves optimal estimation error. For Elliptical distributions, Tyler proposed a natural M-estimator and showed strong statistical properties in the asymptotic regime, independent of the underlying distribution. Numerical experiments show that this estimator performs very well, and that Tyler's iterative procedure converges quickly to the estimator. Franks and Moitra recently provided the first distribution-free error bounds in the finite sample setting, as well as the first rigorous convergence analysis of Tyler's iterative procedure. However, their results exceed the sample complexity of the Gaussian setting by a $\log^{2} d$ factor. We close this gap by proving optimal sample threshold and error bounds for Tyler's M-estimator for all Elliptical distributions, fully matching the Gaussian result. Moreover, we recover the algorithmic convergence even at this lower sample threshold. Our approach builds on the operator scaling connection of Franks and Moitra by introducing a novel pseudorandom condition, which we call $\infty$-expansion. We show that Elliptical distributions satisfy $\infty$-expansion at the optimal sample threshold, and then prove a novel scaling result for inputs satisfying this condition.
academic

Optimal Bounds for Tyler's M-Estimator for Elliptical Distributions

Basic Information

  • Paper ID: 2510.13751
  • Title: Optimal Bounds for Tyler's M-Estimator for Elliptical Distributions
  • Authors: Lap Chi Lau (University of Waterloo), Akshay Ramachandran (University of British Columbia)
  • Classification: math.ST cs.LG stat.TH
  • Publication Date: May 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.13751

Abstract

Shape matrix estimation for elliptical distributions is a fundamental problem in statistics that generalizes Gaussian covariance estimation. Tyler proposed a natural M-estimator and proved strong statistical properties in the asymptotic regime. Franks and Moitra recently provided the first distribution-free finite-sample error bounds, but their results incur an additional log2d\log^2 d factor in sample complexity compared to the Gaussian case. This paper establishes optimal sample thresholds and error bounds for Tyler's M-estimator by introducing a novel pseudo-random condition called \infty-expansion, which completely matches Gaussian results and recovers algorithmic convergence at lower sample thresholds.

Research Background and Motivation

Problem Background

  1. Core Problem: Estimating the shape matrix of elliptical distributions, an important generalization of high-dimensional covariance estimation
  2. Practical Significance:
    • Elliptical distributions encompass important special cases including multivariate Gaussian and t-distributions
    • For heavy-tailed distributions, the covariance matrix may not exist, but the shape matrix still captures geometric properties
    • Widely applicable in finance, signal processing, and related fields

Limitations of Existing Methods

  1. Limitations of Sample Covariance: Poor performance on heavy-tailed distributions, may not even exist
  2. Theoretical Deficiencies of Tyler's Estimator:
    • Tyler (1987) only provided asymptotic guarantees
    • Franks and Moitra (2020) finite-sample bounds contain an extra log2d\log^2 d factor
    • Sample complexity of ndlog2dn \gtrsim d\log^2 d exceeds the optimal ndn \gtrsim d for the Gaussian case

Research Motivation

This paper addresses the question: Can Tyler's estimator achieve the same optimal guarantees on elliptical distributions as Gaussian covariance estimation, or is shape estimation inherently more difficult?

Core Contributions

  1. Optimal Sample Complexity: Proves that Tyler's M-estimator achieves relative operator norm error ε\varepsilon with sample size ndε2n \gtrsim \frac{d}{\varepsilon^2}
  2. Optimal Error Bounds: Completely matches Gaussian lower bounds, establishing tightness of results
  3. Algorithmic Convergence: Recovers linear convergence of Tyler's iterative process at optimal sample threshold ndn \gtrsim d
  4. Novel Theoretical Tools: Introduces \infty-expansion condition, providing stronger analysis tools for frame scaling
  5. Technical Innovation: Improves two key components in the Franks-Moitra method, eliminating the logd\log d factor

Methodology Details

Task Definition

Input: nn samples x1,,xnRdx_1, \ldots, x_n \in \mathbb{R}^d from elliptical distribution E(Σ,u)E(\Sigma, u)Output: Estimate Σ^\hat{\Sigma} of shape matrix Σ\SigmaObjective: Minimize relative operator norm error IdΣ1/2Σ^1Σ1/2op\|I_d - \Sigma^{1/2}\hat{\Sigma}^{-1}\Sigma^{1/2}\|_{op}

Elliptical Distributions and Tyler's Estimator

Elliptical Distribution Definition: X:=Σ1/2VuX := \Sigma^{1/2}V \cdot u where VSd1V \sim S^{d-1} is a uniformly random unit vector, and uRu \in \mathbb{R} is an independent scalar random variable.

Tyler's M-Estimator: The unique solution Σ^\hat{\Sigma} satisfying: dnj=1nxjxjTxjTΣ^1xj=Σ^,Tr[Σ^]=d\frac{d}{n}\sum_{j=1}^n \frac{x_jx_j^T}{x_j^T\hat{\Sigma}^{-1}x_j} = \hat{\Sigma}, \quad \text{Tr}[\hat{\Sigma}] = d

Core Technical Framework

1. Frame Scaling Connection

Tyler's estimator is equivalent to the frame scaling problem:

  • Frame: V={v1,,vn}Rd×nV = \{v_1, \ldots, v_n\} \in \mathbb{R}^{d \times n}
  • Objective: Find left and right scalings LRd×dL \in \mathbb{R}^{d \times d} and Rdiag(n)R \in \text{diag}(n) such that V=LVRV' = LVR satisfies:
    • Equi-isometry: VVT=s(V)dIdV'V'^T = \frac{s(V')}{d}I_d
    • Equi-norm: vj22=s(V)n\|v'_j\|_2^2 = \frac{s(V')}{n}

2. ∞-Expansion Condition

Definition: Frame VV satisfies (1λ)(1-\lambda)-\infty-expansion if: y1n,y1:j=1nyjvjvjTops(V)(1λ)d\forall y \perp \mathbf{1}_n, \|y\|_\infty \leq 1: \left\|\sum_{j=1}^n y_j v_j v_j^T\right\|_{op} \leq \frac{s(V)(1-\lambda)}{d}

This is a stronger condition than quantum expansion, with key improvements:

  • Constraint strengthened from y21\|y\|_2 \leq 1 to y1\|y\|_\infty \leq 1
  • Output changed from Frobenius norm to operator norm

3. Pseudo-Random Condition

Definition: Frame VV is (αmin,αmax,β)(\alpha_{\min}, \alpha_{\max}, \beta)-pseudo-random if: B=βn:βαmindIdVBVBTβαmaxdId\forall |B| = \beta n: \beta\frac{\alpha_{\min}}{d}I_d \preceq V_BV_B^T \preceq \beta\frac{\alpha_{\max}}{d}I_d

Main Theoretical Results

Theorem 1.1 (Sample Complexity): When ndε2n \gtrsim \frac{d}{\varepsilon^2} and ε\varepsilon is a small constant, Tyler's M-estimator satisfies: IdΣ1/2Σ^1Σ1/2opε\|I_d - \Sigma^{1/2}\hat{\Sigma}^{-1}\Sigma^{1/2}\|_{op} \leq \varepsilon with probability at least 1exp(Ω(ε2n))1 - \exp(-\Omega(\varepsilon^2 n)).

Theorem 1.2 (Algorithmic Convergence): When ndn \gtrsim d, the TT-th iterate Σ(T)\Sigma^{(T)} of Tyler's iterative process satisfies: IdΣ^1/2Σ(T),1Σ^1/2Fδ\|I_d - \hat{\Sigma}^{1/2}\Sigma^{(T),-1}\hat{\Sigma}^{1/2}\|_F \leq \delta within TlogdetΣ+d+log(1/δ)T \lesssim |\log \det \Sigma| + d + \log(1/\delta) iterations.

Technical Innovation Points

1. ∞-Expansion vs Quantum Expansion

  • Quantum Expansion (Franks-Moitra): Requires y21\|y\|_2 \leq 1, outputs Frobenius norm bounds
  • ∞-Expansion (this work): Requires y1\|y\|_\infty \leq 1, outputs operator norm bounds
  • Advantage: Stronger conditions yield tighter analysis, eliminating the logd\log d factor

2. Improved Frame Scaling Analysis

Theorem 2.12: If frame VV is ε\varepsilon-doubly balanced and satisfies (1λ)(1-\lambda)-\infty-expansion, when λ2ε\lambda^2 \gtrsim \varepsilon: LIdopελ\|L - I_d\|_{op} \lesssim \frac{\varepsilon}{\lambda}

This improves upon Kwok et al.'s result by a logd\log d factor.

3. ∞-Expansion for Random Frames

Theorem 2.13: For v1,,vnSd1v_1, \ldots, v_n \sim S^{d-1}, when ndn \gtrsim d, frame VV satisfies (1λ)(1-\lambda)-\infty-expansion with probability 1exp(Ω(n))\geq 1-\exp(-\Omega(n)), where λΩ(1)\lambda \geq \Omega(1).

Experimental Setup

This is primarily a theoretical work without extensive numerical experiments. The authors mention that Tyler's estimator and iterative process perform well in numerical experiments, but the focus is on the rigor of theoretical analysis.

Experimental Results

Theoretical Result Verification

  1. Optimality: Sample complexity ndε2n \gtrsim \frac{d}{\varepsilon^2} matches Gaussian lower bounds
  2. Tightness: Relative operator norm error bounds are tight
  3. Algorithmic Efficiency: Iteration complexity O(logdetΣ+d+log(1/δ))O(|\log \det \Sigma| + d + \log(1/\delta)) is optimal

Quantification of Technical Improvements

  • Sample Complexity: Improved from ndlog2dn \gtrsim d\log^2 d to ndn \gtrsim d
  • Error Bounds: Eliminated the logd\log d factor
  • Algorithmic Convergence: Maintains linear convergence at lower sample thresholds

Elliptical Distribution Estimation

  1. Tyler (1987): Proposed M-estimator, proved asymptotic properties
  2. Soloveychik & Wiesel (2014): Optimal error under Frobenius norm, but condition-number dependent
  3. Regularization Methods: Computationally efficient but lack theoretical guarantees

Frame Scaling Theory

  1. Gurvits et al. (2019): Polynomial-time algorithm for operator scaling
  2. Kwok et al. (2021): Scaling bounds under quantum expansion
  3. Paulsen Problem: Classical problem in frame theory

Technical Connections

This work builds upon Franks-Moitra's operator scaling connection but achieves key improvements through the introduction of the stronger \infty-expansion condition.

Conclusions and Discussion

Main Conclusions

  1. Theoretical Completeness: First proof that Tyler's M-estimator achieves information-theoretically optimal bounds on elliptical distributions
  2. Method Uniformity: Shape estimation for elliptical distributions has the same sample complexity as Gaussian covariance estimation
  3. Algorithmic Practicality: Tyler's iterative process converges rapidly at optimal sample thresholds

Technical Contributions

  • \infty-expansion provides new analytical tools for frame scaling
  • Proof techniques may apply to related problems (Paulsen problem, tensor normal models)

Future Directions

  1. Paulsen Problem: Use similar techniques to prove optimal distance bounds ε\varepsilon
  2. Tensor Normal Models: Extend to covariance estimation for higher-order tensors
  3. Computational Complexity: Study exact computational complexity of Tyler's iteration

In-Depth Evaluation

Strengths

  1. Theoretical Rigor: Completely resolves a long-standing open problem with tight optimal bounds
  2. Technical Innovation: Introduction of \infty-expansion condition is a key insight
  3. Methodological Completeness: Simultaneously addresses both sample complexity and algorithmic convergence
  4. Clarity of Presentation: Clear technical roadmap with well-structured proofs

Limitations

  1. Missing Experimental Validation: Lacks numerical experiments to verify theoretical predictions
  2. Constant Factors: Constants in theoretical bounds may not be sufficiently tight
  3. Limited Scope: Restricted to elliptical distributions; extension to more general heavy-tailed distributions unclear

Impact Assessment

  1. Theoretical Significance: Resolves an important open problem in statistical learning theory
  2. Practical Value: Provides theoretical foundation for covariance estimation on heavy-tailed data
  3. Methodological Value: \infty-expansion technique may have broader applications

Applicable Scenarios

  1. Financial Data Analysis: Portfolio optimization with heavy-tailed distributions
  2. Signal Processing: Robust covariance estimation
  3. Machine Learning: Learning geometric structure of high-dimensional data

References

This work builds primarily on the following key contributions:

  • Tyler (1987): Original M-estimator
  • Franks & Moitra (2020): Operator scaling connection
  • Kwok et al. (2021): Quantum expansion theory
  • Vershynin (2010): Random matrix theory tools