Optimal Bounds for Tyler's M-Estimator for Elliptical Distributions

Lau, Ramachandran

A fundamental problem in statistics is estimating the shape matrix of an Elliptical distribution. This generalizes the familiar problem of Gaussian covariance estimation, for which the sample covariance achieves optimal estimation error. For Elliptical distributions, Tyler proposed a natural M-estimator and showed strong statistical properties in the asymptotic regime, independent of the underlying distribution. Numerical experiments show that this estimator performs very well, and that Tyler's iterative procedure converges quickly to the estimator. Franks and Moitra recently provided the first distribution-free error bounds in the finite sample setting, as well as the first rigorous convergence analysis of Tyler's iterative procedure. However, their results exceed the sample complexity of the Gaussian setting by a $\log^{2} d$ factor. We close this gap by proving optimal sample threshold and error bounds for Tyler's M-estimator for all Elliptical distributions, fully matching the Gaussian result. Moreover, we recover the algorithmic convergence even at this lower sample threshold. Our approach builds on the operator scaling connection of Franks and Moitra by introducing a novel pseudorandom condition, which we call $\infty$-expansion. We show that Elliptical distributions satisfy $\infty$-expansion at the optimal sample threshold, and then prove a novel scaling result for inputs satisfying this condition.

Basic Information

Paper ID: 2510.13751
Title: Optimal Bounds for Tyler's M-Estimator for Elliptical Distributions
Authors: Lap Chi Lau (University of Waterloo), Akshay Ramachandran (University of British Columbia)
Categories: math.ST cs.LG stat.TH
Published: October 2025 (arXiv preprint)
Paper link: https://arxiv.org/abs/2510.13751

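Abstract

Estimating the shape matrix of an elliptical distribution is a fundamental problem in statistics that generalizes Gaussian covariance estimation. Tyler proposed a natural M-estimator and proved strong statistical properties in the asymptotic regime. Franks and Moitra recently gave the first distribution-free error bounds in the finite-sample setting, but their sample complexity exceeds that of the Gaussian case by a $\log^2 d$ factor. This paper introduces a new pseudorandom condition, $\infty$-expansion, and uses it to prove the optimal sample threshold and error bounds for Tyler's M-estimator, fully matching the Gaussian result and recovering algorithmic convergence at this lower sample threshold.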

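Research Background and Motivation

Problem Background

Core problem: estimating the shape matrix of an elliptical distribution, an important generalization of covariance estimation for high-dimensional distributions
Practical significance:
- Elliptical distributions include important special cases such as the multivariate Gaussian and the multivariate t-distribution
- For heavy-tailed distributions the covariance matrix may not exist, but the shape matrix still captures the geometry of the distribution
- Wide application in finance, signal processing, and other fields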

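Limitations of Existing Methods

Limitations of the sample covariance: it performs poorly on heavy-tailed distributions, where the covariance it targets may not even exist
Theoretical gaps for Tyler's estimator:
- Tyler (1987) provides only asymptotic guarantees
- The finite-sample bounds of Franks and Moitra (2020) carry an extra $\log^2 d$ factor
- Their sample complexity is $n \gtrsim d\log^2 d$, which exceeds the optimal $n \gtrsim d$ of the Gaussian case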

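Research Motivation

The paper sets out to answer the following question: can Tyler's estimator achieve the same optimal guarantees for elliptical distributions as Gaussian covariance estimation, or is shape estimation inherently harder?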

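Core Contributions

Optimal sample complexity: proves that with $n \gtrsim \frac{d}{\varepsilon^2}$ samples, Tyler's M-estimator achieves relative operator norm error $\varepsilon$
Optimal error bound: matches the lower bound for the Gaussian case exactly, which shows the result is tight
Algorithmic convergence: recovers linear convergence of Tyler's iterative procedure at the optimal sample threshold $n \gtrsim d$
New theoretical tool: introduces the $\infty$-expansion condition, a stronger analytical tool for frame scaling
Technical innovation: improves two key components of the Franks-Moitra approach to eliminate the $\log d$ factors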

Method Details

Task Definition

Input: $n$ samples $x_1, \ldots, x_n \in \mathbb{R}^d$ from an elliptical distribution $E(\Sigma, u)$
Output: an estimate $\hat{\Sigma}$ of the shape matrix $\Sigma$
Goal: minimize the relative operator norm error $\|I_d - \Sigma^{1/2}\hat{\Sigma}^{-1}\Sigma^{1/2}\|_{op}$
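
In a simulation where the true shape matrix is known, the goal metric can be evaluated directly. The following is a minimal sketch, assuming NumPy and symmetric positive definite inputs; the function name `relative_operator_error` is illustrative and does not come from the paper.

```python
import numpy as np

def relative_operator_error(sigma, sigma_hat):
    """Compute ||I_d - Sigma^{1/2} Sigma_hat^{-1} Sigma^{1/2}||_op (hypothetical helper)."""
    # Symmetric square root of Sigma from its eigendecomposition.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    sqrt_sigma = (eigvecs * np.sqrt(eigvals)) @ eigvecs.T
    whitened = sqrt_sigma @ np.linalg.inv(sigma_hat) @ sqrt_sigma
    # ord=2 returns the largest singular value, i.e. the operator norm.
    return np.linalg.norm(np.eye(sigma.shape[0]) - whitened, ord=2)

sigma = np.diag([1.0, 4.0])
exact = relative_operator_error(sigma, sigma)        # estimate equals the truth
doubled = relative_operator_error(sigma, 2 * sigma)  # estimate is off by a factor of 2
```

Here `exact` is 0 up to rounding and `doubled` is 0.5, because $\Sigma^{1/2}(2\Sigma)^{-1}\Sigma^{1/2} = \frac{1}{2}I_d$. The metric is sensitive to the overall scale of the estimate, which is why a normalization of $\hat{\Sigma}$ matters when comparing against $\Sigma$.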

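Elliptical Distributions and Tyler's Estimator

Definition of an elliptical distribution: $X := \Sigma^{1/2}V \cdot u$, where $V \sim S^{d-1}$ is a uniformly random unit vector and $u \in \mathbb{R}$ is an independent scalar random variable.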
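
The definition translates into a simple sampler: draw a uniform direction, scale it by an independent scalar, and apply $\Sigma^{1/2}$. The sketch below assumes NumPy; the function name `sample_elliptical` and the `radial_sampler` callback are hypothetical, and the Cauchy radial law is only one example of a heavy-tailed choice for $u$.

```python
import numpy as np

def sample_elliptical(sigma, n, radial_sampler, rng):
    """Draw n rows x_j = u_j * Sigma^{1/2} v_j (illustrative helper, not from the paper)."""
    d = sigma.shape[0]
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere S^{d-1}.
    g = rng.standard_normal((n, d))
    v = g / np.linalg.norm(g, axis=1, keepdims=True)
    # Independent scalar factors u_1, ..., u_n supplied by the caller.
    u = radial_sampler(n)
    # Symmetric square root of Sigma.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    sqrt_sigma = (eigvecs * np.sqrt(eigvals)) @ eigvecs.T
    # Row j is (Sigma^{1/2} v_j)^T scaled by u_j; sqrt_sigma is symmetric.
    return (v @ sqrt_sigma) * u[:, None]

rng = np.random.default_rng(0)
# Heavy-tailed example: Cauchy radial factor, for which no covariance exists.
samples = sample_elliptical(np.diag([1.0, 4.0]), 5, lambda m: rng.standard_cauchy(m), rng)
```

Because $u$ only rescales each sample, the directions $x_j / \|x_j\|_2$ carry all the information about $\Sigma$ up to sign.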

Tyler's M-estimator: the unique solution $\hat{\Sigma}$ of the following equation: $\frac{d}{n}\sum_{j=1}^n \frac{x_jx_j^T}{x_j^T\hat{\Sigma}^{-1}x_j} = \hat{\Sigma}, \quad \text{Tr}[\hat{\Sigma}] = d$
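
One way to solve this equation numerically is to iterate its left-hand side. The sketch below assumes the commonly used fixed-point update with trace renormalization at every step; the summary does not spell out the exact update of Tyler's iterative procedure, so treat the details (starting point, stopping rule, the name `tyler_estimator`) as assumptions.

```python
import numpy as np

def tyler_estimator(x, max_iters=500, tol=1e-10):
    """Fixed-point iteration for Tyler's equation; rows of x are the samples."""
    n, d = x.shape
    sigma = np.eye(d)
    for _ in range(max_iters):
        # Quadratic forms x_j^T Sigma^{-1} x_j for every sample j.
        quad = np.einsum("ji,ik,jk->j", x, np.linalg.inv(sigma), x)
        # (d/n) * sum_j x_j x_j^T / (x_j^T Sigma^{-1} x_j)
        updated = (d / n) * (x / quad[:, None]).T @ x
        # Enforce the normalization Tr[Sigma] = d.
        updated *= d / np.trace(updated)
        change = np.linalg.norm(updated - sigma, ord="fro")
        sigma = updated
        if change < tol:
            break
    return sigma

rng = np.random.default_rng(0)
true_shape = np.diag([0.5, 1.0, 1.5])  # trace equals d = 3
x = rng.standard_normal((2000, 3)) * np.sqrt(np.diag(true_shape))
sigma_hat = tyler_estimator(x)
```

Each sample enters only through $x_j x_j^T / (x_j^T \Sigma^{-1} x_j)$, which is unchanged when $x_j$ is rescaled, so the estimate does not depend on the scalar factor $u$.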

Core Technical Framework

1. Connection to Frame Scaling

Tyler's estimator is equivalent to a frame scaling problem:

Frame: $V = \{v_1, \ldots, v_n\} \in \mathbb{R}^{d \times n}$
Goal: find a left scaling $L \in \mathbb{R}^{d \times d}$ and a right scaling $R \in \text{diag}(n)$ such that $V' = LVR$ satisfies the following, where taking the trace of the first condition forces $s(V') = \|V'\|_F^2$:
- Isotropy: $V'V'^T = \frac{s(V')}{d}I_d$
- Equal norm: $\|v'_j\|_2^2 = \frac{s(V')}{n}$ for every $j$
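
Both target conditions are easy to check for a given frame. The sketch below measures how far a frame is from satisfying them, taking the size to be $s(V) = \|V\|_F^2$ (an assumption, though it is the only value consistent with the trace of the isotropy condition); the function name `scaling_defects` is hypothetical.

```python
import numpy as np

def scaling_defects(frame):
    """Return (isotropy defect, equal-norm defect) for a d x n frame (illustrative)."""
    d, n = frame.shape
    size = np.sum(frame ** 2)  # assumed size s(V) = ||V||_F^2
    # Operator-norm distance of V V^T from (s(V)/d) I_d.
    isotropy_defect = np.linalg.norm(frame @ frame.T - (size / d) * np.eye(d), ord=2)
    # Largest deviation of a squared column norm from s(V)/n.
    norm_defect = np.max(np.abs(np.sum(frame ** 2, axis=0) - size / n))
    return isotropy_defect, norm_defect

# Three unit vectors in the plane at 120-degree spacing: already doubly balanced.
angles = 2 * np.pi * np.arange(3) / 3
balanced_frame = np.vstack([np.cos(angles), np.sin(angles)])
iso_defect, norm_defect = scaling_defects(balanced_frame)
```

For this frame both defects vanish up to rounding, since the columns have unit norm and $VV^T = \frac{3}{2}I_2$. A generic frame has nonzero defects, and the scaling problem asks for $L$ and $R$ that drive both to zero.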

2. The ∞-Expansion Condition

Definition: a frame $V$ satisfies $(1-\lambda)$-$\infty$-expansion if: $\forall y \perp \mathbf{1}_n, \|y\|_\infty \leq 1: \left\|\sum_{j=1}^n y_j v_j v_j^T\right\|_{op} \leq \frac{s(V)(1-\lambda)}{d}$

This is a stronger condition than quantum expansion. The key changes are:
- The constraint is strengthened from $\|y\|_2 \leq 1$ to $\|y\|_\infty \leq 1$
- The output is measured in operator norm rather than Frobenius norm
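
The definition quantifies over all feasible $y$, so it cannot be certified by sampling. Random test vectors do give a lower bound on the smallest constant $1-\lambda$ that could work, which is a quick sanity check. The sketch below uses balanced $\pm 1$ vectors, which satisfy $y \perp \mathbf{1}_n$ and $\|y\|_\infty \leq 1$; the function name, the Monte Carlo strategy, and the choice $s(V) = \|V\|_F^2$ are assumptions for illustration, not the paper's method.

```python
import numpy as np

def infinity_expansion_lower_bound(frame, trials, rng):
    """Monte Carlo lower bound on the smallest valid (1 - lambda) for a d x n frame."""
    d, n = frame.shape
    if n % 2 != 0:
        raise ValueError("this sketch needs an even number of vectors")
    size = np.sum(frame ** 2)  # assumed size s(V) = ||V||_F^2
    # Half +1 and half -1: orthogonal to the all-ones vector, sup-norm equal to 1.
    balanced_signs = np.repeat([1.0, -1.0], n // 2)
    worst = 0.0
    for _ in range(trials):
        y = rng.permutation(balanced_signs)
        # sum_j y_j v_j v_j^T, with columns of the frame weighted by y.
        weighted = (frame * y) @ frame.T
        worst = max(worst, np.linalg.norm(weighted, ord=2))
    # Rescale so the result is directly comparable with (1 - lambda).
    return worst * d / size

rng = np.random.default_rng(0)
gaussian_frame = rng.standard_normal((5, 400))
bound = infinity_expansion_lower_bound(gaussian_frame, trials=50, rng=rng)
```

Any $\lambda$ for which the frame satisfies the condition must obey $1-\lambda \geq$ `bound`; the true supremum over all feasible $y$ may be larger than what random sign vectors find.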

3. Pseudorandom Condition

Definition: a frame $V$ is $(\alpha_{\min}, \alpha_{\max}, \beta)$-pseudorandom if: $\forall |B| = \beta n: \beta\frac{\alpha_{\min}}{d}I_d \preceq V_BV_B^T \preceq \beta\frac{\alpha_{\max}}{d}I_d$, where $V_B$ denotes the columns of $V$ indexed by $B$.
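
As with $\infty$-expansion, the condition ranges over all subsets of size $\beta n$, so sampling can only probe it. The sketch below draws random subsets and records the extreme eigenvalues of $V_BV_B^T$, rescaled by $d/\beta$ to match the definition. The function name and sampling scheme are hypothetical, and the resulting numbers depend on how $V$ is normalized, which this summary does not state.

```python
import numpy as np

def sampled_pseudorandom_range(frame, beta, trials, rng):
    """Probe (alpha_min, alpha_max) over random subsets B with |B| = beta * n."""
    d, n = frame.shape
    subset_size = int(round(beta * n))
    alpha_min, alpha_max = np.inf, 0.0
    for _ in range(trials):
        subset = rng.choice(n, size=subset_size, replace=False)
        sub = frame[:, subset]
        # Eigenvalues of V_B V_B^T in ascending order.
        eigvals = np.linalg.eigvalsh(sub @ sub.T)
        alpha_min = min(alpha_min, eigvals[0] * d / beta)
        alpha_max = max(alpha_max, eigvals[-1] * d / beta)
    return alpha_min, alpha_max

rng = np.random.default_rng(0)
# Gaussian frame scaled so that V V^T is close to the identity.
gaussian_frame = rng.standard_normal((5, 400)) / np.sqrt(400)
alpha_lo, alpha_hi = sampled_pseudorandom_range(gaussian_frame, beta=0.5, trials=50, rng=rng)
```

The true constants must satisfy $\alpha_{\min} \leq$ `alpha_lo` and $\alpha_{\max} \geq$ `alpha_hi`, since adversarially chosen subsets can be worse than random ones.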

Main Theoretical Results

Theorem 1.1 (sample complexity): if $n \gtrsim \frac{d}{\varepsilon^2}$ and $\varepsilon$ is at most a small constant, then with probability at least $1 - \exp(-\Omega(\varepsilon^2 n))$, Tyler's M-estimator satisfies: $\|I_d - \Sigma^{1/2}\hat{\Sigma}^{-1}\Sigma^{1/2}\|_{op} \leq \varepsilon$

Theorem 1.2 (algorithmic convergence): if $n \gtrsim d$, the $T$-th iterate $\Sigma^{(T)}$ of Tyler's iterative procedure satisfies $\|I_d - \hat{\Sigma}^{1/2}(\Sigma^{(T)})^{-1}\hat{\Sigma}^{1/2}\|_F \leq \delta$ within $T \lesssim |\log \det \Sigma| + d + \log(1/\delta)$ steps.
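
The distance in Theorem 1.2 can be tracked numerically along the iteration. The sketch below is an illustration under stated assumptions: it uses the standard trace-normalized fixed-point update as a stand-in for Tyler's iterative procedure, starts from the identity, and approximates $\hat{\Sigma}$ by running the same update for many steps. The helper names `tyler_step` and `frobenius_gap` are hypothetical.

```python
import numpy as np

def tyler_step(sigma, x):
    """One trace-normalized fixed-point update (assumed form of the iteration)."""
    n, d = x.shape
    quad = np.einsum("ji,ik,jk->j", x, np.linalg.inv(sigma), x)
    updated = (d / n) * (x / quad[:, None]).T @ x
    return updated * (d / np.trace(updated))

def frobenius_gap(sigma_hat, sigma_t):
    """Compute ||I_d - Sigma_hat^{1/2} Sigma_t^{-1} Sigma_hat^{1/2}||_F."""
    eigvals, eigvecs = np.linalg.eigh(sigma_hat)
    root = (eigvecs * np.sqrt(eigvals)) @ eigvecs.T
    d = sigma_hat.shape[0]
    return np.linalg.norm(np.eye(d) - root @ np.linalg.inv(sigma_t) @ root, ord="fro")

rng = np.random.default_rng(0)
# 500 Gaussian samples in dimension 4 with unequal coordinate scales.
x = rng.standard_normal((500, 4)) * np.array([1.0, 2.0, 3.0, 4.0])

# Reference solution: iterate far beyond what convergence needs.
sigma_hat = np.eye(4)
for _ in range(500):
    sigma_hat = tyler_step(sigma_hat, x)

# Record the gap before each of the first 15 updates.
sigma_t = np.eye(4)
gaps = []
for _ in range(15):
    gaps.append(frobenius_gap(sigma_hat, sigma_t))
    sigma_t = tyler_step(sigma_t, x)
```

Plotting `gaps` on a logarithmic axis should show a roughly straight line, which is what linear convergence looks like; the $\log(1/\delta)$ term in the theorem reflects exactly this geometric decay.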