Unveiling low-dimensional patterns induced by convex non-differentiable regularizers
Hejný, Wallin, Bogdan et al.
Popular regularizers with non-differentiable penalties, such as Lasso, Elastic Net, Generalized Lasso, or SLOPE, reduce the dimension of the parameter space by inducing sparsity or clustering in the estimators' coordinates. In this paper, we focus on linear regression and explore the asymptotic distributions of the resulting low-dimensional patterns when the number of regressors $p$ is fixed, the number of observations $n$ goes to infinity, and the penalty function increases at the rate of $\sqrt{n}$. While the asymptotic distribution of the rescaled estimation error can be derived by relatively standard arguments, convergence of patterns requires a separate proof, which has so far been missing from the literature, even for the simplest case of Lasso. To fill this gap, we use the Hausdorff distance as a suitable mode of convergence for subdifferentials, which yields the desired pattern convergence. Furthermore, we derive the exact limiting probability of recovering the true model pattern. This probability goes to 1 if and only if the penalty scaling constant diverges to infinity and the regularizer-specific asymptotic irrepresentability condition is satisfied. We then propose simple two-step procedures that asymptotically recover the model patterns, irrespective of whether the irrepresentability condition holds.
Interestingly, our theory shows that Fused Lasso cannot reliably recover its own clustering pattern, even for independent regressors. It also demonstrates how this problem can be resolved by "concavifying" the Fused Lasso penalty coefficients. Additionally, sampling from the asymptotic error distribution facilitates comparisons between different regularizers. We provide short simulation studies showcasing an illustrative comparison between the asymptotic properties of Lasso, Fused Lasso, and SLOPE.
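To make the notion of a "pattern" concrete, the following minimal Python sketch extracts three kinds of low-dimensional patterns from a coefficient vector. The definitions used here are assumptions chosen for illustration and are not stated in the abstract: the Lasso pattern is taken to be the sign vector, the SLOPE pattern is taken to be the sign multiplied by the rank of the absolute value among the distinct nonzero magnitudes, and the Fused Lasso pattern is taken to be the signs of consecutive differences. The function names are hypothetical.

```python
from typing import List, Sequence


def sign(x: float, tol: float = 1e-10) -> int:
    """Return -1, 0, or 1, treating |x| <= tol as zero."""
    if x > tol:
        return 1
    if x < -tol:
        return -1
    return 0


def lasso_pattern(beta: Sequence[float]) -> List[int]:
    """Assumed Lasso pattern: the sign vector (which coordinates are zero, and the signs of the rest)."""
    return [sign(b) for b in beta]


def slope_pattern(beta: Sequence[float], digits: int = 10) -> List[int]:
    """Assumed SLOPE pattern: sign(b_i) times the rank of |b_i| among distinct nonzero magnitudes.

    Coordinates sharing a magnitude (after rounding to `digits` decimals)
    form one cluster and receive the same rank.
    """
    mags = [round(abs(b), digits) for b in beta]
    levels = sorted({m for m in mags if m > 0})
    rank = {m: i + 1 for i, m in enumerate(levels)}
    return [sign(b) * rank[m] if m > 0 else 0 for b, m in zip(beta, mags)]


def fused_pattern(beta: Sequence[float]) -> List[int]:
    """Assumed Fused Lasso pattern: signs of consecutive differences (0 means neighbours are fused)."""
    return [sign(b_next - b) for b, b_next in zip(beta, beta[1:])]


beta = [1.5, -1.5, 0.0, 3.0, 1.5]
print(lasso_pattern(beta))  # [1, -1, 0, 1, 1]
print(slope_pattern(beta))  # [1, -1, 0, 2, 1]
print(fused_pattern(beta))  # [-1, 1, 1, -1]
```

Under these assumed definitions, pattern recovery means that the pattern of the estimate equals the pattern of the true coefficient vector, which is a stricter requirement than the estimate merely being close to the truth.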
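This paper studies the asymptotic properties, in linear regression, of popular regularizers with non-differentiable penalty terms (e.g., Lasso, Elastic Net, Generalized Lasso, or SLOPE). These regularizers reduce the dimension of the parameter space by inducing sparsity or clustering in the estimator's coordinates. The paper focuses on the asymptotic distributions that arise when the number of regressors $p$ is fixed, the number of observations $n$ goes to infinity, and the penalty function grows at the rate of $\sqrt{n}$. While the asymptotic distribution of the rescaled estimation error can be derived by relatively standard arguments, pattern convergence requires a separate proof, which is still missing from the literature. The paper uses the Hausdorff distance as a suitable mode of convergence for subdifferentials to obtain the required pattern convergence, and derives the exact limiting probability of recovering the true model pattern.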