
Strong consistency of pseudo-likelihood parameter estimator for univariate Gaussian mixture models

Lember, Kangro, Kuljus

We consider a new method for estimating the parameters of univariate Gaussian mixture models. The method relies on a nonparametric density estimator $\hat{f}_n$ (typically a kernel estimator). For every set of Gaussian mixture components, $\hat{f}_n$ is used to find the best set of mixture weights. That set is obtained by minimizing the $L_2$ distance between $\hat{f}_n$ and the Gaussian mixture density with the given component parameters. The densities together with the obtained weights are then plugged into the likelihood function, resulting in the so-called pseudo-likelihood function. The final parameter estimators are the parameter values that maximize the pseudo-likelihood function together with the corresponding weights. The advantages of the pseudo-likelihood over the full likelihood are: 1) its arguments are the means and variances only, since the mixture weights are themselves functions of the means and variances; 2) unlike the likelihood function, it is always bounded above. Thus, the maximizer of the pseudo-likelihood function, referred to as the pseudo-likelihood estimator, always exists. In this article, we prove that the pseudo-likelihood estimator is strongly consistent.



Basic information

Paper ID: 2510.14482
Title: Strong consistency of pseudo-likelihood parameter estimator for univariate Gaussian mixture models
Authors: Jüri Lember, Raul Kangro, Kristi Kuljus (Institute of Mathematics and Statistics, University of Tartu, Estonia)
Categories: math.ST stat.TH
Published: October 16, 2025
Paper link: https://arxiv.org/abs/2510.14482

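Abstract

This paper proposes a new method for estimating the parameters of univariate Gaussian mixture models. The method is based on a nonparametric density estimator $\hat{f}_n$ (typically a kernel estimator). For each given set of Gaussian mixture component parameters, the optimal mixture weights are found by minimizing the $L_2$ distance between $\hat{f}_n$ and the Gaussian mixture density. The resulting weights are then plugged into the likelihood function together with the densities, giving the so-called pseudo-likelihood function. The final parameter estimators are the parameter values that maximize the pseudo-likelihood function, together with their corresponding weights. Compared with the full likelihood, the pseudo-likelihood has two advantages: 1) its arguments are only the means and variances, since the mixture weights are themselves functions of the means and variances; 2) unlike the likelihood function, it is always bounded above. Hence the maximizer of the pseudo-likelihood function, the pseudo-likelihood estimator, always exists. The paper proves that the pseudo-likelihood estimator is strongly consistent.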

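Research background and motivation

Problem background

Unbounded likelihood of Gaussian mixture models: it is well known that the likelihood function of a Gaussian mixture model is unbounded. When a component mean coincides with an observation and the variance of that component tends to zero, the likelihood tends to infinity.
Limitations of existing solutions:
- Restricting the parameter space
- Sieve methods
- Penalized maximum likelihood estimation
- Bayesian methods
- Profile likelihood, and others
These approaches typically impose constraints or penalty terms on the variances.
Research motivation:
- Provide a method that requires no restrictions on the parameters
- Stay close to standard maximum likelihood estimation
- Guarantee the existence and consistency of the estimator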
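
The unboundedness described above is easy to reproduce numerically. The following is a minimal sketch, not taken from the paper: the toy data, the fixed weights, and the function names are hypothetical, and $\sigma$ is treated as a standard deviation. One component is centred exactly on the first observation and its standard deviation is shrunk, while everything else is held fixed.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x (sigma is a standard deviation)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(data, weights, mus, sigmas):
    """Ordinary (unnormalized) mixture log-likelihood with freely chosen weights."""
    total = 0.0
    for y in data:
        density = sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, mus, sigmas))
        total += math.log(density)
    return total

# Hypothetical toy observations, fixed so that the run is reproducible.
data = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]

# Degenerate configuration: component 1 sits exactly on data[0] and its
# standard deviation shrinks; component 2 is a fixed N(0, 1).
sigma_grid = (1e-1, 1e-3, 1e-6, 1e-9)
log_liks = []
for sigma1 in sigma_grid:
    ll = log_likelihood(data, weights=(0.5, 0.5), mus=(data[0], 0.0), sigmas=(sigma1, 1.0))
    log_liks.append(ll)
    print(f"sigma1={sigma1:.0e}  log-likelihood={ll:.2f}")
```

Each further reduction of `sigma1` raises the log-likelihood, because the term for `data[0]` grows like $-\ln \sigma_1$ while the other terms stay essentially unchanged.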

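Why it matters

Gaussian mixture models are widely used in statistics and machine learning
The unbounded likelihood prevents direct use of the standard MLE
A theoretically sound and practically feasible estimation method is needed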

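Core contributions

Pseudo-likelihood method: a new parameter estimation method that determines the mixture weights by $L_2$ distance minimization and then constructs a pseudo-likelihood function.
Proof of strong consistency: for i.i.d. samples, the pseudo-likelihood estimator is shown to be strongly consistent: $\hat{\theta}_n \xrightarrow{a.s.} \theta^*$ and $v_n(\hat{\theta}_n) \xrightarrow{a.s.} w^*$.
No parameter restrictions: the method requires neither a lower bound on the variances nor any other constraint.
Theoretical framework: a complete theoretical framework is established that covers unbounded means and vanishing or unbounded variances.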

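Method details

Task definition

Given i.i.d. observations $Y_1, \ldots, Y_n$ from a $k$-component univariate Gaussian mixture distribution, the goal is to estimate:

Component parameters: $\theta_i = (\mu_i, \sigma_i)$, $i = 1, \ldots, k$
Mixture weights: $w_i > 0$, $\sum_{i=1}^k w_i = 1$

The true density is $f(\cdot) = \sum_{i=1}^k w_i^* g(\theta_i^*, \cdot)$, where $g(\theta_i, \cdot)$ denotes the Gaussian density with parameters $\theta_i$.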
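
As a concrete reading of this setup, the sketch below evaluates a mixture density and draws an i.i.d. sample from it. It assumes that $\sigma_i$ is a standard deviation; the "true" parameter values and all function names are hypothetical and serve only as an illustration.

```python
import math
import random

def gaussian_pdf(y, mu, sigma):
    """Density g(theta, y) of N(mu, sigma^2); sigma is taken to be the standard deviation."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(y, weights, thetas):
    """f(y) = sum_i w_i * g(theta_i, y) with theta_i = (mu_i, sigma_i)."""
    return sum(w * gaussian_pdf(y, mu, sigma) for w, (mu, sigma) in zip(weights, thetas))

def sample_mixture(n, weights, thetas, rng):
    """Draw n i.i.d. observations: pick a component by its weight, then sample from it."""
    labels = rng.choices(range(len(weights)), weights=weights, k=n)
    return [rng.gauss(*thetas[i]) for i in labels]

# Hypothetical "true" parameters, chosen only for illustration.
true_weights = (0.3, 0.7)
true_thetas = ((-2.0, 0.5), (1.0, 1.0))

rng = random.Random(0)
sample = sample_mixture(500, true_weights, true_thetas, rng)
print(len(sample), mixture_pdf(0.0, true_weights, true_thetas))
```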

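Model architecture

Step 1: weight estimation

For given parameters $\theta = (\theta_1, \ldots, \theta_k)$, the weights are determined by minimizing the $L_2$ distance:

$v_n(\theta) := \arg \inf_{w \in S_k} \|\hat{f}_n(\cdot) - \sum_{i=1}^k w_i g(\theta_i, \cdot)\|$

where $S_k$ is the $(k-1)$-dimensional simplex, $\|\cdot\|$ is the $L_2$ norm, and $\hat{f}_n$ is a nonparametric density estimator.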
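
Expanding the squared $L_2$ distance shows why this step is cheap for fixed $\theta$. The display below is a sketch under two assumptions that this summary does not state: $\sigma_i$ is a standard deviation, and $\hat{f}_n$ is a Gaussian kernel estimator with bandwidth $h$. The symbols $G$, $b_n$, $\varphi$ and $h$ are notation introduced here, with $\varphi(x; s^2)$ denoting the $N(0, s^2)$ density at $x$.

```latex
\begin{aligned}
\Big\| \hat{f}_n - \sum_{i=1}^k w_i\, g(\theta_i, \cdot) \Big\|^2
  &= \|\hat{f}_n\|^2 - 2\, b_n(\theta)^\top w + w^\top G(\theta)\, w, \\
G_{ij}(\theta) &= \int g(\theta_i, x)\, g(\theta_j, x)\, dx
  = \varphi\big(\mu_i - \mu_j;\ \sigma_i^2 + \sigma_j^2\big), \\
b_{n,i}(\theta) &= \int \hat{f}_n(x)\, g(\theta_i, x)\, dx
  = \frac{1}{n} \sum_{t=1}^n \varphi\big(Y_t - \mu_i;\ \sigma_i^2 + h^2\big).
\end{aligned}
```

Under these assumptions, $v_n(\theta)$ minimizes the convex quadratic $w^\top G(\theta)\, w - 2\, b_n(\theta)^\top w$ over the simplex $S_k$, since the term $\|\hat{f}_n\|^2$ does not depend on $w$. The closed form for $G_{ij}$ holds for any choice of $\hat{f}_n$; the one for $b_{n,i}$ relies on the Gaussian kernel.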

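Step 2: pseudo-likelihood construction

The obtained weights are plugged into the likelihood function:

$L_n(\theta) := \prod_{t=1}^n \left( \sum_{i=1}^k v_{n,i}(\theta) g(\theta_i, Y_t) \right)$

Log pseudo-likelihood function: $\ell_n(\theta) := \frac{1}{n} \sum_{t=1}^n \ln\left( v_n(\theta)g(\theta, Y_t) \right)$, where $v_n(\theta)g(\theta, y)$ is shorthand for $\sum_{i=1}^k v_{n,i}(\theta) g(\theta_i, y)$, so that $\ell_n(\theta) = \frac{1}{n} \ln L_n(\theta)$.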
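
The following sketch evaluates $\ell_n(\theta)$ for $k = 2$ and probes the degenerate configuration in which one component collapses onto an observation. It is an illustration under assumptions not stated in this summary: $\sigma_i$ is a standard deviation and $\hat{f}_n$ is a Gaussian kernel estimator with bandwidth `h`. Under those assumptions the $L_2$ inner products have closed forms, and for two components the weight problem is one-dimensional, so clipping the unconstrained minimizer to $[0, 1]$ solves it. The toy data and all function names are hypothetical.

```python
import math

def phi(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def l2_weights_k2(theta, data, h):
    """L2-optimal weights for k = 2 against a Gaussian KDE with bandwidth h (assumed)."""
    (m1, s1), (m2, s2) = theta
    n = len(data)
    # Closed-form inner products between Gaussian densities.
    g11 = phi(0.0, 2.0 * s1 * s1)
    g22 = phi(0.0, 2.0 * s2 * s2)
    g12 = phi(m1 - m2, s1 * s1 + s2 * s2)
    # Inner products between the kernel estimator and each component.
    b1 = sum(phi(y - m1, s1 * s1 + h * h) for y in data) / n
    b2 = sum(phi(y - m2, s2 * s2 + h * h) for y in data) / n
    denom = g11 - 2.0 * g12 + g22  # squared L2 norm of g1 - g2, nonnegative
    if denom <= 0.0:
        # Identical components: the weights are not unique, any split gives the same density.
        return 0.5, 0.5
    w1 = (g22 - g12 + b1 - b2) / denom
    w1 = min(1.0, max(0.0, w1))  # 1-D convex problem: clip to the simplex
    return w1, 1.0 - w1

def log_pseudo_likelihood(theta, data, h):
    """(1/n) * sum_t ln( sum_i v_{n,i}(theta) * g(theta_i, Y_t) ) for k = 2."""
    w1, w2 = l2_weights_k2(theta, data, h)
    (m1, s1), (m2, s2) = theta
    total = 0.0
    for y in data:
        p = w1 * phi(y - m1, s1 * s1) + w2 * phi(y - m2, s2 * s2)
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total / len(data)

# Hypothetical toy data and bandwidth.
data = [-2.3, -1.9, -1.6, -1.1, 0.9, 1.4, 2.0, 2.2, 2.7]
h = 0.5

# Component 1 sits on data[0] with a shrinking standard deviation.
values = []
for sigma1 in (1e-1, 1e-3, 1e-6):
    theta = ((data[0], sigma1), (0.0, 2.0))
    values.append(log_pseudo_likelihood(theta, data, h))
    print(f"sigma1={sigma1:.0e}  weights={l2_weights_k2(theta, data, h)}  l_n={values[-1]:.4f}")
```

In this toy run the weight of the collapsing component shrinks along with `sigma1`, so $\ell_n$ stays within a narrow range instead of diverging. With fixed weights, the ordinary log-likelihood would grow without bound in the same configuration.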

Step 3: parameter estimation

The pseudo-likelihood estimator is defined as any $\hat{\theta}_n$ satisfying $\ell_n(\hat{\theta}_n) \geq \sup_{\theta \in \Theta_o} \ell_n(\theta) - \epsilon_n$,

where $\epsilon_n \searrow 0$.

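Technical innovations

Two-step estimation strategy:
- Step 1 estimates the weights via the $L_2$ distance
- Step 2 estimates the component parameters via the likelihood
- This combination ensures that the objective function is bounded above
Uniqueness of the weighted density: although the weights $v_n(\theta)$ may not be unique, the density $v_n(\theta)g(\theta, \cdot)$ is unique (Lemma 2.1).
Handling of the parameter space: non-identifiability of the parameters (e.g., permutation invariance) is handled through equivalence classes.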
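
Permutation invariance, the example of non-identifiability mentioned above, can be illustrated in a few lines. This sketch covers only relabeling of components, which may be narrower than the equivalence classes used in the paper. It picks one representative per class by sorting. The function names and parameter values are hypothetical, and $\sigma_i$ is treated as a standard deviation.

```python
import math

def gaussian_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y (sigma is a standard deviation)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(y, weights, thetas):
    """Mixture density sum_i w_i * g(theta_i, y)."""
    return sum(w * gaussian_pdf(y, mu, sigma) for w, (mu, sigma) in zip(weights, thetas))

def canonical(weights, thetas):
    """Choose one representative of the relabeling class by sorting the components."""
    order = sorted(range(len(thetas)), key=lambda i: (thetas[i], weights[i]))
    return tuple(weights[i] for i in order), tuple(thetas[i] for i in order)

# Hypothetical parameters and a relabeled copy of them.
weights = (0.2, 0.5, 0.3)
thetas = ((1.0, 0.5), (-2.0, 1.0), (0.0, 2.0))
perm = (2, 0, 1)
weights_perm = tuple(weights[i] for i in perm)
thetas_perm = tuple(thetas[i] for i in perm)

# Both labelings define the same density and map to the same representative.
print(mixture_pdf(0.7, weights, thetas), mixture_pdf(0.7, weights_perm, thetas_perm))
print(canonical(weights, thetas) == canonical(weights_perm, thetas_perm))
```

Comparing estimates through such a representative avoids treating two labelings of the same mixture as different parameter values.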