Strong consistency of pseudo-likelihood parameter estimator for univariate Gaussian mixture models
Lember, Kangro, Kuljus
We consider a new method for estimating the parameters of univariate Gaussian mixture models. The method relies on a nonparametric density estimator $\hat{f}_n$ (typically a kernel estimator). For every set of Gaussian mixture components, $\hat{f}_n$ is used to find the best set of mixture weights. That set is obtained by minimizing the $L_2$ distance between $\hat{f}_n$ and the Gaussian mixture density with the given component parameters. The densities together with the obtained weights are then plugged in to the likelihood function, resulting in the so-called pseudo-likelihood function. The final parameter estimators are the parameter values that maximize the pseudo-likelihood function together with the corresponding weights. The advantages of the pseudo-likelihood over the full likelihood are: 1) its arguments are the means and variances only, mixture weights are also functions of the means and variances; 2) unlike the likelihood function, it is always bounded above. Thus, the maximizer of the pseudo-likelihood function -- referred to as the pseudo-likelihood estimator -- always exists. In this article, we prove that the pseudo-likelihood estimator is strongly consistent.
This paper proposes a new method for estimating the parameters of univariate Gaussian mixture models. The method is based on a nonparametric density estimator $\hat{f}_n$ (typically a kernel estimator). For each given set of Gaussian component parameters, the optimal mixture weights are found by minimizing the $L_2$ distance between $\hat{f}_n$ and the Gaussian mixture density with those components. The component densities and the resulting weights are then substituted into the likelihood function, which gives the so-called pseudo-likelihood function. The final estimator consists of the parameter values that maximize the pseudo-likelihood function, together with the corresponding weights. Compared with the full likelihood, the pseudo-likelihood has two advantages: 1) its arguments are the means and variances only, since the mixture weights are themselves functions of the means and variances; 2) unlike the likelihood function, it is always bounded above. Therefore the maximizer of the pseudo-likelihood function, called the pseudo-likelihood estimator, always exists. The paper proves that the pseudo-likelihood estimator is strongly consistent.
Unboundedness of likelihood for Gaussian mixture models: The likelihood function of a Gaussian mixture model is unbounded above, which is a well-known problem. If the mean of one component is placed at an observation and the variance of that component tends to zero, the likelihood tends to infinity.
Limitations of existing solutions:
Restricting the parameter space
Using sieve methods
Penalized maximum likelihood estimation
Bayesian methods
Profile likelihood, etc.
These methods typically require imposing restrictions or penalty terms on variances.
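To make the unboundedness concrete, and to show why the remedies listed above act on the variances, the following is a minimal sketch in Python. The data values, the fixed equal weights and the helper names `gaussian_pdf` and `mixture_loglik` are illustrative assumptions, not taken from the paper. One component mean is pinned to an observation while its standard deviation shrinks, and the two-component log-likelihood grows without bound.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def mixture_loglik(x, weights, mus, sigmas):
    """Log-likelihood of a univariate Gaussian mixture for the sample x."""
    mus = np.asarray(mus, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # dens[i, j] = density of component j at observation x[i]
    dens = gaussian_pdf(x[:, None], mus[None, :], sigmas[None, :])
    return float(np.sum(np.log(dens @ np.asarray(weights, dtype=float))))

# Small fixed sample (illustrative values only).
x = np.array([-1.2, -0.4, 0.1, 0.7, 1.5])

# Component 1 is centred exactly on the observation x[2]; its standard
# deviation shrinks. Component 2 stays at N(0, 1) and keeps every other
# observation's density strictly positive.
shrinking_sigmas = [1e-1, 1e-2, 1e-4, 1e-8]
logliks = [
    mixture_loglik(x, [0.5, 0.5], [x[2], 0.0], [s, 1.0])
    for s in shrinking_sigmas
]

for s, ll in zip(shrinking_sigmas, logliks):
    print(f"sigma_1 = {s:.0e}  log-likelihood = {ll:.3f}")
```

The printed values increase as `sigma_1` decreases. The increase comes entirely from the density at the pinned observation, which scales like `1 / sigma_1`.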
Research motivation:
Provide a method that does not require any restrictions on parameters
Maintain similarity with standard maximum likelihood estimation
Proposes the pseudo-likelihood method: A new parameter estimation method that determines the mixture weights by $L_2$ distance minimization and then constructs the pseudo-likelihood function.
Proves strong consistency: For i.i.d. samples, the pseudo-likelihood estimator satisfies $\hat{\theta}_n \xrightarrow{\text{a.s.}} \theta^*$ and $v_n(\hat{\theta}_n) \xrightarrow{\text{a.s.}} w^*$, so the estimated component parameters and the corresponding weights both converge almost surely.
No parameter restrictions: The method does not require imposing lower bounds on variances or other constraints.
Theoretical framework: Establishes a complete theoretical framework for handling unbounded means and vanishing or unbounded variances.
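The two-step construction described above (weights from $L_2$ distance minimization, then substitution into the likelihood) can be sketched as follows. This is a hedged illustration, not the authors' implementation. It assumes a Gaussian kernel estimator with a user-chosen bandwidth `h`, weights constrained to the probability simplex, and SciPy's SLSQP solver for the resulting quadratic program. The function names `normal_pdf`, `l2_weights` and `pseudo_loglik` are hypothetical. With a Gaussian kernel the $L_2$ objective is available in closed form, because the integral of the product of two normal densities with means $a$, $b$ and variances $s^2$, $t^2$ equals the $N(0, s^2 + t^2)$ density at $a - b$.

```python
import numpy as np
from scipy.optimize import minimize

def normal_pdf(x, var):
    """Density of N(0, var) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * x**2 / var) / np.sqrt(2.0 * np.pi * var)

def l2_weights(x, mus, sigmas, h):
    """Weights minimizing the L2 distance between a Gaussian KDE (bandwidth h,
    an assumption of this sketch) and the mixture with the given components."""
    mus = np.asarray(mus, dtype=float)
    var = np.asarray(sigmas, dtype=float) ** 2
    k = len(mus)
    # A[j, l] = integral of phi_j * phi_l; b[j] = integral of f_hat * phi_j.
    A = normal_pdf(mus[:, None] - mus[None, :], var[:, None] + var[None, :])
    b = normal_pdf(x[:, None] - mus[None, :], h**2 + var[None, :]).mean(axis=0)
    # The L2 distance equals w'Aw - 2 b'w plus a term that does not depend on w.
    res = minimize(
        lambda w: w @ A @ w - 2.0 * b @ w,
        x0=np.full(k, 1.0 / k),
        jac=lambda w: 2.0 * (A @ w - b),
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        method="SLSQP",
    )
    w = np.clip(res.x, 0.0, None)  # remove tiny negative round-off
    return w / w.sum()

def pseudo_loglik(x, mus, sigmas, h):
    """Pseudo-log-likelihood: the mixture log-likelihood with the weights
    replaced by the L2-optimal weights for the given means and variances."""
    mus = np.asarray(mus, dtype=float)
    var = np.asarray(sigmas, dtype=float) ** 2
    w = l2_weights(x, mus, sigmas, h)
    dens = normal_pdf(x[:, None] - mus[None, :], var[None, :]) @ w
    return float(np.sum(np.log(dens))), w

# Usage on simulated data from 0.3 * N(-2, 1) + 0.7 * N(2, 0.5^2).
rng = np.random.default_rng(1)
n = 2000
from_first = rng.random(n) < 0.3
x = np.where(from_first, rng.normal(-2.0, 1.0, n), rng.normal(2.0, 0.5, n))

value, weights = pseudo_loglik(x, mus=[-2.0, 2.0], sigmas=[1.0, 0.5], h=0.2)
print("pseudo-log-likelihood:", value)
print("L2-optimal weights:", weights)
```

In this sketch the pseudo-log-likelihood is a function of the means and standard deviations only. A full estimator would maximize `pseudo_loglik` over those arguments, which is not shown here.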
Proposition 3.1: Proves the existence of constants $0 < u < U < \infty$ and $N < \infty$ such that for sufficiently large $n$, at least one component $i(n)$ satisfies:
$|\mu^n_{i(n)}| < N, \quad u \le \sigma^n_{i(n)} \le U$
This ensures that $\hat{\theta}_n$ eventually belongs to a bounded parameter space $\Theta_o(u, U, N)$.