Workshop: Your Model is Wrong: Robustness and misspecification in probabilistic modeling

Inferior Clusterings in Misspecified Gaussian Mixture Models

Siva Rajesh Kasa · Vaibhav Rajan


The Gaussian Mixture Model (GMM) is a widely used probabilistic model for clustering. In many practical settings, the true data distribution is unknown, may be non-Gaussian, and may be contaminated by noise or outliers. In such cases, clustering may still be done with a misspecified GMM, but this can lead to incorrect classification of the underlying subpopulations. In this work, we examine the performance of both Expectation Maximization (EM) and Gradient Descent (GD) on unconstrained Gaussian Mixture Models under misspecification. Our simulation study reveals a previously unreported class of inferior clustering solutions, different from spurious solutions, that occurs due to asymmetry in the fitted component variances.
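As a rough illustration of the setting described above, the following sketch fits a two-component, one-dimensional GMM with unconstrained variances by EM to data drawn from a mixture of Laplace distributions, so the Gaussian model is misspecified by construction. It is not the authors' experimental setup: the function names (`fit_gmm_em`, `sample_laplace_mixture`), the Laplace data, the initial values, and the max/min variance ratio used as an asymmetry indicator are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def sample_laplace_mixture(n, centers, rng):
    """Draw n points from an equal-weight mixture of unit-scale Laplace
    distributions (hypothetical non-Gaussian ground truth)."""
    data = []
    for _ in range(n):
        center = rng.choice(centers)
        sign = rng.choice([-1.0, 1.0])
        data.append(center + sign * rng.expovariate(1.0))
    return data


def fit_gmm_em(data, means, variances, weights, n_iter=200, var_floor=1e-6):
    """EM for a 1-D Gaussian mixture with a free variance per component."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    log_liks = []
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point.
        resp = []
        log_lik = 0.0
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            log_lik += math.log(total)
            resp.append([d / total for d in dens])
        log_liks.append(log_lik)
        # M-step: weighted updates of mean, variance, and mixing weight.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)  # floor is an illustrative safeguard
            weights[j] = nj / len(data)
    return means, variances, weights, log_liks


rng = random.Random(0)
data = sample_laplace_mixture(400, centers=[-3.0, 3.0], rng=rng)
means, variances, weights, log_liks = fit_gmm_em(
    data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
# Illustrative asymmetry indicator: ratio of largest to smallest fitted variance.
variance_ratio = max(variances) / min(variances)
print("means:", means)
print("variances:", variances)
print("variance ratio:", variance_ratio)
```

Rerunning the fit from different initial values and comparing the variance ratio alongside the final log-likelihood is one simple way to look for solutions with markedly unequal component variances; whether a given solution counts as inferior in the authors' sense depends on criteria that the abstract does not specify.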
