Investigation of parameter uncertainty in clustering using a Gaussian mixture model via jackknife, bootstrap and weighted likelihood bootstrap

Keywords: mclust; MclustBootstrap; Precision; Standard errors; Variance estimation
Subjects: Statistics - Computation (stat.CO); Statistics - Methodology (stat.ME)
DOI: 10.1007/s00180-019-00897-9 Publication Date: 2019-05-28T18:20:22Z
ABSTRACT
Mixture models are a popular tool in model-based clustering. Such a model is often fitted by a procedure that maximizes the likelihood, such as the EM algorithm. At convergence, the maximum likelihood parameter estimates are typically reported, but in most cases little emphasis is placed on the variability associated with these estimates. This may be partly because standard errors are not directly calculated in the model-fitting algorithm, either because they are not required to fit the model or because they are difficult to compute. The examination of standard errors in model-based clustering is therefore typically neglected. The widely used R package mclust has recently introduced bootstrap and weighted likelihood bootstrap methods to facilitate standard error estimation. This paper provides an empirical comparison of these methods (along with the jackknife method) for producing standard errors and confidence intervals for mixture parameters. These methods are illustrated and contrasted in a simulation study and on the classic Old Faithful and Thyroid data sets.
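To make the three resampling schemes named in the abstract concrete, the following is a minimal sketch and not the paper's or mclust's implementation. It assumes a two-component, one-dimensional Gaussian mixture fitted by a hand-written weighted EM routine, and it expresses each scheme as a choice of observation weights: multinomial counts for the bootstrap, exponential draws (equivalent to uniform Dirichlet weights after normalization) for the weighted likelihood bootstrap, and leave-one-out 0/1 weights for the jackknife. The names `weighted_em` and `resampling_se`, the synthetic data, and the choice to start every refit at the full-data estimate (to limit label switching) are all illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def weighted_em(x, w, init, n_iter=30):
    """Weighted EM for a two-component 1-D Gaussian mixture (hypothetical helper).

    Returns (pi1, mu1, mu2, var1, var2). Only weight ratios matter,
    so the weights do not need to be normalized.
    """
    pi1, mu1, mu2, var1, var2 = init
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation.
        r = []
        for xi in x:
            a = pi1 * normal_pdf(xi, mu1, var1)
            b = (1.0 - pi1) * normal_pdf(xi, mu2, var2)
            r.append(a / (a + b + 1e-300))
        # M-step: weighted updates of the mixing proportion, means, variances.
        s1 = sum(wi * ri for wi, ri in zip(w, r))
        s2 = sum(wi * (1.0 - ri) for wi, ri in zip(w, r))
        mu1 = sum(wi * ri * xi for wi, ri, xi in zip(w, r, x)) / s1
        mu2 = sum(wi * (1.0 - ri) * xi for wi, ri, xi in zip(w, r, x)) / s2
        var1 = max(sum(wi * ri * (xi - mu1) ** 2 for wi, ri, xi in zip(w, r, x)) / s1, 1e-6)
        var2 = max(sum(wi * (1.0 - ri) * (xi - mu2) ** 2 for wi, ri, xi in zip(w, r, x)) / s2, 1e-6)
        pi1 = s1 / (s1 + s2)
    return (pi1, mu1, mu2, var1, var2)


def resampling_se(x, n_boot=50, seed=1):
    """Standard errors of (pi1, mu1, mu2, var1, var2) under three schemes."""
    rng = random.Random(seed)
    n = len(x)
    start = (0.5, min(x), max(x), 1.0, 1.0)
    full = weighted_em(x, [1.0] * n, start, n_iter=100)

    boot, wlb, jack = [], [], []
    for _ in range(n_boot):
        # Nonparametric bootstrap: resampling with replacement is the same
        # as weighting each observation by how often it was drawn.
        counts = [0.0] * n
        for _ in range(n):
            counts[rng.randrange(n)] += 1.0
        boot.append(weighted_em(x, counts, full))
        # Weighted likelihood bootstrap: random positive weights on all points.
        wlb.append(weighted_em(x, [rng.expovariate(1.0) for _ in range(n)], full))
    for i in range(n):
        # Jackknife: drop observation i by giving it weight zero.
        w = [1.0] * n
        w[i] = 0.0
        jack.append(weighted_em(x, w, full))

    def sd(values):
        m = sum(values) / len(values)
        return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

    def jack_se(values):
        m = sum(values) / len(values)
        return math.sqrt((n - 1) / n * sum((v - m) ** 2 for v in values))

    return {
        "bootstrap": tuple(sd([rep[k] for rep in boot]) for k in range(5)),
        "wlb": tuple(sd([rep[k] for rep in wlb]) for k in range(5)),
        "jackknife": tuple(jack_se([rep[k] for rep in jack]) for k in range(5)),
    }


# Synthetic, well-separated data (an assumption; not one of the paper's data sets).
data_rng = random.Random(0)
x = [data_rng.gauss(0.0, 1.0) for _ in range(50)] + [data_rng.gauss(5.0, 1.0) for _ in range(50)]
se = resampling_se(x)
for method, values in se.items():
    print(method, [round(v, 3) for v in values])
```

Treating all three methods as reweightings of one likelihood keeps the sketch short and makes the comparison direct. In mclust itself, the corresponding functionality is provided by `MclustBootstrap`, whose documentation should be consulted for the actual interface and defaults.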