Latent variable models are widely used to capture the underlying structures of the data, for example, Gaussian mixture models for speech recognition, stochastic block models for community detection and topic models for information retrieval. While alternative minimization based algorithms such as EM algorithm and Lloyd's algorithm performs well in practice, there has been little theoretical advancement in explaining the effectiveness of these algorithms. In this thesis, we investigate the performance of Lloyd's algorithm and EM algorithm on clustering two-mixture of Gaussians. With an initializer slightly better than random guess, we are able to show the linear converge of Lloyd's and EM iterations to the statistical optimal estimator. These results shed light on the global convergence of more general non-convex optimizations.
We generalized the results to arbitrary number of sub-Gaussian mixtures. Motivated by the Lloyd's algorithm, we propose new algorithms for other latent variable models including sparse gaussian mixture model, stochastic block model. biclustering model and Dawid-Skene model. The proposed algorithms are computationally efficient and shown to be rate-optimal under mild signal-to-noise ratio conditions. The highlight of our theoretical analysis is to develop new proof techniques to handle the dependency between iterations, which can be applied to other iterative algorithms with explicit iteration formulas.