Ph.D. Theses

Approximating Covariance Matrices Using Low Rank Perturbations with Applications to Accent Identification and Social Network Clustering

By Jonathan Purnell
Advisor: Malik Magdon-Ismail
August 4, 2010

In this work, we present a new model for data, the Low-Rank Gaussian Mixture Model (LRGMM), which can be extended to identify partitions or overlapping clusters. This model is motivated by the effectiveness, yet limited scalability, of the Gaussian Mixture Model (GMM) for the problem of accent identification.

The curse of dimensionality that arises in calculating the covariance matrices of the GMM is countered by using low-rank perturbed diagonal matrices. We also apply the LRGMM to finding communities in social networks, where its efficiency allows us to process larger networks than alternative approaches. Altogether, the experiments show the LRGMM to be an efficient and widely applicable tool for working with large, high-dimensional datasets.
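
To illustrate why a low-rank perturbed diagonal covariance helps in high dimensions, the following is a minimal sketch, not the thesis's implementation. It assumes one common parameterization, Sigma = D + U U^T, with D a positive diagonal matrix and U a d-by-k matrix where k is much smaller than d; the abstract does not state the exact form used. The function name and arguments are hypothetical. Under that assumption, the Woodbury identity and the matrix determinant lemma give the Gaussian log-density in roughly O(d k^2) time instead of the O(d^3) needed for a full covariance matrix.

```python
import numpy as np


def low_rank_gaussian_logpdf(x, mean, diag, U):
    """Log-density of N(mean, D + U U^T), with D = diag(diag) and U of shape (d, k).

    Assumed parameterization for illustration only; never forms the d x d matrix.
    """
    d, k = U.shape
    r = x - mean

    # D^{-1} r and D^{-1} U are cheap because D is diagonal.
    Dinv_r = r / diag
    Dinv_U = U / diag[:, None]

    # Small k x k "capacitance" matrix: I + U^T D^{-1} U (symmetric positive definite).
    cap = np.eye(k) + U.T @ Dinv_U
    L = np.linalg.cholesky(cap)

    # Woodbury identity:
    # r^T Sigma^{-1} r = r^T D^{-1} r - v^T cap^{-1} v, where v = U^T D^{-1} r.
    z = np.linalg.solve(L, U.T @ Dinv_r)
    maha = r @ Dinv_r - z @ z

    # Matrix determinant lemma: log det(Sigma) = log det(cap) + sum(log diag).
    logdet = 2.0 * np.sum(np.log(np.diag(L))) + np.sum(np.log(diag))

    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


# Example usage on synthetic data (values are arbitrary).
rng = np.random.default_rng(0)
d, k = 1000, 5
diag = rng.uniform(0.5, 2.0, size=d)
U = rng.normal(size=(d, k))
mean = np.zeros(d)
x = rng.normal(size=d)
logp = low_rank_gaussian_logpdf(x, mean, diag, U)
```

In a mixture model, a density like this would be evaluated once per component and per data point, so avoiding the dense d-by-d inverse is where most of the savings would come from.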
