* Faculty       * Staff       * Contact       * Institute Directory
* Undergraduate       * Graduate       * Institute Admissions: Undergraduate | Graduate      
* Events       * Institute Events      
* Lab Manual       * Institute Computing      
No Menu Selected

* Research

Ph.D. Theses

Application Aspects of Data Mining Analysis on Evolving Graphs

By Anurat Chapanond
Advisor: Mukkai Krishnamoorthy
December 5, 2007

With the recent discovery that many real-world networks have many things in common, and in particular in how such networks evolve, have increased interest in the study of evolution of networks. Adding the dimension of time to the analysis of graphs, evolving graphs present opportunities and challenges for analysts to extract valued information.

In this thesis, the application aspects of evolving graphs are explored. We introduce the Evolving Graph Markup Language (EGML), an XML application for representing evolving graphs and related results. Along with EGML, we provide a software tool for the study of evolving graphs. Metrics, the common tool for graph analysis, are explored. We study classic problems of data mining such as ranking, clustering, and prediction problems and visualization of evolving graphs, specifically the drawing of an evolving graph.

Through our research, we explore techniques that improve the computation time required to calculate metrics for evolving graphs. Further, we develop new techniques that address ranking and clustering problems of evolving graphs, and provide new models and methods for prediction problems of evolving graphs. Finally, we explore new evolving graph drawing techniques.

The results on metric experiments show that using update technique on metrics significantly improve computation time. In ranking problem experiments, utilizing different metrics in ranking algorithm yields different results and PageRank produces the closest rank result to the original rank of Eurovision competition data. In clustering problem experiments, different clustering algorithms perform well on different quality measures. The single linkage clustering produces the highest coverage quality measure, PageRank clustering produces the highest conductance quality measure, and MCL produces the highest performance quality measure. In prediction problem experiments, our methods for each model all predict better than random prediction. And last, our evolving graph drawing techniques reduce the vertex movements between graph instances, so that evolving graph can be viewed with smooth transition.

* Return to main PhD Theses page