Ph.D. Theses

Using Parallel Simulation for Extreme-Scale Network Systems Co-Design

By Misbah Mubarak
Advisor: Christopher D. Carothers
January 20, 2015

A high-bandwidth, low-latency interconnect network is a critical component in the design of future High Performance Computing (HPC) systems. While a number of network topologies have been proposed for future HPC systems, the research community has turned to simulation to find a topology that yields high performance. Among the candidate topologies, one emerging class is the low-diameter, low-latency topologies, such as the dragonfly, that use high-radix routers to yield high bisection bandwidth. Another candidate is the torus network topology, which uses multi-dimensional network links to improve path diversity and exploit locality between nodes. Exploring the design space of these candidate interconnects through simulation, before building real HPC systems, is critical.

In the first part of the thesis, we present a methodology for the detailed, high-fidelity modeling and simulation of very large-scale dragonfly and torus network topologies using the Rensselaer Optimistic Simulation System (ROSS). We evaluate various configurations of a million-node torus network in order to determine the effect of torus dimensionality on network performance under challenging HPC traffic patterns. We also explore a million-node dragonfly network model and investigate the implications of its configurations for network performance using ROSS. We then evaluate the performance of our simulations in order to demonstrate that we are able to efficiently execute large-scale network simulations on today's leadership-class supercomputers such as the Blue Gene/Q systems at Argonne Leadership Computing Facility (ALCF) and RPI's Computational Center for Innovation (CCI). We show that our simulations can achieve an event rate of 1.33 billion events/second with a total of 872 billion committed events on the AMOS Blue Gene/Q system. We validate the accuracy of our torus and dragonfly network models using empirical measurements from Blue Gene supercomputers and simulated results from the cycle-accurate simulator 'booksim', respectively.
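As a rough illustration of why torus dimensionality matters at the million-node scale, the following minimal sketch compares purely topological metrics (diameter and average minimal hop count) for a few torus shapes. It is not the ROSS-based methodology of the thesis, which models traffic and contention in detail; the function names and the chosen shapes (1000x1000, 100x100x100, and a 6-D torus with 10 nodes per dimension) are assumptions made only for this example.

```python
import math


def torus_diameter(dims):
    """Largest minimal hop distance between any two nodes of a torus.

    Each dimension is a ring of k nodes with wraparound links, so the
    farthest node along that ring is k // 2 hops away.
    """
    return sum(k // 2 for k in dims)


def torus_avg_hops(dims):
    """Average minimal hop count from a node to a uniformly random node.

    The destination is drawn from all nodes, including the source itself.
    Distances along each ring add up independently across dimensions.
    """
    total = 0.0
    for k in dims:
        # Mean ring distance over all k offsets, using the shorter direction.
        total += sum(min(d, k - d) for d in range(k)) / k
    return total


# Hypothetical million-node shapes, chosen only for illustration.
configs = {
    "2-D": (1000, 1000),
    "3-D": (100, 100, 100),
    "6-D": (10,) * 6,
}

for name, dims in configs.items():
    nodes = math.prod(dims)
    print(f"{name}: nodes={nodes}, diameter={torus_diameter(dims)}, "
          f"avg hops={torus_avg_hops(dims):.1f}")
```

Under these assumptions, higher dimensionality shortens paths for the same node count, but hop counts alone say nothing about link bandwidth or congestion, which is where detailed simulation is needed.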

MPI collective communication is a frequently used part of most large-scale scientific applications. In the second part of the thesis, we extend our torus and dragonfly network models to simulate MPI collective communication operations using the optimistic event scheduling capability of ROSS. We also demonstrate that both small- and large-scale dragonfly and torus collective models can execute efficiently on today's massively parallel architectures.

In the context of HPC system simulations, an end-to-end simulation tool that can characterize the behavior of large-scale scientific applications on future HPC systems is highly beneficial. The last part of the thesis describes how we have introduced the dragonfly and torus network models as an 'interconnect component' of the 'CO-Design of multi-layer Exascale Storage architectures (CODES)' storage and network system simulator, so that the CODES HPC models can use these high-fidelity networks as their underlying interconnect backbone. Additionally, in order to effectively evaluate the behavior of scientific applications on simulated HPC networks, we have introduced a workload generator component in CODES that uses real scientific application workloads from leadership-class HPC systems as the basis for running the dragonfly and torus network simulations. We also present performance results for the torus and dragonfly network models driven, at a modest scale, by network traces from real applications through the CODES network workload generator.
