Argo: Architecture-aware graph partitioning
Citations over time: top 18% of 2016 papers
Abstract
The increasing popularity and ubiquity of large graph datasets have renewed interest in graph partitioning. Existing graph partitioners either scale poorly to large graphs or disregard the impact of the underlying hardware topology. A few solutions have shown that nonuniform network communication costs can greatly affect performance. However, none of them considers the impact of resource contention on the memory subsystems (e.g., LLC and memory controller) of modern multicore clusters. They all neglect the fact that the bandwidth of modern high-speed networks (e.g., InfiniBand) has become comparable to that of the memory subsystems. In this paper, we provide an in-depth analysis, both theoretical and experimental, of the contention issue for distributed workloads. We find that the slowdown caused by contention can be as high as 11x. We then design an architecture-aware graph partitioner, Argo, that allows full use of all cores of multicore machines without suffering from either the contention or the communication heterogeneity issue. Our experimental study shows (1) the effectiveness of Argo, which achieves up to 12x speedups on three classic workloads: Breadth First Search, Single Source Shortest Path, and PageRank; and (2) the scalability of Argo in terms of both graph size and the number of partitions on two billion-edge real-world graphs.
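To make the idea of communication heterogeneity concrete, the sketch below compares two partitionings that cut the same number of edges but differ once each cut edge is weighted by a per-partition-pair cost. This is an illustration only, not Argo's algorithm: the function names, the toy graph, and the values in `link_cost` are all assumptions. In particular, the abstract argues that intra-machine communication is not necessarily cheap because of memory-subsystem contention, so a realistic matrix could assign higher intra-machine entries than the placeholder values used here.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def edge_cut(edges: List[Edge], assignment: Dict[int, int]) -> int:
    """Count edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def communication_cost(
    edges: List[Edge],
    assignment: Dict[int, int],
    link_cost: List[List[float]],
) -> float:
    """Sum a hypothetical per-pair cost over every cut edge."""
    total = 0.0
    for u, v in edges:
        pu, pv = assignment[u], assignment[v]
        if pu != pv:
            total += link_cost[pu][pv]
    return total


# Toy 4-cycle graph (assumed for illustration).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Placeholder costs: partitions 0 and 1 share a machine,
# partition 2 sits on a remote machine.
link_cost = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 4.0],
    [4.0, 4.0, 0.0],
]

plan_a = {0: 0, 1: 0, 2: 1, 3: 1}  # uses the two co-located partitions
plan_b = {0: 0, 1: 0, 2: 2, 3: 2}  # uses one local and one remote partition

for name, plan in (("plan_a", plan_a), ("plan_b", plan_b)):
    print(name, edge_cut(edges, plan), communication_cost(edges, plan, link_cost))
```

A partitioner that minimizes edge cut alone cannot tell the two plans apart, whereas a cost-weighted objective can; changing the entries of `link_cost` changes which plan is preferred.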