Multi-resource packing for cluster schedulers
Abstract
Tasks in modern data-parallel clusters have highly diverse resource requirements across CPU, memory, disk, and network. Any of these resources may become a bottleneck, and hence the likelihood of wasting resources due to fragmentation is now larger. Today's schedulers do not explicitly reduce fragmentation. Worse, since they only allocate cores and memory, the resources that they ignore (disk and network) can be over-allocated, leading to interference, failures, and hogging of cores or memory that could have been used by other tasks. We present Tetris, a cluster scheduler that packs, i.e., matches multi-resource task requirements with the resource availabilities of machines, so as to increase cluster efficiency (makespan). Further, Tetris uses an analog of shortest-running-time-first to trade off cluster efficiency for speeding up individual jobs. Tetris' packing heuristics work seamlessly alongside a large class of fairness policies. Trace-driven simulations and deployment of our prototype on a 250-node cluster show median gains of 30% in job completion time while achieving nearly perfect fairness.
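The idea of matching multi-resource task requirements against machine availability can be illustrated with a small sketch. The abstract does not spell out the scoring rule, so the capacity-normalized dot product used here is an assumption made for illustration, and every name in the code (`RESOURCES`, `fits`, `alignment`, `pick_task`) is hypothetical rather than taken from the Tetris prototype.

```python
from typing import Dict, List, Optional, Tuple

# Assumed resource dimensions, following the four named in the abstract.
RESOURCES = ("cpu", "mem", "disk", "net")

Vector = Dict[str, float]


def fits(demand: Vector, available: Vector) -> bool:
    """A task fits only if every resource demand is within what is free.

    Checking all dimensions (not just cores and memory) is what prevents
    over-allocation of disk and network.
    """
    return all(demand[r] <= available[r] for r in RESOURCES)


def alignment(demand: Vector, available: Vector, capacity: Vector) -> float:
    """Illustrative packing score: dot product of demand and free resources.

    Both vectors are divided by machine capacity so that resources measured
    in different units contribute on a comparable scale. This scoring rule
    is an assumption of the sketch.
    """
    return sum(
        (demand[r] / capacity[r]) * (available[r] / capacity[r])
        for r in RESOURCES
    )


def pick_task(
    tasks: List[Tuple[str, Vector]], available: Vector, capacity: Vector
) -> Optional[str]:
    """Return the name of the fitting task with the highest score, or None."""
    best_name: Optional[str] = None
    best_score = float("-inf")
    for name, demand in tasks:
        if not fits(demand, available):
            continue  # skip tasks that would over-allocate some resource
        score = alignment(demand, available, capacity)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

Under this scoring, a machine with mostly memory free prefers a memory-heavy task over a small one, which leaves less memory stranded. A fuller scheduler would combine such a score with a remaining-work term and a fairness constraint, as the abstract describes.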