Warp-aware trace scheduling for GPUs
2014, pp. 163–174
Abstract
GPU performance depends not only on thread/warp-level parallelism (TLP) but also on instruction-level parallelism (ILP). It is not enough to schedule instructions within basic blocks; it is also necessary to exploit opportunities for ILP optimization beyond branch boundaries. Unfortunately, modern GPUs cannot dynamically carry out such optimizations because they lack hardware branch prediction and cannot speculatively execute instructions beyond a branch.