A Performance and Energy Consumption Analytical Model for GPU
Citations over time: top 15% of 2011 papers
Abstract
Even with powerful parallel hardware, it is difficult to improve application performance and reduce energy consumption without understanding the performance bottlenecks of parallel programs on GPU architectures. To give programmers better insight into the performance and energy bottlenecks of parallel applications on GPU architectures, we propose two models: an execution time prediction model and an energy consumption prediction model. The execution time prediction model (ETPM) estimates the execution time of massively parallel programs, taking instruction-level and thread-level parallelism into consideration. ETPM contains two components: a memory sub-model and a computation sub-model. The memory sub-model estimates the cost of memory instructions by considering the number of active threads and the GPU memory bandwidth. Correspondingly, the computation sub-model estimates the cost of computation instructions by considering the number of active threads and the application's arithmetic intensity. We use Ocelot to analyze PTX code and obtain several input parameters for the two sub-models, such as the number of memory transactions and the data size. Based on the two sub-models, the analytical model estimates the cost of each instruction while considering instruction-level and thread-level parallelism, and thereby the overall execution time of an application. The energy consumption prediction model (ECPM) estimates the total energy consumption based on the data from ETPM. We compare the models' predictions with actual execution on a GTX260 and a Tesla C2050. The results show that the models reach almost 90% accuracy on average for the benchmarks we used.
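To make the structure described in the abstract concrete, here is a minimal Python sketch of a two-part time estimate feeding an energy estimate. It is not the paper's ETPM or ECPM: the abstract does not give the formulas, so every name and equation below (`GpuSpec`, `KernelProfile`, the occupancy scaling, the `overlap` factor, and the constant average power) is a hypothetical simplification chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuSpec:
    """Hypothetical device parameters (not taken from the paper)."""
    mem_bandwidth_bytes_per_s: float
    peak_instr_per_s: float
    max_active_threads: int
    avg_power_watts: float


@dataclass
class KernelProfile:
    """Hypothetical kernel inputs, of the kind a PTX analysis could supply."""
    mem_transactions: int
    bytes_per_transaction: int
    compute_instructions: int
    active_threads: int


def occupancy(kernel: KernelProfile, gpu: GpuSpec) -> float:
    """Fraction of the device kept busy by the active threads, capped at 1."""
    return min(1.0, kernel.active_threads / gpu.max_active_threads)


def memory_time(kernel: KernelProfile, gpu: GpuSpec) -> float:
    """Assumed memory cost: bytes moved over occupancy-scaled bandwidth."""
    total_bytes = kernel.mem_transactions * kernel.bytes_per_transaction
    return total_bytes / (gpu.mem_bandwidth_bytes_per_s * occupancy(kernel, gpu))


def compute_time(kernel: KernelProfile, gpu: GpuSpec) -> float:
    """Assumed compute cost: instructions over occupancy-scaled throughput."""
    return kernel.compute_instructions / (gpu.peak_instr_per_s * occupancy(kernel, gpu))


def execution_time(kernel: KernelProfile, gpu: GpuSpec, overlap: float = 1.0) -> float:
    """Combine both costs; overlap=1 hides the shorter phase, overlap=0 serializes."""
    t_mem = memory_time(kernel, gpu)
    t_cmp = compute_time(kernel, gpu)
    return max(t_mem, t_cmp) + (1.0 - overlap) * min(t_mem, t_cmp)


def energy_joules(kernel: KernelProfile, gpu: GpuSpec, overlap: float = 1.0) -> float:
    """Assumed energy model: constant average power times predicted time."""
    return gpu.avg_power_watts * execution_time(kernel, gpu, overlap)
```

The `overlap` parameter stands in, very coarsely, for the instruction-level and thread-level parallelism the abstract mentions: with enough parallelism, memory latency can be hidden behind computation, so the predicted time approaches the larger of the two costs rather than their sum.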