Microbenchmarks for determining branch predictor organization
Abstract
To achieve optimum performance of a given application on a given computer platform, a program developer or compiler must be aware of computer architecture parameters, including those related to branch predictors. Although dynamic branch predictors are designed to adapt automatically to changes in branch behavior during program execution, code optimizations based on information about predictor structure can greatly increase overall program performance. Yet exact predictor implementations are seldom made public, even though processor manuals provide valuable optimization tips. This paper presents an experimental flow with a series of microbenchmarks that determine the organization and size of a branch predictor using on-chip performance monitoring registers. Such knowledge can be used either for manual code optimization or for the design of new, more architecture-aware compilers. Three examples illustrate how insight into exact branch predictor organization can be directly applied to code optimization. The proposed experimental flow is illustrated with microbenchmarks tuned for Intel Pentium III and Pentium 4 processors, although they can easily be adapted for other architectures. The described approach can also be used during processor design for performance evaluation of various branch predictor organizations and for testing and validation during implementation. Copyright © 2004 John Wiley & Sons, Ltd.
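The general idea of inferring predictor size from misprediction counts can be illustrated with a small simulation. The sketch below is not one of the paper's microbenchmarks: it replaces real hardware and performance monitoring registers with a hypothetical toy predictor (`ToyPredictor`, a tagless table of 2-bit saturating counters indexed by the low bits of the branch address), and the names `misprediction_rate` and `estimate_entries` are introduced here purely for illustration. It assumes the table size is a power of two and that two branches with opposite outcomes thrash a shared counter when they alias.

```python
class ToyPredictor:
    """Hypothetical model: a tagless table of 2-bit saturating counters."""

    def __init__(self, entries):
        self.entries = entries
        self.counters = [1] * entries  # every entry starts weakly not-taken

    def predict_and_update(self, address, taken):
        """Return True if the prediction was correct, then train the counter."""
        index = address % self.entries  # assumed indexing: low address bits
        predicted_taken = self.counters[index] >= 2
        if taken:
            self.counters[index] = min(3, self.counters[index] + 1)
        else:
            self.counters[index] = max(0, self.counters[index] - 1)
        return predicted_taken == taken


def misprediction_rate(predictor, distance, iterations=1000):
    """Run an always-taken branch at address 0 and a never-taken branch
    at address `distance`, and return the fraction of mispredictions.
    On real hardware this count would come from a performance counter."""
    mispredicted = 0
    for _ in range(iterations):
        if not predictor.predict_and_update(0, True):
            mispredicted += 1
        if not predictor.predict_and_update(distance, False):
            mispredicted += 1
    return mispredicted / (2 * iterations)


def estimate_entries(make_predictor, max_distance=1 << 20):
    """Double the distance between the two branches until they alias.
    The first distance that makes mispredictions jump is the table size."""
    distance = 1
    while distance <= max_distance:
        if misprediction_rate(make_predictor(), distance) > 0.5:
            return distance
        distance *= 2
    return None


if __name__ == "__main__":
    print(estimate_entries(lambda: ToyPredictor(512)))
```

In this model the two branches interfere only when their distance is a multiple of the table size, so the first power-of-two distance with a high misprediction rate reveals the number of entries. Real predictors add tags, associativity, and history-based indexing, which is why an actual experimental flow needs a series of different microbenchmarks rather than a single sweep.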