Microbenchmarks for determining branch predictor organization

Software Practice and Experience2004Vol. 34(5), pp. 465–487

Citations Over TimeTop 15% of 2004 papers

Milena Milenković, Aleksandar Milenković, Jeffrey Kulick

Abstract

Abstract In order to achieve an optimum performance of a given application on a given computer platform, a program developer or compiler must be aware of computer architecture parameters, including those related to branch predictors. Although dynamic branch predictors are designed with the aim of automatically adapting to changes in branch behavior during program execution, code optimizations based on the information about predictor structure can greatly increase overall program performance. Yet, exact predictor implementations are seldom made public, even though processor manuals provide valuable optimization tips. This paper presents an experimental flow with a series of microbenchmarks that determine the organization and size of a branch predictor using on‐chip performance monitoring registers. Such knowledge can be used either for manual code optimization or for design of new, more architecture‐aware compilers. Three examples illustrate how insight into exact branch predictor organization can be directly applied to code optimization. The proposed experimental flow is illustrated with microbenchmarks tuned for Intel Pentium III and Pentium 4 processors, although they can easily be adapted for other architectures. The described approach can also be used during processor design for performance evaluation of various branch predictor organizations and for testing and validation during implementation. Copyright © 2004 John Wiley & Sons, Ltd.

Citations Over TimeTop 15% of 2004 papers

Abstract

Related Papers