Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor
Citations over time: top 1% of 2006 papers
Abstract
Data prefetching via helper threading has been extensively investigated on simultaneous multi-threading (SMT) and virtual multi-threading (VMT) architectures. Although helper threads can reportedly hide large cache latencies at runtime, most techniques rely on hardware support to reduce the context-switch overhead between the main thread and the helper thread, and on static profile feedback to construct the helper thread code. This paper develops a new solution that applies helper threaded prefetching through dynamic optimization on the latest UltraSPARC chip-multiprocessing (CMP) processor. Our experiments show that, by using an otherwise idle processor core, a single user-level helper thread is sufficient to improve the runtime performance of the main thread without triggering multiple thread slices. Moreover, since the cores in the CMP are physically decoupled, the contention introduced by helper threading is minimal. This paper also discusses several key technical challenges of building a lightweight dynamic optimization/software scouting system on the UltraSPARC/Solaris platform.
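The general idea of helper threaded prefetching can be illustrated with a minimal hand-written sketch in C. This is not the paper's system: the paper builds the helper code dynamically, whereas here the helper is written statically, and the names `node_t`, `scout_args_t`, `scout_thread`, and `sum_with_scout` are hypothetical. The sketch assumes POSIX threads, C11 atomics, and the GCC/Clang builtin `__builtin_prefetch`; binding the helper to a specific idle core is omitted.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical linked-list node: pointer chasing is a typical source of cache misses. */
typedef struct node {
    struct node *next;
    long value;
} node_t;

/* Hypothetical argument block shared between the main thread and the helper. */
typedef struct {
    node_t *head;
    atomic_bool stop;   /* set by the main thread once its own traversal is done */
} scout_args_t;

/* Helper ("scout") thread: walks the same list and does no real work.
 * Following n->next pulls each node toward the cache, and the explicit
 * prefetch hint requests the following node early. */
static void *scout_thread(void *p)
{
    scout_args_t *a = p;
    for (node_t *n = a->head; n != NULL; n = n->next) {
        if (atomic_load_explicit(&a->stop, memory_order_relaxed))
            break;                          /* main thread finished; stop scouting */
        __builtin_prefetch(n->next, 0, 3);  /* read access, high temporal locality */
    }
    return NULL;
}

/* Main-thread computation: sums the list while one helper thread runs ahead.
 * If the helper cannot be created, the sum is still computed without it. */
long sum_with_scout(node_t *head)
{
    scout_args_t args = { .head = head };
    atomic_init(&args.stop, false);

    pthread_t tid;
    bool started = (pthread_create(&tid, NULL, scout_thread, &args) == 0);

    long sum = 0;
    for (node_t *n = head; n != NULL; n = n->next)
        sum += n->value;

    if (started) {
        atomic_store(&args.stop, true);
        pthread_join(tid, NULL);
    }
    return sum;
}
```

Calling `sum_with_scout(head)` returns the same result as a plain traversal; the helper only affects timing, never the computed value. Whether it helps at all depends on the cores sharing a cache level and on the helper staying ahead of the main thread, which this sketch does not attempt to guarantee.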