Honours/MIT project proposal:
The Effect of Architectural Variations on the NAS Parallel Benchmarks

Supervisor: Dr Peter Strazdins, with other members of the Sparc-Sulima team.

Simulation is playing an increasingly important role in the design and evaluation of current and future high performance computer systems. Sparc-Sulima is an UltraSPARC SMP full-machine simulator currently being developed by the CC-NUMA Project. SMPs (Symmetric Multiprocessors, or shared memory multiprocessors) are widely used as server machines for demanding scientific and commercial uses; in most situations, the performance (or lack thereof) of the shared memory system is of key interest. In particular, the memory system is NUMA (Non-Uniform Memory Access); that is, the time for a processor to access a piece of data depends on where that data is situated in the memory system. The CC-NUMA Project has a 12-CPU UltraSPARC III system called alcatraz.
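The NUMA property described above can be illustrated with a minimal sketch. The latencies and node numbering are hypothetical values chosen for illustration; they are not measurements of alcatraz or of any real system.

```python
# Toy NUMA model: access time depends on where the data lives.
# LOCAL_NS and REMOTE_NS are hypothetical latencies, not measured values.
LOCAL_NS = 200   # data held in memory attached to the accessing CPU's node
REMOTE_NS = 300  # data held in memory attached to another node


def access_time_ns(cpu_node, data_node):
    """Return the assumed latency of one access from cpu_node to data_node."""
    return LOCAL_NS if cpu_node == data_node else REMOTE_NS


def mean_access_time_ns(cpu_node, data_nodes):
    """Average latency over a sequence of accesses, one home node per access."""
    return sum(access_time_ns(cpu_node, n) for n in data_nodes) / len(data_nodes)


# Same CPU and same number of accesses; only the data placement differs.
all_local = mean_access_time_ns(0, [0, 0, 0, 0])
half_remote = mean_access_time_ns(0, [0, 1, 0, 2])
print(all_local, half_remote)
```

Under these assumed latencies, placing half of the data on other nodes raises the mean access time even though the program itself is unchanged.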

The NAS Parallel Benchmarks (NPB) are a widely used set of benchmarks derived from scientific applications (largely related to computational fluid dynamics). For SMPs, the OMP (Open Multi-Processing) versions of the NPB are of chief interest, as OMP is emerging as a high-level and highly portable programming paradigm for shared memory multiprocessors.

Currently, Sparc-Sulima can accurately simulate an UltraSPARC III Cu system (CPU, caches and memory system (backplane)), and has facilities to count events of interest with respect to program performance (e.g. the number of 2nd-level cache misses). Very recently, some preliminary work on analysing the NPB on alcatraz has been performed by vacation student Yan Zhang; this is of course for the current (actual) UltraSPARC III Cu system. Of greater interest for possible future system design would be to explore architectural variants and evaluate their effect on performance. For example, what effect would doubling the 2nd-level cache size have on the performance of these benchmarks?
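The kind of question posed above can be sketched with a toy cache model. This is not Sparc-Sulima and does not model the UltraSPARC III Cu; the function names, cache sizes, line size, associativity and the synthetic address trace are all assumptions chosen for illustration. It counts misses for one access pattern under a cache of a given size and under a cache of twice that size.

```python
from collections import OrderedDict


def count_misses(addresses, cache_bytes, line_bytes=64, ways=2):
    """Count misses in a toy set-associative cache with LRU replacement."""
    num_sets = cache_bytes // (line_bytes * ways)
    # One OrderedDict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # hit: mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
    return misses


def sweep_trace(working_set_bytes, passes, stride=64):
    """Hypothetical trace: repeated sequential sweeps over one array."""
    return [a for _ in range(passes)
            for a in range(0, working_set_bytes, stride)]


# A 12 KiB working set swept four times; sizes are illustrative only.
trace = sweep_trace(working_set_bytes=12 * 1024, passes=4)
small = count_misses(trace, cache_bytes=8 * 1024)
large = count_misses(trace, cache_bytes=16 * 1024)
print("misses with 8 KiB cache: ", small)
print("misses with 16 KiB cache:", large)
```

In this synthetic trace the working set fits only in the larger cache, so doubling the size leaves just the initial cold misses. A real benchmark would show a less clear-cut effect, which is why a full-machine simulator with event counters is needed to answer the question quantitatively.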

The aim of this project is to propose and quantitatively answer such questions, with the ultimate purpose of identifying the architectural features that are most critical to the performance of these benchmarks. These features include cache configurations, cache coherency protocols, and the configuration of the memory system 'backplane' (particularly those aspects which create NUMA effects). This project is related to the project Extended Threads Emulation in an SMP Computer Simulator. It is part of the CC-NUMA Project.

Last Modified: Peter Strazdins, 20 Jan 2005