Optimisations for the memory hierarchy
of a Singular Value Decomposition Algorithm
implemented on the MIMD Architecture
A. Czezowski and P.E. Strazdins.
Optimisations for the memory hierarchy of a Singular Value
Decomposition Algorithm implemented on the MIMD Architecture.
International Conference on High-Performance Computing and Networking,
Munich, April 1994.
Abstract
The increasing popularity of Singular Value Decomposition
algorithms, which are used in real-time signal processing, demands the rapid
development of fast and reliable implementations. This paper presents
several modifications to the Jacobi-like parallel algorithm for Singular Value
Decomposition (SVD) and their impact on the algorithm's performance. In
particular, the optimisations for the parallel memory hierarchy (register, cache,
main memory and external processor memory levels) can dramatically increase the
performance of the Hestenes SVD algorithm. The central principle in all of
the optimisations presented herein is to increase the number of columns
(column segments) being held in each level of the memory hierarchy. The
algorithm was implemented on the Fujitsu AP1000 Array Multiprocessor, but all
optimisations described can be easily applied to any MIMD architecture
with a mesh or hypercube topology, and all but one can also be applied to
uniprocessors with a register-cache hierarchy.
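
For readers unfamiliar with the Hestenes (one-sided Jacobi) method named above, the following is a minimal serial sketch in Python. It is not the AP1000 implementation described in this paper and contains none of the memory-hierarchy optimisations. The function name `hestenes_singular_values`, the tolerance `tol` and the sweep limit `max_sweeps` are illustrative assumptions. The sketch repeatedly applies plane rotations to pairs of columns until all pairs are numerically orthogonal; the singular values are then the column norms.

```python
import math

def hestenes_singular_values(a, tol=1e-12, max_sweeps=30):
    """Singular values of a (a list of rows), in descending order.

    Illustrative serial sketch only; tol and max_sweeps are assumed defaults.
    """
    m, n = len(a), len(a[0])
    # Store the matrix column by column: every rotation works on two columns.
    cols = [[a[r][c] for r in range(m)] for c in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                x, y = cols[i], cols[j]
                alpha = sum(v * v for v in x)             # squared norm of column i
                beta = sum(v * v for v in y)              # squared norm of column j
                gamma = sum(p * q for p, q in zip(x, y))  # inner product of the pair
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # pair is already orthogonal to working accuracy
                rotated = True
                # Rotation angle that makes the two new columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                cols[i] = [c * p - s * q for p, q in zip(x, y)]
                cols[j] = [s * p + c * q for p, q in zip(x, y)]
        if not rotated:
            break  # a full sweep without rotations: all pairs orthogonal
    return sorted((math.sqrt(sum(v * v for v in col)) for col in cols), reverse=True)
```

In this sketch each rotation reads and writes two complete columns, so the cost of a sweep is dominated by column traffic. That is consistent with the principle stated above of holding more columns (or column segments) at each level of the memory hierarchy.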