Achieving Optimum Performance for Dense Linear Algebra Computations on Parallel Computers

This project aims to extend ScaLAPACK, an extensive parallel dense linear algebra library, using algorithmic blocking (also called the `distributed panels' technique), for increased performance and a simpler user interface. An outline of the project (4 pages, 47KB) is available.

This project was begun in mid-1995 and will continue over 1996-7. Its Chief Investigator is Peter Strazdins. Other personnel include Visiting Fellow Hari Koesmarno (Jul 96 to Feb 97). An intermediate progress report, an `End of Grant Report' (Dec 1997) and a `Final Report' (Dec 1999) are available.

An example of the extended ScaLAPACK code is the LU decomposition driver PDGETRF(), which in turn requires a modified level 2 factorization routine, PDGETF2(). Similarly, LL^T (Cholesky) factorization is provided by PDPOTRF() and PDPOTF2() (note: the handling of non-positive-definiteness has been changed from the original codes).
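
The relationship between a blocked driver and its level 2 panel routine can be illustrated with a small serial sketch. This is not the project's code and not the ScaLAPACK interface: the function names lu_panel_unblocked and lu_blocked and the block size parameter nb are hypothetical, the matrix is a plain list of rows on one process, and no data distribution or algorithmic blocking is modelled. It only shows how a driver in the style of PDGETRF() factors one panel at a time by calling an unblocked routine in the style of PDGETF2(), then updates the rest of the matrix.

```python
def lu_panel_unblocked(a, k0, k1, piv):
    """Unblocked LU with partial pivoting on the panel of columns k0..k1-1.

    Hypothetical serial stand-in for a level 2 panel routine. Works in place
    on rows k0..n-1; row swaps are applied across the full row width.
    """
    n = len(a)
    for j in range(k0, k1):
        # Choose the pivot: largest magnitude entry in column j at or below row j.
        p = max(range(j, n), key=lambda i: abs(a[i][j]))
        piv[j] = p
        if a[p][j] == 0.0:
            raise ValueError("matrix is singular")
        if p != j:
            a[j], a[p] = a[p], a[j]
        # Compute multipliers and update the remaining columns of the panel only.
        for i in range(j + 1, n):
            a[i][j] /= a[j][j]
            for c in range(j + 1, k1):
                a[i][c] -= a[i][j] * a[j][c]


def lu_blocked(a, nb):
    """Right-looking blocked LU driver (hypothetical serial sketch).

    Overwrites a with L (unit lower, below the diagonal) and U (upper),
    and returns the pivot list piv, where row j was swapped with row piv[j].
    """
    n = len(a)
    piv = [0] * n
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # 1. Factor the current panel with the unblocked routine.
        lu_panel_unblocked(a, k0, k1, piv)
        # 2. Triangular solve: rows k0..k1-1 of U to the right of the panel.
        for j in range(k1, n):
            for i in range(k0, k1):
                for r in range(k0, i):
                    a[i][j] -= a[i][r] * a[r][j]
        # 3. Update the trailing submatrix: A22 -= L21 * U12.
        for i in range(k1, n):
            for j in range(k1, n):
                s = 0.0
                for r in range(k0, k1):
                    s += a[i][r] * a[r][j]
                a[i][j] -= s
    return piv
```

With nb equal to the matrix order, the driver reduces to a single call of the unblocked routine; smaller values of nb move most of the arithmetic into the trailing update in step 3, which is the part that maps onto matrix-matrix operations in a real library.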

Related ScaLAPACK papers include LAPACK Working Notes 80, 90, 94, 95, 96 and 100.

Acknowledgements: this project has received funding from Small ARC Grant F96042, the Department of Computer Science at ANU, and the ANU-Fujitsu CAP Project.