Achieving Optimum Performance for
Dense Linear Algebra Computations on Parallel Computers
This project extends ScaLAPACK, the extensive parallel dense linear
algebra library, using algorithmic blocking (also called the
`distributed panels' technique), for increased performance and
a simpler user interface.
An outline of the project (4 pages, 47KB) is available.
This project began in mid-1995 and will continue over 1996-7.
Its Chief Investigator is Peter Strazdins.
Other personnel include Visiting Fellow Hari Koesmarno (Jul 96 to Feb 97).
An intermediate progress report,
an `End of Grant Report' (Dec 1997)
and a `Final Report' (Dec 1999)
are available.
An example of the extended ScaLAPACK code is the
LU decomposition driver PDGETRF(),
which in turn requires a modified level 2 factorization routine,
PDGETF2().
Similarly, for LL^T (Cholesky) factorization, there are
PDPOTRF() and
PDPOTF2() (note: the handling of
non-positive-definiteness has been changed from the original codes).
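
For readers unfamiliar with the driver / level 2 split mentioned above, the
following is a minimal serial sketch in Python. It is not the project's code
and not the ScaLAPACK interface: the names getrf, getf2, potrf and potf2 are
hypothetical stand-ins that only echo the routine names, there is no data
distribution or algorithmic blocking, and the matrices are plain lists of
lists. The failure code returned by potrf follows the LAPACK convention
(index of the first non-positive pivot) as an assumption; it does not
describe the changed handling referred to in the note.

```python
import math

def getf2(a, c0, c1, piv):
    """Level 2 (unblocked) LU with partial pivoting on panel columns c0..c1-1."""
    n = len(a)
    for j in range(c0, c1):
        p = max(range(j, n), key=lambda i: abs(a[i][j]))  # pivot row
        if a[p][j] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        piv[j] = p
        a[j], a[p] = a[p], a[j]          # interchange whole rows
        for i in range(j + 1, n):
            a[i][j] /= a[j][j]           # multiplier (entry of L)
            for k in range(j + 1, c1):   # update rest of the panel only
                a[i][k] -= a[i][j] * a[j][k]

def getrf(a, nb):
    """Blocked right-looking LU driver (hypothetical); overwrites a with L and U."""
    n = len(a)
    piv = list(range(n))
    for c0 in range(0, n, nb):
        c1 = min(c0 + nb, n)
        getf2(a, c0, c1, piv)            # factor the current panel
        for j in range(c1, n):           # U12 = inv(L11) * A12
            for i in range(c0, c1):
                for k in range(c0, i):
                    a[i][j] -= a[i][k] * a[k][j]
        for i in range(c1, n):           # A22 = A22 - L21 * U12
            for j in range(c1, n):
                for k in range(c0, c1):
                    a[i][j] -= a[i][k] * a[k][j]
    return piv

def potf2(a, c0, c1):
    """Level 2 Cholesky of the diagonal block; 0 on success, else 1-based pivot index."""
    for j in range(c0, c1):
        d = a[j][j] - sum(a[j][k] ** 2 for k in range(c0, j))
        if d <= 0.0:
            return j + 1                 # assumed LAPACK-style failure code
        a[j][j] = math.sqrt(d)
        for i in range(j + 1, c1):
            s = sum(a[i][k] * a[j][k] for k in range(c0, j))
            a[i][j] = (a[i][j] - s) / a[j][j]
    return 0

def potrf(a, nb):
    """Blocked right-looking Cholesky driver (hypothetical); lower triangle gets L."""
    n = len(a)
    for c0 in range(0, n, nb):
        c1 = min(c0 + nb, n)
        info = potf2(a, c0, c1)
        if info:
            return info
        for i in range(c1, n):           # L21 = A21 * inv(L11^T)
            for j in range(c0, c1):
                s = sum(a[i][k] * a[j][k] for k in range(c0, j))
                a[i][j] = (a[i][j] - s) / a[j][j]
        for i in range(c1, n):           # A22 = A22 - L21 * L21^T (lower part)
            for j in range(c1, i + 1):
                a[i][j] -= sum(a[i][k] * a[j][k] for k in range(c0, c1))
    return 0
```

In both drivers the block size nb only changes how the work is grouped, not
the computed factors (up to rounding), which is why the level 2 routine can
be swapped or modified independently of the driver.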
Related ScaLAPACK papers include the
LAPACK Working Notes 80, 90, 94, 95, 96, 100.
Acknowledgements:
this project has received funding from Small ARC Grant F96042,
the Department of Computer Science at ANU, and
the ANU-Fujitsu CAP Project.