It has been released to the Fujitsu Parallel Computing Research Facilities (FPCRF) in three editions:

- First Edition: May 1993 (scattered or 1 x 1 block-cyclic decomposition, single and double precision matrices, AP1000 only)
- Second Edition: January 1994 (update on the First Edition)
- Third Edition: May 1997 (block-cyclic matrix distribution, double precision matrices, portable)
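
The difference between the scattered (1 x 1 block-cyclic) and general block-cyclic decompositions named above can be illustrated with a small index-mapping sketch for one matrix dimension. The type `bc_loc` and the function `bc_locate` are hypothetical names chosen for illustration and are not part of the DBLAS interface; the sketch also assumes that the first block is held by process 0.

```c
#include <assert.h>

/* Hypothetical helper types and names; not taken from the DBLAS. */
typedef struct {
    int owner;  /* index of the process (cell) holding the element */
    int local;  /* index of the element within that process        */
} bc_loc;

/*
 * Map global index g (0-based) of one matrix dimension to its owner
 * and local index under a block-cyclic distribution with block size
 * nb over nprocs processes. Assumes block 0 lives on process 0.
 * With nb == 1 this reduces to the scattered decomposition.
 */
bc_loc bc_locate(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);

    int block = g / nb;          /* global block containing g        */
    bc_loc loc;
    loc.owner = block % nprocs;  /* blocks are dealt out round-robin */
    loc.local = (block / nprocs) * nb + g % nb;
    return loc;
}
```

For example, with 4 processes, global row 5 falls on process 1 when `nb` is 1 (scattered) and on process 2 when `nb` is 2.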

Several papers related to the DBLAS are available.

For the Third Edition, the documentation includes a Guide To Function (11 pages, 67 KB) and a User's Guide (14 pages, 72 KB). This edition has the following features:

- general parallel algorithms for block-cyclic matrix distributions, which are efficient for all ranges of matrix sizes, minimizing communication startup overhead and volume while maximizing load balance and cell computation performance. The exceptions are that there are some block alignment requirements, and that the symmetric matrix routines are not optimal when the symmetric matrix is much smaller than the other matrix operands.
- a C language interface, and also a Fortran language interface, which is identical to that of the ScaLAPACK Parallel BLAS (PBLAS)
- can be compiled to interface to the BLAS and BLACS (Basic Linear Algebra Communication Subroutines) libraries, enabling full portability.
- relatively small code size and software overheads.
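
The trade-off between load balance and block size in a block-cyclic distribution can be made concrete by counting how many rows (or columns) each process holds. The following is a minimal sketch under stated assumptions: `bc_local_count` is a hypothetical name, not a DBLAS routine, and block 0 is assumed to live on process 0.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the DBLAS): number of elements of
 * an n-element dimension held by process p, for block size nb and
 * nprocs processes, assuming block 0 lives on process 0.
 */
int bc_local_count(int n, int nb, int p, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0 && p >= 0 && p < nprocs);

    int nblocks = n / nb;                 /* number of full blocks      */
    int count = (nblocks / nprocs) * nb;  /* complete rounds of dealing */
    int extra = nblocks % nprocs;         /* full blocks left over      */

    if (p < extra)
        count += nb;       /* p receives one more full block        */
    else if (p == extra)
        count += n % nb;   /* p receives the trailing partial block */
    return count;
}
```

With `n = 10` over 2 processes, a block size of 4 gives 6 and 4 elements per process, while a block size of 1 gives 5 and 5: smaller blocks even out the load, at the cost of smaller local blocks to compute on.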

A draft paper (last updated Feb 17 1998; includes LLT and AP+ results) comparing the performance of the DBLAS with the 1996 and 1997 versions of the UTK PBLAS is available (9 pages, 86 KB). It is also available in slide form (10 pages, 112 KB).

Any questions or problems should be directed to Peter.Strazdins@cs.anu.edu.au. Reports of results or feedback from (successful) installations of the DBLAS, sent to the same address, would also be appreciated.