Response #1: 2005-08-23
1. Is dense linear algebra a performance bottleneck in your applications?
2. How often do your applications use the arithmetic precisions listed below:
|a. Single precision: || Rarely|
|b. Double precision: || Very Frequently|
|c. More than double precision: || Very Frequently|
|d. Complex single precision: || Never|
|e. Complex double precision: || Frequently|
|f. Complex, more than double precision: || Frequently|
3. What dense matrix sizes are most important or time-consuming for your application?
100s X 100s
4. Does your application come close to, or run out of memory on important problems?
5. Number of processors used for your application:
|a. SMP: ||Less than 10|
|b. Distributed shared-memory: ||Less than 10|
|c. Distributed memory: ||Less than 10|
6. Which architectures do you use or intend to use in the next three years?
Sequential, Multi-core-thread, Symmetric-multi-procs
7. Do you use any other sequential or parallel dense linear algebra packages other than LAPACK or ScaLAPACK?
ATLAS, in-house modular LA packages for rational (not float) problems
8. Please rank how the following features would be useful to your current or planned applications?
|a. User defined matrix types: ||Very useful|
|b. Using optional arguments in the language interface: ||Somewhat useful|
|c. Automatic memory allocation of the work space: ||Somewhat useful|
|d. More complicated matrix data structures: ||Very useful|
9. Do your applications solve linear algebra problems of the following types?
Linear positive definite systems, Banded linear systems, General linear systems, Generalized eigenvalue, Symmetric eigenvalue, Non-symmetric eigenvalue, Least-square problems, SVDs
1. Do you use LAPACK (or a vendor version of LAPACK)?
4. If you use LAPACK, do you use a vendor's version or one obtained directly from Netlib?
6. Do your applications make direct LAPACK calls?
7. Do your applications use libraries which depend on LAPACK?
8. Do your applications use a higher-level interface to LAPACK?
9. If you answered yes above, which higher-level interfaces do you use?
in-house custom written
10. Is the LAPACK procedure interface a barrier to more extensive use?
11. From which languages do you call LAPACK routines?
13. How could the LAPACK interface be improved to feel more natural to your application and implementation language?
It would be nicer to have actual routines that return the accuracy of eigenproblem results as error bounds from routines such as zgeevx (which calculate eigen-condition numbers). Right now the LUG shows code fragments (e.g. lug/node100.html) for obtaining such accuracy results from the working precision and the computed condition numbers. It would be nicer still to have these code fragments actually implemented in some routine -- in the driver, say, or elsewhere.
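For concreteness, roughly the kind of fragment meant here, written out as a sketch only: it assumes a LAPACKE-style C++ interface, an illustrative test matrix, and follows the LUG recipe of eps * abnrm / rconde(i) and eps * abnrm / rcondv(i) for the eigenvalue and eigenvector bounds.

    // Sketch: error bounds for eigenvalues/eigenvectors from the condition
    // numbers returned by zgeevx, following the LAPACK Users' Guide recipe.
    // LAPACKE-style interface assumed; the test matrix is illustrative only.
    #include <complex>
    #include <cfloat>
    #include <cstdio>
    #include <vector>
    #include <lapacke.h>

    int main() {
        const lapack_int n = 4;
        std::vector<std::complex<double>> a(n * n);
        for (lapack_int j = 0; j < n; ++j)
            for (lapack_int i = 0; i < n; ++i)
                a[i + j * n] = std::complex<double>(1.0 / (1.0 + i + j), i == j ? 1.0 : 0.0);

        std::vector<std::complex<double>> w(n), vl(n * n), vr(n * n);
        std::vector<double> scale(n), rconde(n), rcondv(n);
        lapack_int ilo, ihi;
        double abnrm;

        // SENSE='B' requests condition numbers for both eigenvalues and eigenvectors,
        // which requires computing both left and right eigenvectors.
        lapack_int info = LAPACKE_zgeevx(
            LAPACK_COL_MAJOR, 'B', 'V', 'V', 'B', n,
            reinterpret_cast<lapack_complex_double*>(a.data()), n,
            reinterpret_cast<lapack_complex_double*>(w.data()),
            reinterpret_cast<lapack_complex_double*>(vl.data()), n,
            reinterpret_cast<lapack_complex_double*>(vr.data()), n,
            &ilo, &ihi, scale.data(), &abnrm, rconde.data(), rcondv.data());
        if (info != 0) return 1;

        const double eps = DBL_EPSILON;   // roughly the working precision (dlamch('E'))
        for (lapack_int i = 0; i < n; ++i) {
            double eerrbd = eps * abnrm / rconde[i];  // bound on the eigenvalue error
            double verrbd = eps * abnrm / rcondv[i];  // bound on the angle to the true eigenvector
            std::printf("lambda %d: (%g, %g)  eig bound %g  vec bound %g\n",
                        (int)i, w[i].real(), w[i].imag(), eerrbd, verrbd);
        }
        return 0;
    }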
14. If you have installed LAPACK yourself, how could the installation process be improved?
Export (.exp) files for use with MSVC++ on Windows would be nice to have, though this is a minor issue.
15. How frequently do you refer to the LAPACK Users Guide?
16. What information in the LAPACK guide is hard to find or is missing, if any?
It's not clear from the section on storage schemes how non-square packed storage works (the examples are all square). One has to figure out how it might work, e.g. for non-square packed band storage.
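For what it's worth, the general band storage scheme does extend to non-square matrices column by column; a small sketch of the mapping (0-based indices, column-major; the function name is mine, and factorization routines such as dgbtrf need kl extra rows of workspace beyond this):

    // Sketch: LAPACK general band storage for an m x n matrix with kl
    // subdiagonals and ku superdiagonals.  Column j of A goes into column j
    // of AB, with AB(ku + i - j, j) = A(i, j) for max(0, j-ku) <= i <= min(m-1, j+kl)
    // and ldab >= kl + ku + 1.
    #include <vector>
    #include <algorithm>
    #include <cstddef>

    std::vector<double> to_band_storage(const std::vector<double>& a,
                                        int m, int n, int kl, int ku) {
        const int ldab = kl + ku + 1;
        std::vector<double> ab(static_cast<std::size_t>(ldab) * n, 0.0);
        for (int j = 0; j < n; ++j) {
            const int ilo = std::max(0, j - ku);
            const int ihi = std::min(m - 1, j + kl);
            for (int i = ilo; i <= ihi; ++i)
                ab[(ku + i - j) + static_cast<std::size_t>(j) * ldab] =
                    a[i + static_cast<std::size_t>(j) * m];
        }
        return ab;
    }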
The use of m, n, k, etc. in the top-level SVD routines is slightly confusing. It's not immediately clear how to call them to produce a "non-full-span" U (for U*S*V^T = A, say) so as to be most efficient when solving least squares with m >> n.
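For reference, the 'S' options of the SVD driver do give the thin (economy-size) factors; a rough sketch, assuming a LAPACKE-style interface and placeholder data:

    // Sketch: economy-size SVD A = U*S*V^T with U of size m x min(m,n),
    // the cheap shape when m >> n.  LAPACKE-style interface assumed.
    #include <vector>
    #include <algorithm>
    #include <lapacke.h>

    int main() {
        const lapack_int m = 1000, n = 5, k = std::min(m, n);
        std::vector<double> a(static_cast<std::size_t>(m) * n, 1.0);   // placeholder data
        std::vector<double> s(k);
        std::vector<double> u(static_cast<std::size_t>(m) * k);
        std::vector<double> vt(static_cast<std::size_t>(k) * n);
        std::vector<double> superb(k);   // workspace for intermediate results

        // JOBU='S': only the first k = min(m,n) columns of U are formed.
        // JOBVT='S': only the first k rows of V^T are formed.
        lapack_int info = LAPACKE_dgesvd(LAPACK_COL_MAJOR, 'S', 'S', m, n,
                                         a.data(), m, s.data(),
                                         u.data(), m, vt.data(), k,
                                         superb.data());
        return info == 0 ? 0 : 1;
    }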
In the absence of information in the docs about whether the QR implementations (with or without column pivoting) are rank-revealing, it seems prudent to always fall back to the SVD for least-squares problems which might be rank-deficient. It might be useful if the docs said more on this, but that's asking for more mathematical education in the docs, which I realize is a big request.
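For the possibly rank-deficient case, the SVD-based least-squares driver is one such fallback; a minimal sketch, again assuming a LAPACKE-style interface, with illustrative data and an illustrative rcond threshold:

    // Sketch: least squares min ||A*x - b|| for a possibly rank-deficient A
    // via the SVD-based driver dgelsd.  Singular values below rcond * s(1)
    // are treated as zero when determining the effective rank.
    #include <vector>
    #include <algorithm>
    #include <lapacke.h>

    int main() {
        const lapack_int m = 6, n = 4, nrhs = 1;
        std::vector<double> a(static_cast<std::size_t>(m) * n, 1.0);  // rank-1 placeholder: all ones
        std::vector<double> b(std::max(m, n), 1.0);                   // right-hand side, length max(m,n)
        std::vector<double> s(std::min(m, n));
        lapack_int rank = 0;
        const double rcond = 1e-12;   // relative threshold for the effective rank

        lapack_int info = LAPACKE_dgelsd(LAPACK_COL_MAJOR, m, n, nrhs,
                                         a.data(), m, b.data(), std::max(m, n),
                                         s.data(), rcond, &rank);
        // On success the first n entries of b hold the minimum-norm solution,
        // s holds the singular values, and rank is the effective rank.
        return info == 0 ? 0 : 1;
    }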
Links in the LUG on netlib from instances of LAPACK function names to the corresponding source locations on netlib would be very useful, since the specification and calling-sequence information is in the comments of the individual routines' source files.
NB. The LUG section Specifications of Routines (lug/node149.html), as it appears on netlib at least, is empty except for a brief note. Thus the individual routines' source comments appear to be the specs.
Details of the encoding of the results of the Bunch-Kaufman-Parlett decomposition of symmetric indefinite matrices seem to be missing from the guide. This makes it difficult to extract and form the individual factors computed by, say, dsytrf. I realize that doing so is inefficient, unwise, and unnecessary when solving systems, etc., but sometimes users just really want to get their hands on the explicit matrix factors and are willing to lose the efficient encoding which dsycon, dsytrs, etc. understand. Using the details as supplied in the comments of, say, dsytrf's source is involved.
The approximation of condition numbers (Higham's modification of Hager's method) can be inaccurate. The documentation isn't very clear on how inaccurate it might be.
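For reference, the estimator in question is what dgecon returns after an LU factorization; a short sketch, assuming a LAPACKE-style interface and an illustrative test matrix:

    // Sketch: reciprocal 1-norm condition number estimate for a general
    // matrix, i.e. the Hager/Higham-style estimator discussed above.
    // LAPACKE-style interface assumed; the matrix is illustrative.
    #include <vector>
    #include <cstdio>
    #include <lapacke.h>

    int main() {
        const lapack_int n = 4;
        std::vector<double> a(static_cast<std::size_t>(n) * n);
        for (lapack_int j = 0; j < n; ++j)
            for (lapack_int i = 0; i < n; ++i)
                a[i + j * n] = 1.0 / (1.0 + i + j);   // Hilbert-like, ill conditioned
        std::vector<lapack_int> ipiv(n);

        // The 1-norm of A must be taken before the factorization overwrites A.
        double anorm = LAPACKE_dlange(LAPACK_COL_MAJOR, '1', n, n, a.data(), n);
        if (LAPACKE_dgetrf(LAPACK_COL_MAJOR, n, n, a.data(), n, ipiv.data()) != 0) return 1;

        double rcond = 0.0;   // estimate of 1 / (||A||_1 * ||A^{-1}||_1)
        if (LAPACKE_dgecon(LAPACK_COL_MAJOR, '1', n, a.data(), n, anorm, &rcond) != 0) return 1;
        std::printf("estimated 1-norm condition number ~ %g\n", 1.0 / rcond);
        return 0;
    }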
1. Do you use ScaLAPACK (or a vendor version of ScaLAPACK)?
2. If you do not use ScaLAPACK, why?
Cost of learning
15. How frequently do you refer to the ScaLAPACK Users Guide?
Targeted Environment Specifics
1. Under which operating system environments do your applications run?
AIX, HP/UX, IRIX, Linux, Mac OS X, Solaris, Tru64, Windows (cygwin), Windows (other), Linux on all of (x86-64 x86-32 IA-64)
2. If your applications run in a shared-memory environment, which styles of parallelism do they employ?
LAPACK is excellent software.
We also use a multiprecision build of CLAPACK, built in C++ using a customized class standing in for 'double', with overloaded arithmetic operators and runtime-determined substitutes for the LAPACK machine-constant routines. Building this requires some "clean up" of the CLAPACK sources, which makes getting updates/fixes more involved. It would be great if this could be done more easily, without any need for minor editing, e.g. through use of temporary variables and no calls involving explicit float arguments like foo(...,1.0,...) or comparisons like bar > 1.0. The ability for anyone to easily build a quad- or multiprecision (GnuMP-based, say) version of LAPACK might be generally well received. It's not immediately clear that all LAPACK routines (e.g. SVD) would remain robust (e.g. convergence) when treated in this way. Of course, multiprecision LAPACK brings with it many involved questions about how best to leverage double-precision solutions as initial candidate solutions for arbitrary high-precision computation.
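As an illustration of the kind of 'double' substitute meant here, a bare-bones and purely hypothetical sketch; a real build would wrap an arbitrary-precision type (GMP/MPFR via a C++ wrapper, say) and expose its constants through runtime replacements for dlamch and friends:

    // Hypothetical sketch of a drop-in replacement for 'double' in a
    // translated CLAPACK source tree.  The internals here are just a plain
    // double; a real version would hold an arbitrary-precision value and
    // report its epsilon/overflow/underflow through dlamch-style substitutes.
    #include <cmath>

    class mpreal_stub {
    public:
        mpreal_stub(double v = 0.0) : v_(v) {}   // implicit, so numeric literals still work

        mpreal_stub operator+(const mpreal_stub& o) const { return mpreal_stub(v_ + o.v_); }
        mpreal_stub operator-(const mpreal_stub& o) const { return mpreal_stub(v_ - o.v_); }
        mpreal_stub operator*(const mpreal_stub& o) const { return mpreal_stub(v_ * o.v_); }
        mpreal_stub operator/(const mpreal_stub& o) const { return mpreal_stub(v_ / o.v_); }
        mpreal_stub operator-() const { return mpreal_stub(-v_); }
        bool operator>(const mpreal_stub& o) const { return v_ > o.v_; }
        bool operator<(const mpreal_stub& o) const { return v_ < o.v_; }
        bool operator==(const mpreal_stub& o) const { return v_ == o.v_; }

        friend mpreal_stub sqrt(const mpreal_stub& x) { return mpreal_stub(std::sqrt(x.v_)); }
        friend mpreal_stub fabs(const mpreal_stub& x) { return mpreal_stub(std::fabs(x.v_)); }

        // Runtime-determined machine constant, standing in for dlamch('E').
        static mpreal_stub epsilon() { return mpreal_stub(2.220446049250313e-16); }

    private:
        double v_;
    };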
|7. Use DOE-lab resources||No|