Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and the DPLASMA and StarPU Schedulers, by:
Stanimire Tomov, George Bosilca, and Cédric Augonnet
Learn how to develop numerical software for heterogeneous architectures of multicore and GPUs through a hybridization methodology built on:
- Representing algorithms as collections of tasks and data dependencies, and
- Properly scheduling the tasks' execution over the available multicore and GPU hardware components.
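The first of these two ideas can be made concrete with a small sketch. The following is a hypothetical illustration (not the MAGMA, DPLASMA, or StarPU API): an algorithm is expressed as a set of named tasks, each listing the tasks whose outputs it consumes, and a topological sort recovers a valid execution order. The task names mimic tile-factorization kernels (POTRF, TRSM, SYRK) purely as an example.

```python
# Sketch: an algorithm as a collection of tasks plus data dependencies.
# Hypothetical task names styled after tile Cholesky kernels; this is an
# illustration of the representation, not any library's actual interface.
from graphlib import TopologicalSorter

# Each task maps to the list of tasks it depends on.
deps = {
    "POTRF(0)":  [],
    "TRSM(1,0)": ["POTRF(0)"],
    "SYRK(1)":   ["TRSM(1,0)"],
    "POTRF(1)":  ["SYRK(1)"],
}

# A runtime scheduler is free to execute tasks in any order consistent
# with the dependencies; static_order() produces one such order.
order = list(TopologicalSorter(deps).static_order())

# Check the defining property: every task runs after all its dependencies.
for task in order:
    assert all(order.index(d) < order.index(task) for d in deps[task])
print(order)
```

In a real runtime the dependencies are typically inferred automatically from each task's declared data accesses (read, write, read-write) rather than listed by hand, which is what lets the scheduler overlap independent tasks across CPU cores and GPUs.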
Examples will be given from the Matrix Algebra on GPU and Multicore Architectures (MAGMA) project, which aims to develop a new generation of linear algebra libraries that extend the sequential LAPACK-style algorithms to highly parallel GPU and multicore heterogeneous architectures. Besides stand-alone hybrid algorithms, MAGMA also provides hybrid kernels to be used as building blocks in tile and "communication-avoiding" algorithms, which must be efficiently scheduled. You will learn how to use dynamic schedulers to express these new algorithms easily while fully utilizing, and extracting high performance from, heterogeneous systems of multicore and GPUs. In particular, we will consider the DPLASMA and StarPU schedulers.
DPLASMA is related to the Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) project but extends its operation to the distributed-memory regime, while StarPU is a runtime system specialized in scheduling tasks onto accelerator-based platforms.
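To illustrate the scheduling side of the methodology, here is a hedged sketch (again hypothetical, not the StarPU API): each task carries an estimated cost per device type, and a greedy dispatcher sends it to whichever device would finish it earliest. This captures the intuition behind hybrid execution, where large data-parallel kernels land on the GPU while small latency-bound tasks stay on the CPU; real runtimes such as StarPU use far more sophisticated performance models and data-transfer awareness.

```python
# Hypothetical greedy earliest-finish-time dispatcher over heterogeneous
# devices. Costs are illustrative, not measured.
def schedule(tasks, free_at):
    """tasks: list of (name, {device: cost}) pairs.
    free_at: dict mapping each device to the time it becomes free.
    Returns a dict mapping each task name to its chosen device."""
    placement = {}
    for name, costs in tasks:
        # Pick the device on which this task would finish earliest.
        dev = min(costs, key=lambda d: free_at[d] + costs[d])
        free_at[dev] += costs[dev]
        placement[name] = dev
    return placement

# Illustrative costs: a large GEMM is much faster on the GPU; a small
# panel factorization is cheaper on the CPU.
tasks = [("large GEMM", {"cpu": 8.0, "gpu": 1.0}),
         ("small panel", {"cpu": 1.0, "gpu": 2.0})]
print(schedule(tasks, {"cpu": 0.0, "gpu": 0.0}))
# → {'large GEMM': 'gpu', 'small panel': 'cpu'}
```

The design point this sketch makes is that the scheduler, not the algorithm, decides the mapping of tasks to hardware, which is what lets the same task-based code run efficiently across different mixes of CPUs and GPUs.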