OpenPOWER GPU-enabled architecture performance enhancement using the Engineering and Scientific Subroutine Library (ESSL) drop-in acceleration
This article illustrates a method for offloading part of an application's computation to the GPU without refactoring the application. The Crossroads/NERSC-9 Memory Bandwidth benchmark is used to showcase offloading dense matrix multiplication (DGEMM) to the GPU simply by linking the application, at build time, against a newer, CUDA-enabled version of IBM's Engineering and Scientific Subroutine Library (ESSL). Using the CUDA-enabled ESSL gives approximately a sixfold performance gain over the CPU-only code.
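For readers unfamiliar with the routine being offloaded, DGEMM is the standard BLAS operation C = alpha*A*B + beta*C on double-precision matrices. The following is a minimal pure-Python sketch of that operation, together with the usual way a speedup figure is derived from timings. It is an illustration only, not the ESSL implementation or the benchmark's code; the function names (dgemm, dgemm_flops, gflops, speedup) are hypothetical, and the transpose options of the real BLAS interface are left out.

```python
def dgemm(alpha, a, b, beta, c):
    """Reference DGEMM on lists of lists: returns alpha * (a @ b) + beta * c.

    a is m x k, b is k x n, c is m x n. Illustrative only: a tuned
    library such as ESSL performs the same arithmetic far faster.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("a must have as many columns as b has rows")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


def dgemm_flops(m, n, k):
    """Conventional DGEMM operation count: one multiply and one add per term."""
    return 2 * m * n * k


def gflops(m, n, k, seconds):
    """Achieved rate in GFLOP/s for an m x n x k DGEMM that took `seconds`."""
    return dgemm_flops(m, n, k) / seconds / 1e9


def speedup(baseline_seconds, accelerated_seconds):
    """Ratio of CPU-only time to GPU-offloaded time for the same problem."""
    return baseline_seconds / accelerated_seconds
```

Because the application only ever calls the DGEMM entry point, swapping the library that provides it changes where the arithmetic runs without changing the calling code, which is why no refactoring is needed.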