CUDA 4.0 Library Performance Overview

The performance of many math functions has improved with the release of the CUDA 4.0 Toolkit.

This presentation includes performance results for many of the key functions.

Results include performance measurements for:

  • cuFFT – Fast Fourier Transforms Library
  • cuBLAS – Complete BLAS Library
  • cuSPARSE – Sparse Matrix Library
  • cuRAND – Random Number Generation (RNG) Library
  • NPP – Performance Primitives for Image & Video Processing
  • Thrust – Templated Parallel Algorithms & Data Structures
  • math.h – C99 Floating-Point Library