
C/C++ Getting Started

5 Labs · 41 Credits · 5h 49m


To get a basic understanding of the main approaches to GPU compute programming using C/C++

Introduction to Accelerated Computing

Learn about the three techniques for accelerating code on a GPU: libraries, directives such as OpenACC, and writing code directly in CUDA-enabled languages. In 45 minutes, you will work through a few different exercises demonstrating the potential speed-ups and the ease of porting code to the GPU.
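For a taste of how the three techniques differ, here is a minimal sketch (not taken from the lab itself) that applies the same SAXPY operation with a drop-in cuBLAS call and with a hand-written CUDA kernel, and shows the equivalent OpenACC directive in a comment. It assumes a CUDA 6+ toolkit and a GPU that supports Unified Memory; the file name and sizes are arbitrary.

```cpp
// saxpy_three_ways.cu -- illustrative sketch only; not the lab's own code.
// Build: nvcc saxpy_three_ways.cu -lcublas -o saxpy
#include <cstdio>
#include <cublas_v2.h>
#include <cuda_runtime.h>

// Technique 3: a hand-written CUDA kernel.
__global__ void saxpy_kernel(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Technique 2, for comparison (built with an OpenACC compiler such as nvc, not nvcc):
//     #pragma acc parallel loop
//     for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];

int main() {
    const int n = 1 << 20;
    const float a = 2.0f;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));   // Unified Memory: visible to CPU and GPU
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Technique 1: a drop-in library call (cuBLAS SAXPY), y = a*x + y.
    cublasHandle_t handle;
    cublasCreate(&handle);
    cublasSaxpy(handle, n, &a, x, 1, y, 1);
    cublasDestroy(handle);

    // Technique 3: the hand-written kernel applied to the same data.
    saxpy_kernel<<<(n + 255) / 256, 256>>>(n, a, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f (expect 6.0)\n", y[0]);   // 2*1 + (2*1 + 2) = 6
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```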

Introductory · Free · 45 Minutes

Accelerating Applications with GPU-Accelerated Libraries in C/C++

Learn how to accelerate your C/C++ application using drop-in libraries to harness the massively parallel power of NVIDIA GPUs. In about two hours, you will work through three exercises, including:

  • Use cuBLAS to accelerate a basic matrix multiply
  • Combine libraries by adding some cuRAND API calls to the previous cuBLAS calls
  • Use nvprof to profile code and optimize with some CUDA Runtime API calls
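As a rough sketch of how these pieces fit together (not the lab's actual code), the program below fills two matrices with cuRAND, multiplies them with cuBLAS SGEMM, and can be profiled with nvprof; the matrix size and seed are arbitrary choices.

```cpp
// matmul_libs.cu -- illustrative sketch only; not the lab's own code.
// Build:   nvcc matmul_libs.cu -lcublas -lcurand -o matmul
// Profile: nvprof ./matmul
#include <cstdio>
#include <cublas_v2.h>
#include <curand.h>
#include <cuda_runtime.h>

int main() {
    const int n = 1024;                              // square matrices, chosen arbitrarily
    const size_t bytes = (size_t)n * n * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);

    // cuRAND: fill A and B with uniform random numbers directly on the device.
    curandGenerator_t gen;
    curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT);
    curandSetPseudoRandomGeneratorSeed(gen, 1234ULL);
    curandGenerateUniform(gen, dA, (size_t)n * n);
    curandGenerateUniform(gen, dB, (size_t)n * n);

    // cuBLAS: C = alpha * A * B + beta * C (column-major SGEMM).
    cublasHandle_t handle;
    cublasCreate(&handle);
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);
    cudaDeviceSynchronize();

    // Copy one element back just to confirm the result on the host side.
    float c00;
    cudaMemcpy(&c00, dC, sizeof(float), cudaMemcpyDeviceToHost);
    printf("C[0][0] = %f\n", c00);

    curandDestroyGenerator(gen);
    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```

Profiling a program like this with nvprof shows where the time goes (random-number generation, the GEMM kernel, memory copies), which is the natural starting point for the optimization exercise.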

Advanced · 10 Credits · 1 Hour

OpenACC - 2X in 4 Steps

Learn how to accelerate your C/C++ or Fortran application using OpenACC to harness the massively parallel power of NVIDIA GPUs. OpenACC is a directive-based approach to computing in which you provide compiler hints to accelerate your code, instead of writing the accelerator code yourself. In 90 minutes, you will experience a four-step process for accelerating applications using OpenACC:

  1. Characterize and profile your application
  2. Add compute directives
  3. Add directives to optimize data movement
  4. Optimize your application using kernel scheduling
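The sketch below is only meant to show what steps 2 and 3 look like in source code; the lab's own application and loops will differ, and step 1 (profiling) and step 4 (tuning the schedule) are mentioned only in comments.

```c
/* acc_steps.c -- illustrative OpenACC directives, not the lab's actual application.
 * Build with an OpenACC compiler, e.g.: nvc -acc -Minfo=accel acc_steps.c
 * Step 1 (characterize and profile) happens outside the code, with a profiler. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const int n = 1 << 20, iters = 100;
    float *x = malloc(n * sizeof(float));
    float *y = malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    /* Step 3: a data region keeps x and y on the GPU across all iterations,
     * instead of copying them back and forth for every parallel loop. */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    for (int it = 0; it < iters; ++it) {
        /* Step 2: a compute directive asks the compiler to parallelize this loop.
         * Step 4 could then tune the schedule, e.g. with gang/vector clauses. */
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i)
            y[i] = 2.0f * x[i] + y[i];
    }

    printf("y[0] = %f\n", y[0]);
    free(x);
    free(y);
    return 0;
}
```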

Beginner · 1 Credit · 1 Hour 30 Minutes

Accelerating Applications with CUDA C/C++

Updated with more content and support for CUDA 6!

Learn how to accelerate your C/C++ application using CUDA to harness the massively parallel power of NVIDIA GPUs. In 90 minutes, you will work through seven exercises, including:

  • Hello Parallelism!
  • Accelerate the simple SAXPY algorithm
  • Accelerate a basic Matrix Multiply algorithm with CUDA
  • Error checking GPU code
  • Querying GPU Devices for capabilities
  • Data management with Unified Memory
  • A case study implementing most of the above
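The hedged sketch below touches several of these exercises in one small program: querying the device, allocating with Unified Memory (new in CUDA 6), launching a kernel, and checking for errors. It is an illustration, not the lab's code; the CUDA_CHECK macro and the kernel are made up for this example.

```cpp
// cuda_sketch.cu -- illustrative sketch only; not the lab's own code.
// Build: nvcc cuda_sketch.cu -o cuda_sketch
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Error checking: wrap CUDA Runtime calls and report any failure.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err = (call);                                         \
        if (err != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error %s at %s:%d\n",                   \
                    cudaGetErrorString(err), __FILE__, __LINE__);         \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

__global__ void doubleElements(int n, float *a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= 2.0f;
}

int main() {
    // Query the device for its capabilities.
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDeviceProperties(&prop, 0));
    printf("Running on %s (compute capability %d.%d)\n",
           prop.name, prop.major, prop.minor);

    // Unified Memory: one pointer usable from both host and device.
    const int n = 1 << 20;
    float *a;
    CUDA_CHECK(cudaMallocManaged(&a, n * sizeof(float)));
    for (int i = 0; i < n; ++i) a[i] = 1.0f;

    doubleElements<<<(n + 255) / 256, 256>>>(n, a);
    CUDA_CHECK(cudaGetLastError());        // catch launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());   // catch asynchronous execution errors

    printf("a[0] = %f (expect 2.0)\n", a[0]);
    CUDA_CHECK(cudaFree(a));
    return 0;
}
```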

Expert · 15 Credits · 1 Hour 30 Minutes

Using Thrust to Accelerate C++

Thrust is a parallel algorithms library loosely based on the C++ Standard Template Library. Thrust provides a number of building blocks, such as sorts, scans, transforms, and reductions, to enable developers to quickly embrace the power of parallel computing. In addition to targeting the massive parallelism of NVIDIA GPUs, Thrust supports multiple system back-ends such as OpenMP and Intel's Threading Building Blocks, so it is possible to compile your code for different parallel processors with a simple flip of a compiler switch.
In 90 minutes, you will work through a number of exercises, including:

  1. Basic Iterators, Containers, and Functions
  2. Built-in and Custom Functors
  3. Fancy Iterators
  4. Portability to CPU processing
  5. Exception and Error handling
  6. A case study implementing all of the above
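To give a flavor of the Thrust pieces listed above, here is a small illustrative program (not the lab's code) that uses containers, a built-in functor, a custom functor, and a reduction; the back-end switch is shown only as a build comment.

```cpp
// thrust_sketch.cu -- illustrative sketch only; not the lab's own code.
// Build for the GPU: nvcc thrust_sketch.cu -o thrust_sketch
// Portability: the same source can target another back-end by recompiling,
// e.g. with -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP (plus OpenMP flags).
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>
#include <thrust/transform.h>

// A custom functor, usable by thrust::transform just like a built-in one.
struct saxpy_functor {
    float a;
    saxpy_functor(float a_) : a(a_) {}
    __host__ __device__ float operator()(float x, float y) const {
        return a * x + y;
    }
};

int main() {
    const int n = 1 << 20;

    // Containers: device_vector allocates on the GPU (or the chosen back-end).
    thrust::device_vector<float> x(n), y(n, 2.0f);
    thrust::sequence(x.begin(), x.end());                    // x = 0, 1, 2, ...

    // Built-in functor: sort x in descending order.
    thrust::sort(x.begin(), x.end(), thrust::greater<float>());

    // Custom functor: y = 2*x + y, elementwise.
    thrust::transform(x.begin(), x.end(), y.begin(), y.begin(), saxpy_functor(2.0f));

    // Reduction: sum all elements of y.
    float sum = thrust::reduce(y.begin(), y.end(), 0.0f, thrust::plus<float>());
    printf("sum(y) = %f\n", sum);
    return 0;
}
```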

Expert · 15 Credits · 1 Hour 30 Minutes